2023-10-02 01:35:27,301 INFO [train.py:1114] (3/4) Training started 2023-10-02 01:35:27,302 INFO [train.py:1124] (3/4) Device: cuda:3 2023-10-02 01:35:27,335 INFO [train.py:1136] (3/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '821ebc378e7fb99b8adc81950227963332821e01', 'k2-git-date': 'Wed Jul 19 15:38:25 2023', 'lhotse-version': '1.16.0.dev+git.1db4d97a.clean', 'torch-version': '1.11.0+cu102', 'torch-cuda-available': True, 'torch-cuda-version': '10.2', 'python-version': '3.9', 'icefall-git-branch': 'dev/bilingual', 'icefall-git-sha1': '4897f2c0-dirty', 'icefall-git-date': 'Thu Sep 28 11:38:28 2023', 'icefall-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/icefall-1.0-py3.9.egg', 'k2-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/k2-1.24.3.dev20230721+cuda10.2.torch1.11.0-py3.9-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/lhotse-1.16.0.dev0+git.1db4d97a.clean-py3.9.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-7-1218101249-5d97868c7c-tp8w2', 'IP address': '10.177.6.147'}, 'world_size': 4, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 50, 'start_epoch': 21, 'start_batch': 0, 'exp_dir': PosixPath('zipformer/exp-w-tal-csasr'), 'bpe_model': 'data/lang_bbpe_2000/bbpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'context_size': 2, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'ctc_loss_scale': 0.2, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 'use_fp16': True, 'use_tal_csasr': True, 'use_librispeech': True, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'use_transducer': True, 'use_ctc': False, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'blank_id': 0, 'vocab_size': 2000} 2023-10-02 01:35:27,335 INFO [train.py:1138] (3/4) About to create model 2023-10-02 01:35:28,002 INFO [train.py:1142] (3/4) Number of model parameters: 68625511 2023-10-02 01:35:28,003 INFO [checkpoint.py:112] (3/4) Loading checkpoint from zipformer/exp-w-tal-csasr/epoch-20.pt 2023-10-02 01:35:37,392 INFO [train.py:1157] (3/4) Using DDP 2023-10-02 01:35:37,827 INFO [train.py:1169] (3/4) Loading optimizer state dict 2023-10-02 01:35:38,597 INFO [train.py:1177] (3/4) Loading scheduler state dict 2023-10-02 01:35:38,598 INFO [multi_dataset.py:40] (3/4) About to get multidataset train cuts 2023-10-02 01:35:38,598 INFO [multi_dataset.py:43] (3/4) Loading Aishell-2 in lazy mode 2023-10-02 01:35:38,659 INFO [multi_dataset.py:50] (3/4) Loading TAL-CSASR in lazy mode 2023-10-02 01:35:38,661 INFO [multi_dataset.py:57] (3/4) Loading LibriSpeech in lazy mode 2023-10-02 01:35:38,661 INFO [multi_dataset.py:161] (3/4) About to get train-clean-100 cuts 2023-10-02 01:35:38,665 INFO [multi_dataset.py:168] (3/4) About to get train-clean-360 cuts 2023-10-02 01:35:38,680 INFO [multi_dataset.py:175] (3/4) About to get train-other-500 cuts 2023-10-02 01:35:48,700 INFO [asr_datamodule.py:218] (3/4) Enable MUSAN 2023-10-02 01:35:48,701 INFO [asr_datamodule.py:219] (3/4) About to get Musan cuts 2023-10-02 01:35:51,297 INFO [asr_datamodule.py:243] (3/4) Enable SpecAugment 2023-10-02 01:35:51,299 INFO [asr_datamodule.py:244] (3/4) Time warp factor: 80 2023-10-02 01:35:51,299 INFO [asr_datamodule.py:254] (3/4) Num frame mask: 10 2023-10-02 01:35:51,300 INFO [asr_datamodule.py:267] (3/4) About to create train dataset 2023-10-02 01:35:51,300 INFO [asr_datamodule.py:294] (3/4) Using DynamicBucketingSampler. 2023-10-02 01:35:51,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:35:51,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:35:51,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:35:51,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:51,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:51,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:51,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:51,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:51,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:35:51,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:51,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:35:52,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:35:52,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:35:52,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:35:52,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:35:52,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:35:52,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:35:52,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:35:53,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:53,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:53,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:53,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:53,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:35:53,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:35:53,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:53,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:53,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:53,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:53,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:35:53,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:53,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:35:54,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:35:54,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:54,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:54,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:35:54,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:35:54,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:35:54,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:35:54,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:35:55,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:35:55,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:35:55,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:35:55,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:35:55,742 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:35:55,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:35:55,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:35:55,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:56,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:35:56,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:35:56,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:35:56,294 INFO [asr_datamodule.py:309] (3/4) About to create train dataloader 2023-10-02 01:35:56,295 INFO [multi_dataset.py:103] (3/4) About to get multidataset dev cuts 2023-10-02 01:35:56,295 INFO [multi_dataset.py:106] (3/4) Loading Aishell-2 DEV set in lazy mode 2023-10-02 01:35:56,298 INFO [multi_dataset.py:182] (3/4) About to get dev-clean cuts 2023-10-02 01:35:56,299 INFO [multi_dataset.py:189] (3/4) About to get dev-other cuts 2023-10-02 01:35:56,326 INFO [asr_datamodule.py:340] (3/4) About to create dev dataset 2023-10-02 01:35:56,769 INFO [asr_datamodule.py:357] (3/4) About to create dev dataloader 2023-10-02 01:35:56,769 INFO [train.py:1358] (3/4) Sanity check -- see if any of the batches in epoch 1 would cause OOM. 2023-10-02 01:35:56,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:35:56,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:35:56,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:35:56,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:57,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:57,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:57,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:57,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:57,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:35:57,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:57,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:35:57,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:35:58,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:35:58,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:35:58,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:35:58,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:35:58,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:35:58,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:35:58,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:58,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:58,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:58,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:35:58,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:35:59,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:35:59,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:35:59,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:35:59,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:35:59,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:35:59,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:35:59,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:35:59,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:00,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:36:00,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:00,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:00,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:36:00,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:36:00,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:36:00,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:00,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:36:00,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:36:00,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:36:01,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:36:01,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:01,487 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:36:01,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:36:01,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:36:01,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:02,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:36:02,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:36:02,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:36:02,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:02,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:36:02,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:36:02,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:02,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:03,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:03,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:03,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:03,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:03,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:03,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:36:03,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:36:03,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:36:03,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:36:03,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:36:03,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:36:03,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:36:03,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:36:03,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:04,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:04,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:04,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:04,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:04,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:04,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:04,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:04,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:04,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:04,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:04,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:04,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:05,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:36:05,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:06,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:06,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:36:06,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:36:06,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:36:06,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:06,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:36:06,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:36:06,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:36:06,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:36:06,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:06,994 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:36:07,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:36:07,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:36:07,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:07,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:36:07,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:36:07,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:36:07,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:36:08,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 01:36:08,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:09,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:36:09,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:09,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:09,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:09,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 01:36:09,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 01:36:09,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:09,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:10,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:10,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:10,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:36:10,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:10,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 01:36:10,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:11,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:36:11,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:11,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 01:36:11,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:36:11,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:36:12,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:12,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:13,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:13,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 01:36:14,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 01:36:14,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:14,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:14,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:36:14,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:14,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 01:36:14,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:14,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:14,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:15,176 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 01:36:15,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:36:15,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:15,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:15,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 01:36:15,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:36:15,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:16,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:16,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:16,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:16,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 01:36:16,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:17,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:36:17,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 01:36:18,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 01:36:18,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:36:18,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:18,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:18,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:18,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:36:18,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:36:18,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:19,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:19,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:19,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:36:19,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 01:36:19,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:36:19,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:36:19,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 01:36:19,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:19,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 01:36:20,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:20,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:20,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:20,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:20,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:36:20,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 01:36:20,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 01:36:20,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:20,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:36:21,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:21,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:21,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 01:36:21,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 01:36:21,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:36:21,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:21,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:36:21,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 01:36:21,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 01:36:21,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:21,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:22,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:36:22,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:22,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:22,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:23,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:23,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 01:36:23,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:23,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:36:23,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:36:23,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:36:23,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:23,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:36:23,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 01:36:23,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:36:24,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:24,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:24,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:24,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 01:36:24,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:24,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:24,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:36:24,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:36:25,110 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 01:36:25,130 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 01:36:25,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:25,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:36:25,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:36:25,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:25,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:26,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:26,325 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 01:36:27,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:36:27,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:36:27,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:27,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:27,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:28,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:28,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:28,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:28,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:28,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:28,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:28,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:28,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 01:36:28,744 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 01:36:28,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:28,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:28,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:28,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:28,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:36:28,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:36:29,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:36:29,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:29,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:29,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:29,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:29,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:29,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:29,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:29,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:29,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:30,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:30,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:30,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:36:30,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:30,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 01:36:30,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 01:36:30,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 01:36:30,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:30,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:36:31,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:31,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:31,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:31,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:31,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:32,013 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 01:36:32,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:32,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:32,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:36:32,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 01:36:33,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:36:33,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:33,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:33,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:36:33,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:33,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:36:33,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:33,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 01:36:34,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:34,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:34,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:36:34,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:36:34,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:34,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:36:34,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:36:34,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:35,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:35,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:36:35,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 01:36:35,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:35,345 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 01:36:36,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:36,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:36,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:36,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 01:36:36,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:36,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:37,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 01:36:37,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:36:37,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:37,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:37,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:37,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:37,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:39,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:39,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:39,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:36:39,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:39,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:36:39,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:36:39,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:39,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:36:39,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:39,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:39,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 01:36:39,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:36:39,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:40,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:41,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:41,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:41,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:36:42,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:42,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 01:36:42,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:42,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:36:42,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:42,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:36:42,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 01:36:42,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:42,888 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 01:36:43,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:43,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:36:43,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:43,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:43,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:43,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:43,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:43,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:36:45,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:45,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:45,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:36:46,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:36:46,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:36:46,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:36:46,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:46,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:36:46,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:36:46,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:46,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:36:46,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 01:36:46,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:46,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:36:47,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:36:47,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:36:47,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:47,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:36:47,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:36:47,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:47,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:36:47,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:47,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:36:48,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:48,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:36:48,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:49,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:49,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 01:36:49,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:49,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:36:50,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 01:36:50,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:36:50,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:36:50,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 01:36:50,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:50,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:50,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:36:50,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 01:36:51,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:36:51,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:36:51,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 01:36:51,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:51,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:36:51,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:36:51,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 01:36:52,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 01:36:52,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:52,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:52,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:52,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 01:36:52,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:36:52,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:36:52,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:36:52,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:53,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:36:53,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 01:36:53,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:36:54,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:54,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 01:36:54,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:54,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:36:55,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:36:55,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 01:36:55,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:55,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:36:55,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:55,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:36:55,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 01:36:55,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:36:55,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:55,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 01:36:56,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:36:56,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:56,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:56,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:56,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:56,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:56,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:36:56,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:57,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:36:57,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:57,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:57,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 01:36:58,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:36:58,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 01:36:58,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:36:58,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 01:36:59,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:59,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 01:36:59,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:36:59,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:36:59,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:36:59,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:59,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:36:59,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:36:59,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:36:59,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:37:00,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:00,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:00,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:00,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:37:00,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:00,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:01,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 01:37:01,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:01,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:01,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:01,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:01,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 01:37:01,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:01,936 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 01:37:02,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 01:37:02,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:02,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:37:02,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 01:37:02,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:03,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:37:03,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:03,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:03,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:03,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:03,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:04,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:37:04,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 01:37:04,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:04,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:04,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:04,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:04,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:04,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 01:37:05,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 01:37:05,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:05,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 01:37:05,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:05,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:37:05,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:05,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 01:37:05,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:05,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:05,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:05,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:05,907 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 01:37:05,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 01:37:06,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:06,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:06,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 01:37:06,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 01:37:06,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:06,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:08,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 01:37:08,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:37:08,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 01:37:08,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:08,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:37:08,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 01:37:09,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:37:09,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:37:09,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:09,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:09,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 01:37:09,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:37:09,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 01:37:10,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:37:10,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:10,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 01:37:10,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:37:10,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:10,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:37:10,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 01:37:10,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:10,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:10,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:37:10,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 01:37:10,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:37:11,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:37:11,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:37:12,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:12,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:37:12,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 01:37:12,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 01:37:13,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:13,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:13,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:13,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:13,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:14,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 01:37:14,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 01:37:14,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 01:37:14,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:14,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:14,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:37:14,617 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 01:37:14,638 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 01:37:14,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:14,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:14,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:37:15,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:37:15,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:37:15,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 01:37:15,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 01:37:15,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:37:15,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:37:15,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:37:15,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 01:37:16,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:37:16,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 01:37:17,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 01:37:17,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:37:17,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:17,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:17,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:17,740 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 01:37:18,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:18,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:37:18,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:18,174 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 01:37:18,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 01:37:18,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:18,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:37:18,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:37:19,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:37:19,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:19,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:19,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:20,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:20,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:37:20,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:37:20,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:20,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 01:37:20,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:37:20,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:20,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:37:20,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:37:20,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:20,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 01:37:21,500 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 01:37:21,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:21,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:21,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:21,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:22,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:37:22,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 01:37:22,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:37:22,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:22,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:23,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:23,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:23,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 01:37:23,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:23,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:24,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 01:37:24,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:37:24,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:24,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 01:37:24,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 01:37:24,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:24,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 01:37:24,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:24,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:24,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:24,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:24,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:25,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:37:25,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:25,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 01:37:25,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:37:26,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:26,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:26,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:26,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:27,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 01:37:27,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 01:37:27,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:27,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:27,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:37:27,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:28,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:28,068 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 01:37:28,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:28,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:37:28,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:37:28,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:37:28,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:37:28,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:28,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 01:37:28,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 01:37:28,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:28,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:28,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:28,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:29,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:37:29,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:37:29,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:29,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:29,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:37:29,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:37:29,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:30,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:37:30,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:30,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:37:30,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:37:31,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 01:37:31,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 01:37:31,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:31,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:37:31,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:32,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:32,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:37:32,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 01:37:32,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:37:32,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:33,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:33,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 01:37:33,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:33,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 01:37:33,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:33,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:33,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:35,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:35,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:35,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:35,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:37:35,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:36,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:36,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:36,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 01:37:36,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:37:36,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:37,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 01:37:37,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:37:37,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 01:37:37,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:37:37,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:37:38,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:37:38,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:37:38,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:37:38,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:39,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:39,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 01:37:39,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:40,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:37:40,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:40,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:40,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 01:37:40,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:40,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:40,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:41,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 01:37:41,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:41,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:41,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:37:41,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:41,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:37:41,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:37:41,779 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 01:37:41,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:41,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:42,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:42,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:42,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:42,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:37:42,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 01:37:42,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:37:42,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:37:42,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:37:42,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:42,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:37:42,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 01:37:42,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 01:37:42,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:42,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:42,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:37:42,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:43,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:44,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:44,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:44,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:44,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:44,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:37:44,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:45,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:37:45,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:45,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:45,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:45,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 01:37:45,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 01:37:45,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 01:37:45,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:46,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:46,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 01:37:46,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:46,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:37:46,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:46,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:37:47,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:47,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:47,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 01:37:47,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:37:48,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 01:37:48,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 01:37:48,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:37:48,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:48,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:37:49,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:37:49,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 01:37:49,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:49,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:37:49,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 01:37:50,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:50,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:50,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:50,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:37:50,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 01:37:51,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 01:37:51,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 01:37:51,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:51,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:51,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:51,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:51,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 01:37:52,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 01:37:52,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 01:37:52,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 01:37:52,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 01:37:52,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 01:37:52,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:53,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 01:37:53,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:53,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:53,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:53,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:53,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:37:53,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:53,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:37:53,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:37:53,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:37:54,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:54,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:54,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 01:37:54,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:37:54,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:54,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:37:54,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:37:54,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 01:37:54,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:55,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 01:37:55,027 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 01:37:55,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 01:37:55,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:37:55,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:37:55,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:37:55,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:37:55,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:55,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:37:56,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:37:56,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:56,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 01:37:56,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:37:56,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:37:56,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:37:57,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:37:57,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 01:37:57,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:57,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:37:57,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:37:58,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:37:58,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:37:58,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 01:37:58,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:58,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:58,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:37:58,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:37:59,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:59,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:37:59,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:37:59,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:37:59,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:37:59,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:00,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:00,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:00,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:00,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:00,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:00,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 01:38:00,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:00,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:00,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:38:00,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:00,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 01:38:01,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:01,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 01:38:01,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:02,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:02,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:38:02,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:02,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:02,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:02,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:02,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:02,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 01:38:03,200 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 01:38:03,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 01:38:03,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:38:03,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:03,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:03,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:03,773 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 01:38:03,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 01:38:03,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:38:04,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:38:04,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:38:04,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:04,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 01:38:04,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:38:05,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 01:38:05,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:06,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:06,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 01:38:06,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:06,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:06,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 01:38:06,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:06,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:06,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:06,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:38:07,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:07,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 01:38:07,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 01:38:07,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 01:38:07,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:07,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:07,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:07,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:07,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:38:08,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:08,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:08,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 01:38:08,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 01:38:08,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:08,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 01:38:08,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 01:38:09,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 01:38:09,290 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 01:38:09,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:38:09,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:09,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:38:09,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:09,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:09,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 01:38:09,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:38:09,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:09,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:38:10,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:38:10,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:38:10,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:38:11,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 01:38:11,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:38:11,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:11,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:38:11,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:11,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:11,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:11,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:38:11,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:38:11,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:12,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:38:12,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:38:12,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:12,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 01:38:12,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:12,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:13,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 01:38:13,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:13,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:13,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 01:38:13,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:13,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 01:38:14,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:38:14,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:14,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:14,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:38:14,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:15,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:15,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:15,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:38:16,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:16,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 01:38:16,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:16,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:38:16,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:38:16,990 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 01:38:17,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 01:38:17,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:38:17,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:38:17,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:38:18,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:18,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:18,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 01:38:18,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:18,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 01:38:18,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:38:18,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:18,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:18,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:19,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 01:38:19,076 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 01:38:19,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 01:38:19,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 01:38:19,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:20,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 01:38:20,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:20,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:20,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:20,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:38:21,161 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 01:38:21,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:21,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:21,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:21,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:38:21,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 01:38:21,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:38:21,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:21,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 01:38:21,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:22,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:22,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:22,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:22,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 01:38:22,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:38:22,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:22,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:23,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:23,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:24,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 01:38:24,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 01:38:24,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:38:24,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:24,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:24,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:38:24,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 01:38:24,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:38:25,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:25,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:25,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 01:38:25,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:25,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:38:25,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 01:38:25,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:38:25,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:26,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:26,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 01:38:26,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 01:38:26,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:26,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 01:38:26,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:27,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:27,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 01:38:27,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 01:38:27,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:27,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:27,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:28,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 01:38:29,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 01:38:29,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 01:38:29,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:29,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 01:38:29,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 01:38:29,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 01:38:29,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:30,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:30,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:30,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:30,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:30,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 01:38:30,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:30,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:38:30,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:30,722 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 01:38:30,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 01:38:31,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 01:38:31,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 01:38:31,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:31,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:31,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:38:31,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:31,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:32,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 01:38:32,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:38:32,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 01:38:32,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 01:38:33,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:33,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:33,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:33,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:38:33,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:33,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:34,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:34,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:38:34,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:34,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:38:34,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:38:34,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:38:34,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:34,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:38:35,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:38:35,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:38:35,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 01:38:35,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:35,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 01:38:35,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:35,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 01:38:35,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:38:36,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:36,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:38:36,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:36,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 01:38:36,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 01:38:36,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:38:36,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 01:38:36,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 01:38:37,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:37,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 01:38:37,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:38:37,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:38,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:38:38,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:38:38,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 01:38:38,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 01:38:38,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 01:38:38,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:38:38,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:38:38,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 01:38:39,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:39,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:38:39,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:38:39,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:38:39,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:39,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:39,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 01:38:39,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:38:40,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 01:38:40,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 01:38:40,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:40,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:40,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:41,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:38:41,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:38:41,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:41,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 01:38:42,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:42,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:38:42,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:42,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:38:42,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 01:38:42,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:38:42,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:42,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:38:43,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:43,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:38:43,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:43,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 01:38:43,900 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 01:38:43,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:44,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:44,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:38:44,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:44,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 01:38:44,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:38:44,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:38:44,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:44,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:44,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 01:38:44,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:38:44,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 01:38:45,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:38:45,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:38:45,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 01:38:45,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:38:46,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:46,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:46,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:46,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 01:38:47,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:38:47,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:47,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 01:38:47,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:38:47,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 01:38:47,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:47,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:38:47,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:38:47,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:48,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:38:48,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:48,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:48,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 01:38:48,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:48,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 01:38:48,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:48,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:38:49,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 01:38:49,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:49,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:49,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:49,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 01:38:49,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:38:49,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:49,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 01:38:50,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:50,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:51,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:52,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:38:52,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 01:38:52,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:52,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:52,676 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 01:38:52,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:53,269 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 01:38:53,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:53,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:38:53,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:38:53,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:38:53,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:54,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:38:54,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:38:54,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:54,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:54,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:38:54,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:38:55,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:38:55,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:55,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:55,831 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 01:38:56,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 01:38:56,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:38:56,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:38:56,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:56,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:38:56,819 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 01:38:56,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:57,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:38:57,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:38:57,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 01:38:57,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:38:57,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 01:38:58,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 01:38:58,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:38:58,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:38:58,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:58,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:38:58,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:38:58,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:38:58,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:38:58,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 01:38:58,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:38:58,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:38:58,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:38:59,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:38:59,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:38:59,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:39:00,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:39:00,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 01:39:00,722 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 01:39:00,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:01,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:39:01,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:01,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:01,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 01:39:01,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:01,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:01,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 01:39:02,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:02,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:39:02,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:39:02,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:02,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:39:02,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:02,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:39:03,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:39:03,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:39:03,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:03,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:03,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:03,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:03,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:39:04,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 01:39:04,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:39:05,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:05,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 01:39:05,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:05,141 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 01:39:05,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:05,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:05,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:05,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:05,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:05,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 01:39:05,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 01:39:05,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 01:39:06,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:06,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 01:39:06,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:06,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 01:39:06,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:06,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 01:39:06,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:39:06,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:39:06,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:39:06,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:07,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 01:39:07,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:07,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:39:07,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:39:07,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:39:07,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:07,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 01:39:08,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:08,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:39:09,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:09,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:09,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:39:09,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 01:39:09,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:39:09,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:39:10,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 01:39:10,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:39:10,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:10,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:10,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:10,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:10,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:39:10,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:39:10,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 01:39:11,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:39:11,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:39:11,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 01:39:11,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:39:11,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:11,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:12,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 01:39:12,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:12,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 01:39:12,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:12,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:12,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:13,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 01:39:13,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 01:39:13,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 01:39:13,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:14,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 01:39:14,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:14,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 01:39:15,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:15,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:15,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:15,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:15,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:15,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:39:15,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:39:16,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 01:39:16,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:39:16,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:16,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 01:39:16,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:16,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:16,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 01:39:16,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 01:39:17,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 01:39:17,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:17,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 01:39:18,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:19,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:19,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:19,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 01:39:19,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:19,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 01:39:19,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:39:19,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:19,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:20,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 01:39:20,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:39:20,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 01:39:20,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 01:39:21,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 01:39:21,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:22,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:22,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:22,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 01:39:22,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 01:39:23,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:39:23,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:23,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:23,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:39:24,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:24,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 01:39:25,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:25,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:25,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 01:39:25,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:39:25,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:39:25,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:25,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:25,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:39:25,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:25,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:26,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 01:39:26,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:39:27,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:27,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:39:27,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 01:39:28,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:39:28,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:28,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 01:39:28,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:28,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:28,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:39:29,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:29,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:29,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 01:39:29,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:29,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:39:29,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:29,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 01:39:29,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:39:29,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 01:39:30,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:30,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:30,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 01:39:30,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:30,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:39:30,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 01:39:30,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:30,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:30,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:30,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:31,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:39:31,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:31,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:31,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:31,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:32,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:39:32,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:32,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:32,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 01:39:32,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:32,869 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 01:39:33,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:33,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:39:33,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:33,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 01:39:33,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:33,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 01:39:33,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 01:39:33,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:34,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:34,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:34,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 01:39:34,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 01:39:34,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 01:39:34,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:34,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:39:36,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 01:39:36,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:39:36,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:39:36,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:36,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:36,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:39:36,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 01:39:36,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:39:36,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:39:36,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:36,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:37,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:37,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:37,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:37,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 01:39:37,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:39:37,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:37,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:38,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 01:39:38,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 01:39:38,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:38,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 01:39:38,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:39:38,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:39:38,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:38,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:39,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 01:39:39,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:39,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:39,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 01:39:39,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:39,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:39:39,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 01:39:40,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:39:40,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:39:41,353 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 01:39:41,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:41,416 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 01:39:41,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:41,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:41,720 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 01:39:41,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:39:42,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 01:39:42,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:42,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:42,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:42,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:42,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:42,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:39:42,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 01:39:42,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 01:39:42,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:42,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 01:39:42,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 01:39:43,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:43,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:43,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:39:43,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:43,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:43,658 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 01:39:43,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:43,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:39:43,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:39:43,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:39:43,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 01:39:44,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:44,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 01:39:44,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 01:39:44,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 01:39:44,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:44,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:45,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:45,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 01:39:45,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 01:39:46,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:46,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:46,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:39:46,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:46,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 01:39:46,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:39:47,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:47,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:39:47,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:47,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:47,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 01:39:47,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:39:47,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:39:47,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:47,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 01:39:47,814 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 01:39:48,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:48,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 01:39:48,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:48,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:48,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 01:39:49,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:39:49,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:49,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:39:49,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:39:49,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:39:50,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:50,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 01:39:50,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 01:39:50,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 01:39:50,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:50,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 01:39:50,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:51,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:51,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:39:51,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 01:39:51,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:39:51,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 01:39:51,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:51,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 01:39:52,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 01:39:52,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:53,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 01:39:53,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:39:53,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:53,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:39:53,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 01:39:54,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:39:54,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:54,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:54,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:54,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:39:54,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:39:54,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:39:54,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:39:55,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:55,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:39:55,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 01:39:55,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:39:55,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 01:39:56,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:56,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:39:56,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:39:56,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 01:39:56,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 01:39:56,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 01:39:56,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 01:39:56,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:39:56,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:56,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:39:56,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:39:57,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:39:57,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 01:39:57,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:39:57,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:39:57,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:39:57,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:39:57,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 01:39:57,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 01:39:58,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:39:58,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:39:59,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 01:39:59,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:39:59,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 01:39:59,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:00,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:00,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:00,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:00,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:40:00,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:00,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:00,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:00,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:00,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:00,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:00,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:40:01,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:01,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 01:40:01,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:01,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 01:40:01,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 01:40:01,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 01:40:01,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:01,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:40:01,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:01,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:01,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 01:40:01,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:02,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:02,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:02,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 01:40:03,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:03,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:40:03,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 01:40:03,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:03,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:40:03,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:03,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:40:03,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:40:03,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 01:40:03,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:40:04,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:04,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:05,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:40:05,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:40:05,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:05,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:05,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 01:40:05,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:40:05,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:05,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:40:05,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:40:06,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 01:40:06,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 01:40:06,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:06,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 01:40:06,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:07,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:07,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:07,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:40:07,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:40:07,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 01:40:07,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:08,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:08,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 01:40:08,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:08,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:08,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:08,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:08,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:08,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:08,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:08,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:40:08,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:09,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:09,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 01:40:09,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:09,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:09,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 01:40:10,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:10,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:10,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:40:10,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 01:40:10,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:10,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:10,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:10,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 01:40:11,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:11,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 01:40:11,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:12,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:40:12,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:40:12,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 01:40:12,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:12,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 01:40:13,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:13,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:13,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:13,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:13,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:13,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:14,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:14,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:14,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:14,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 01:40:14,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:14,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 01:40:14,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:14,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:15,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:40:15,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:40:15,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 01:40:15,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:15,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:16,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:16,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:40:16,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:16,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 01:40:16,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:17,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:40:17,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:17,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:40:17,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:40:17,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:40:17,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:40:17,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:17,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:40:17,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:18,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:40:18,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:18,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:18,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:18,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:18,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:18,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:40:18,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 01:40:19,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:19,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:19,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 01:40:19,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 01:40:19,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 01:40:19,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:19,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:19,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:19,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:40:21,119 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 01:40:21,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:21,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:21,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 01:40:21,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 01:40:21,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:40:21,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:21,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:40:22,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 01:40:22,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:22,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 01:40:22,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:22,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:22,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:40:22,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 01:40:23,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:23,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:23,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 01:40:23,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:23,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:23,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:23,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:23,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:23,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:40:23,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:23,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:23,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:24,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:25,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:40:25,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 01:40:25,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 01:40:25,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 01:40:26,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:26,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 01:40:26,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:40:26,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:26,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 01:40:27,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:27,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:27,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 01:40:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:27,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:40:27,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:40:27,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:40:28,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:28,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:40:28,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:28,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:40:28,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:28,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:28,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:28,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 01:40:28,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:29,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:29,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:40:29,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 01:40:29,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 01:40:30,096 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 01:40:30,171 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 01:40:30,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:40:30,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:30,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:40:30,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:30,471 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 01:40:30,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:40:30,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:30,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:40:30,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:40:30,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:30,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 01:40:30,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:30,985 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 01:40:31,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:40:31,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:31,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:31,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:40:31,787 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 01:40:31,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 01:40:32,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:40:32,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:32,078 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 01:40:32,123 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 01:40:32,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 01:40:32,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:32,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 01:40:33,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 01:40:34,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 01:40:34,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 01:40:34,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:34,727 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 01:40:34,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 01:40:34,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 01:40:34,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 01:40:34,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:35,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 01:40:35,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:40:35,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:35,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 01:40:35,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:40:36,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 01:40:36,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:40:36,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:40:36,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:36,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:36,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:36,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:40:36,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:40:36,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:40:37,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:37,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:37,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:37,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:37,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:40:37,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:37,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:40:37,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:40:37,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:37,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:40:38,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 01:40:38,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:40:38,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:38,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:39,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:40:39,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:39,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:39,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:39,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:40:39,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:40:39,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:40:39,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:39,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:40,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:40:40,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:40,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:40:40,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 01:40:40,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:40:40,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:40,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:40,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:40,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:41,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:40:41,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:40:41,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:40:41,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 01:40:41,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:41,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:41,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:40:41,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:42,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:42,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:43,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:43,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:43,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:40:43,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:43,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 01:40:43,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:40:43,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:44,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 01:40:44,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:44,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:40:44,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:40:44,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:44,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:40:44,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:45,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 01:40:45,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:40:45,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:45,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 01:40:45,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:40:45,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:40:45,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:45,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 01:40:46,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:46,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:46,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:46,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 01:40:46,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:40:46,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 01:40:46,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:46,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:46,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:40:46,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:46,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:47,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:47,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 01:40:47,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 01:40:47,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:47,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:48,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:48,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:48,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:48,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:48,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:48,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:48,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:48,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:48,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:49,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:40:49,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 01:40:49,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:40:49,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:49,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:49,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:40:50,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:50,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:50,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:50,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:40:50,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:40:50,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:50,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:50,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:51,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:51,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:52,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:40:52,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:40:52,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:52,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 01:40:52,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:52,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:52,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:40:53,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:53,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:53,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 01:40:53,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:53,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 01:40:53,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:40:54,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:54,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:54,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:40:54,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:40:54,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:40:54,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:40:54,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:40:54,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:54,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:40:55,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:40:55,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:55,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:56,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:40:56,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 01:40:57,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:57,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:40:57,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:40:57,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 01:40:57,852 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 01:40:57,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:57,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:40:57,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:40:58,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:40:58,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 01:40:58,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 01:40:58,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:40:58,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:40:58,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:40:58,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:40:58,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:40:58,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 01:40:58,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:40:58,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 01:40:58,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 01:40:59,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:40:59,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:40:59,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 01:40:59,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:40:59,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 01:40:59,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:40:59,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:40:59,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:00,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:00,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 01:41:00,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:00,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 01:41:00,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 01:41:01,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:01,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 01:41:01,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 01:41:01,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 01:41:01,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:41:01,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:41:01,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:41:01,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:41:01,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:02,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:02,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 01:41:02,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:02,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:02,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:02,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 01:41:02,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 01:41:02,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 01:41:02,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:41:02,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:02,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 01:41:03,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:03,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:03,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:03,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:03,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 01:41:03,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:41:03,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:03,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:41:03,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:03,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:04,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 01:41:04,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 01:41:04,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:04,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:04,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:04,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:41:04,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:41:05,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 01:41:05,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:05,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:05,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:05,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:06,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:06,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:06,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:06,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:41:06,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:06,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 01:41:07,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:07,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:41:07,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:07,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:07,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:07,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:41:07,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:07,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:07,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:07,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 01:41:07,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:41:07,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:07,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:07,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:41:07,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:08,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:08,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:08,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:08,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 01:41:08,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:41:08,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:08,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:08,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:08,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:41:08,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:09,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:09,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 01:41:09,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 01:41:09,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:41:09,270 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 01:41:09,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:09,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:41:10,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 01:41:10,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:41:10,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 01:41:10,092 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 01:41:10,092 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 01:41:10,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 01:41:10,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:10,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:10,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:41:10,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:10,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:41:10,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:10,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:11,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:11,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 01:41:11,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:11,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:12,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:12,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:12,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:41:12,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:12,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:12,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 01:41:12,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 01:41:13,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:41:13,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 01:41:13,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:13,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:13,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:41:14,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:41:14,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 01:41:15,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:41:15,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:15,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:41:15,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:15,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:15,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:16,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:16,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 01:41:16,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:16,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 01:41:16,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:16,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:41:16,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:16,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:41:16,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:16,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:16,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:16,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:41:16,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:17,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:41:17,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:41:17,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:17,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:41:17,669 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 01:41:17,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:41:17,894 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 01:41:17,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:41:18,020 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 01:41:18,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:18,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:41:18,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:18,430 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 01:41:19,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:19,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:41:19,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:41:19,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:41:20,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:20,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:41:20,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:41:20,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 01:41:20,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:20,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:20,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 01:41:20,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:20,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:20,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:41:21,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:21,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:41:21,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:41:21,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 01:41:21,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:21,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:22,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:22,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:22,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:22,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:22,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:22,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:23,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:23,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:41:24,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:41:24,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:41:24,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:41:24,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:41:24,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:41:24,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 01:41:24,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:24,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:41:25,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 01:41:25,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:41:25,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:25,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:25,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:26,133 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 01:41:26,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:26,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:26,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:41:26,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:26,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:26,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 01:41:26,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:41:27,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:27,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:27,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:41:28,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:28,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:29,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:41:29,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:29,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:41:29,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:29,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:29,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:41:29,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:29,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 01:41:30,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:41:30,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:30,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:30,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:30,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:30,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:41:30,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:41:30,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 01:41:30,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:30,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:30,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 01:41:30,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:31,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:31,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:31,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:41:31,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:41:31,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:31,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:41:31,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 01:41:32,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:32,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 01:41:33,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 01:41:33,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:33,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:33,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:33,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:34,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:34,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 01:41:34,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:34,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 01:41:34,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:35,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:41:35,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:35,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:41:35,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 01:41:35,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:41:35,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:35,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:35,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:35,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:41:36,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 01:41:36,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:36,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:36,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:36,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 01:41:37,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:41:37,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 01:41:37,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:41:37,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 01:41:38,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 01:41:38,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:38,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:41:38,095 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 01:41:38,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 01:41:38,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 01:41:38,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:38,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:39,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:39,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:41:39,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 01:41:39,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 01:41:39,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:41:39,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:40,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 01:41:40,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:40,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:40,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 01:41:40,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:41,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 01:41:41,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:41:42,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 01:41:42,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:42,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:43,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:43,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 01:41:43,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:41:43,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:43,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:44,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:44,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:41:44,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:41:44,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:44,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:44,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:44,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:41:44,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:44,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:41:44,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 01:41:44,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 01:41:44,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:44,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:45,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 01:41:45,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 01:41:45,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 01:41:45,097 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 01:41:45,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 01:41:45,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:41:45,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:45,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:46,161 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 01:41:46,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:46,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:41:46,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:41:46,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:46,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:46,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:46,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 01:41:47,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:47,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:47,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:41:47,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:41:47,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:41:47,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 01:41:48,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:48,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:41:48,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:48,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:41:48,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:48,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:48,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:41:48,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 01:41:49,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:41:49,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:49,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:49,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:49,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:41:49,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:50,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:41:50,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 01:41:50,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:41:50,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:41:51,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:51,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:51,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:41:51,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 01:41:51,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:41:51,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:51,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 01:41:51,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:51,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:41:52,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:41:52,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:52,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:41:52,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 01:41:52,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:41:53,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:54,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:41:54,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:41:54,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:41:54,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 01:41:55,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:41:55,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:55,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:41:55,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:41:55,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 01:41:55,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:55,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:55,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 01:41:55,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:55,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 01:41:55,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:56,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:41:56,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:41:56,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:41:56,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 01:41:56,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:41:56,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:57,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:57,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:41:57,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:58,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:41:58,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 01:41:58,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:41:58,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:41:58,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:41:58,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:41:58,540 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 01:41:58,541 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 01:41:58,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 01:41:58,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:41:58,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 01:41:58,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 01:41:59,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:41:59,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 01:42:00,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 01:42:00,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:00,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:00,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:42:00,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:00,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 01:42:00,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:01,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 01:42:01,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:42:01,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:01,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:01,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 01:42:01,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:42:01,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:01,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:01,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:42:01,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 01:42:01,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:42:01,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:01,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 01:42:02,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:03,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:03,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:03,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:03,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:42:04,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:04,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:42:04,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:42:04,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:42:04,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:42:04,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:42:05,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:05,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:05,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:05,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 01:42:05,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:05,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:05,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:42:05,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:42:05,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:06,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:06,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:06,804 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 01:42:07,047 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 01:42:07,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:07,098 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 01:42:07,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 01:42:07,203 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 01:42:07,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:07,436 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 01:42:07,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 01:42:07,662 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 01:42:07,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:42:07,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 01:42:08,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 01:42:08,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:42:08,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 01:42:09,077 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 01:42:09,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 01:42:09,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:09,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:09,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:09,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 01:42:09,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:42:10,327 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 01:42:10,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:10,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:10,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 01:42:10,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:10,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:10,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 01:42:11,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:11,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:11,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:11,682 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 01:42:11,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:11,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:42:12,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:12,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:42:12,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 01:42:12,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:12,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:12,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:13,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 01:42:13,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:13,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:42:14,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 01:42:14,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:14,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:42:14,343 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 01:42:14,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:14,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:14,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:42:14,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:14,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:15,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 01:42:15,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:42:15,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:15,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 01:42:15,498 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 01:42:15,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:15,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 01:42:15,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:15,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 01:42:16,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:16,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:42:16,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:16,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:16,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 01:42:16,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 01:42:17,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:17,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 01:42:17,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:18,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:18,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:42:18,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:18,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:18,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:18,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:18,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:18,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:42:18,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:42:18,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:18,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:42:18,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:19,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:19,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:42:19,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:19,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:42:19,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:19,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 01:42:19,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:19,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:19,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:20,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:20,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:42:20,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:20,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:20,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 01:42:20,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:20,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:42:20,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:21,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:21,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:21,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:21,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:21,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:42:21,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:42:21,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 01:42:21,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:42:21,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:21,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:42:21,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:22,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:42:22,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 01:42:22,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:23,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:42:23,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:23,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:42:23,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:23,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:42:23,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:42:23,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:23,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:24,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:24,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:24,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:24,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:42:25,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:25,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:25,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:42:25,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:25,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:25,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:25,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:25,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:26,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:26,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:26,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:26,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:27,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:27,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 01:42:27,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:27,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:42:27,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 01:42:27,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 01:42:27,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:27,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:28,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:28,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:28,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:42:28,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:28,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:28,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:42:28,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:28,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:28,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 01:42:28,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:28,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:29,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 01:42:29,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:29,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:29,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:29,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:42:29,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:29,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:29,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:29,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:30,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:42:30,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:42:30,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:42:30,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:30,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:42:30,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:31,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:42:31,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:31,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:31,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:42:31,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:42:32,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:42:32,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:42:32,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 01:42:32,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:33,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 01:42:33,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:42:33,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:33,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 01:42:33,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:33,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:42:33,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 01:42:33,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:42:34,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:42:34,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:34,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:34,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 01:42:34,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:34,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:34,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:34,551 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 01:42:34,552 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 01:42:34,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:34,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:42:34,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:35,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:36,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 01:42:36,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:42:36,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 01:42:36,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:36,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:36,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:36,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:36,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:36,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:42:36,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:37,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:42:37,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:37,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:37,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:37,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:38,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:38,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 01:42:38,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:38,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:38,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:42:38,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:38,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:39,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:39,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:39,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:39,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:42:39,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:42:39,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:42:40,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:40,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 01:42:40,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:40,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:40,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:40,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 01:42:40,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:40,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:40,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:42:40,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 01:42:41,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:41,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:42:41,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:41,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:41,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:42:41,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:41,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:42,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:42,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:42,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:42,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 01:42:42,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 01:42:42,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:42,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 01:42:42,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:43,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 01:42:43,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 01:42:43,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:44,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:45,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:42:45,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:42:45,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:42:45,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:42:45,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:42:45,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:42:45,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 01:42:45,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:42:45,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:45,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:45,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:45,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:45,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:46,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:46,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:42:46,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:46,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:46,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:46,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:42:47,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:47,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 01:42:47,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 01:42:47,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:42:47,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:47,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 01:42:47,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:47,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:47,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:47,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:42:47,600 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 01:42:47,641 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 01:42:47,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:42:47,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:48,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:42:48,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:48,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:48,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 01:42:48,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:48,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 01:42:49,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 01:42:49,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:42:49,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:49,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:49,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:42:49,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:42:50,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:50,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:42:50,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 01:42:50,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:42:50,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:50,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 01:42:50,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 01:42:50,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:50,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 01:42:50,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:51,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:51,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 01:42:51,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:51,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:51,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:51,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:52,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 01:42:52,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 01:42:52,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:42:52,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:42:52,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 01:42:52,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:42:53,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:42:54,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:42:54,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:42:54,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 01:42:54,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:54,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 01:42:54,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:54,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:42:55,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:42:55,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 01:42:55,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:55,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:55,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:55,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:42:55,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 01:42:55,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 01:42:56,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:42:56,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:42:56,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:56,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:42:56,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:42:56,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:57,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:42:57,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:42:57,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:57,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:42:57,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:42:57,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 01:42:58,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 01:42:58,445 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 01:42:58,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:42:58,640 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 01:42:58,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 01:42:58,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:42:58,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:42:58,841 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 01:42:58,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:42:59,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 01:42:59,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:42:59,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:42:59,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:59,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:42:59,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:42:59,580 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 01:42:59,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:42:59,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 01:43:00,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:00,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:00,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 01:43:00,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:00,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 01:43:00,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:00,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:00,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:43:00,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:00,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:43:00,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:01,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:01,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:43:01,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:43:01,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:01,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:01,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:01,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 01:43:01,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:01,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:01,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:43:02,056 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 01:43:02,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 01:43:02,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:02,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:43:03,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 01:43:03,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:03,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:43:04,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:43:04,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 01:43:04,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:43:05,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:43:05,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:05,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:05,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:05,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 01:43:05,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 01:43:05,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:05,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:43:05,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:43:05,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:43:05,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:05,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:06,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:43:06,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:06,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:43:06,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:43:06,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 01:43:07,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:43:07,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:07,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:43:07,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:07,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:07,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:43:07,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 01:43:07,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:07,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 01:43:07,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:43:07,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 01:43:08,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:43:08,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:43:08,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 01:43:08,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 01:43:08,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:43:08,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:08,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:08,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:43:08,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:08,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:08,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 01:43:09,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:09,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:09,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:43:09,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:09,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 01:43:10,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 01:43:10,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 01:43:10,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:10,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:43:10,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:10,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:10,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:11,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:43:11,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:43:11,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:11,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:11,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:11,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:12,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:12,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:12,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 01:43:12,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:12,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:43:12,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:12,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:43:12,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:12,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:13,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:13,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:13,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:13,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:13,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:13,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:43:13,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:43:13,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:43:14,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 01:43:14,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:43:14,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:14,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 01:43:14,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:14,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:14,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:43:15,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:43:15,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 01:43:16,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 01:43:16,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 01:43:16,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:43:16,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:16,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:16,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:43:17,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:17,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 01:43:17,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:43:17,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:17,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:17,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:17,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 01:43:18,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:18,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 01:43:18,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:18,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:18,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 01:43:18,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:18,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:43:18,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 01:43:18,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 01:43:18,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:19,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:19,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:19,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:19,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:19,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:43:19,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:19,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:19,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:19,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:19,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:43:20,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:20,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 01:43:20,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:20,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 01:43:20,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:20,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:21,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 01:43:21,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 01:43:22,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:22,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:22,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:22,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:22,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 01:43:22,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:22,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:43:22,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 01:43:22,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:22,899 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 01:43:23,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 01:43:23,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:23,250 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 01:43:23,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:43:23,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 01:43:23,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 01:43:23,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 01:43:23,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:23,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:23,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:23,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 01:43:23,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:23,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:24,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:24,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:43:25,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 01:43:25,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:25,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:43:25,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:25,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 01:43:25,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 01:43:25,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:25,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:43:25,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:43:25,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:25,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:43:25,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:43:25,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:43:26,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 01:43:26,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:43:26,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:26,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:26,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:26,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 01:43:26,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:26,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 01:43:26,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:26,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 01:43:26,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 01:43:26,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:26,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:27,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 01:43:27,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:43:27,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:27,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:27,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:27,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:28,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:43:28,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:28,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 01:43:29,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:29,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 01:43:29,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:29,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:29,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 01:43:29,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:30,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:43:30,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:31,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:43:31,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 01:43:32,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:32,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 01:43:32,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:43:32,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:32,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:43:32,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:33,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 01:43:33,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 01:43:33,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 01:43:34,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 01:43:34,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:43:34,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:34,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:43:34,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:34,932 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 01:43:34,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:43:35,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:35,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 01:43:35,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 01:43:35,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 01:43:35,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 01:43:35,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:43:35,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:43:35,933 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 01:43:35,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:35,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:36,074 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 01:43:36,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:43:36,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:37,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:37,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 01:43:37,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:37,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:37,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:37,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:37,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:43:38,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:38,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:43:38,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:38,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:38,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:38,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:38,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:38,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:43:39,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:39,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:39,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:39,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:39,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:39,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 01:43:39,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:39,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:43:39,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:39,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:43:40,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:43:40,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:40,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:40,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 01:43:40,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:43:40,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:43:40,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:41,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 01:43:41,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 01:43:41,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:41,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:41,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:41,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:43:41,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:41,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:41,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:41,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 01:43:42,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:42,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:43:42,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 01:43:43,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:43,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 01:43:43,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 01:43:43,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 01:43:43,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:43,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:43,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:43,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:43,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:43:43,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:43:44,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:44,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:44,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 01:43:44,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:44,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:44,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:44,789 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 01:43:44,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:44,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:43:45,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:43:45,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:45,050 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 01:43:45,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:45,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:43:45,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:45,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 01:43:45,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 01:43:45,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:45,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:43:45,916 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 01:43:46,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 01:43:46,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:46,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 01:43:46,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:43:46,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:43:47,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:47,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:47,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:47,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:47,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:43:47,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:43:47,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:47,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:43:47,979 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 01:43:48,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 01:43:48,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:43:48,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:48,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:48,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:48,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:48,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:43:48,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:48,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:43:48,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:48,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:43:49,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 01:43:49,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:49,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:49,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:43:49,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:43:49,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:49,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:43:49,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:49,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:50,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:50,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:43:50,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:50,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:43:50,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:50,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:43:50,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 01:43:50,878 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 01:43:50,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:51,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 01:43:51,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 01:43:51,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:43:51,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:43:52,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:52,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 01:43:52,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:52,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:43:52,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:52,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:52,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:52,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:53,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:53,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:53,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:43:53,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:53,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:53,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:53,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:53,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 01:43:53,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:43:54,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 01:43:54,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:43:54,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 01:43:54,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:54,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:43:54,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:55,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 01:43:55,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:43:55,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:43:55,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:43:55,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:56,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 01:43:56,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:43:56,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:43:56,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:56,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 01:43:56,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:43:56,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 01:43:56,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:56,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:43:57,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:43:57,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:43:57,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 01:43:57,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 01:43:57,518 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 01:43:57,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:43:57,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:43:57,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:43:57,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:58,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:43:58,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:43:58,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 01:43:58,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:43:59,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:43:59,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:43:59,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:43:59,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:43:59,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 01:44:00,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:01,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:01,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 01:44:01,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:01,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:01,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:01,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:44:01,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:01,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:44:01,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:02,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:02,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 01:44:02,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:44:03,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 01:44:03,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 01:44:03,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:03,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:44:03,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 01:44:03,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:03,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:44:04,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:44:04,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:04,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:04,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:05,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:05,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 01:44:06,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 01:44:06,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:44:06,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:44:06,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:06,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 01:44:06,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:44:07,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:07,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:07,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:44:07,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:44:07,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 01:44:07,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:07,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:07,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:07,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 01:44:08,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:44:08,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:08,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:09,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:09,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:09,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:10,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:10,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:10,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:10,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:44:10,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 01:44:10,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:44:11,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:44:11,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:11,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 01:44:11,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:11,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:11,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:11,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:11,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:44:12,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:12,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:12,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 01:44:12,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:12,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:44:12,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:12,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:12,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 01:44:12,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:13,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:13,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:44:13,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:13,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:13,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:14,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 01:44:14,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 01:44:14,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 01:44:14,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:14,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:14,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:14,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:44:14,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:44:14,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:44:15,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:15,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 01:44:15,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 01:44:15,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:44:15,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:15,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:15,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:16,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 01:44:16,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:16,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:16,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 01:44:16,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 01:44:16,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:16,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:16,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:16,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:17,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:44:17,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:18,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:44:18,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:18,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:44:18,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:19,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:19,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:44:19,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:44:19,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:44:19,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:19,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:44:19,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:44:19,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:44:19,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:44:20,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:44:20,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:20,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:20,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 01:44:20,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:20,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:20,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 01:44:20,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:20,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:20,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:21,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 01:44:21,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:44:21,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 01:44:21,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:44:21,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:44:21,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:21,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 01:44:21,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:22,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:22,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 01:44:23,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:23,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:23,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 01:44:23,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 01:44:23,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:24,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:44:24,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:24,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:24,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:24,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:24,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:24,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:44:24,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:44:25,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:25,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 01:44:25,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:44:25,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:25,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:44:25,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:25,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:44:26,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:26,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 01:44:26,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:44:26,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:26,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:44:26,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:26,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:27,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:27,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 01:44:28,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:28,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:44:28,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 01:44:28,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:44:28,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:29,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:29,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:44:29,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:44:29,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 01:44:29,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 01:44:29,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 01:44:30,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:30,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:30,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 01:44:30,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:30,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:30,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:30,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 01:44:30,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 01:44:30,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:30,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 01:44:31,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 01:44:31,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:32,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 01:44:32,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 01:44:32,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:32,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:44:32,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:44:33,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 01:44:33,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:33,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 01:44:33,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:44:33,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:33,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 01:44:33,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:44:33,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:33,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:33,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:33,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 01:44:33,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 01:44:33,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:34,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 01:44:34,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:34,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:34,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:44:34,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:34,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:44:34,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:44:35,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:44:35,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:35,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:35,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:35,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:35,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:44:35,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:35,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:36,940 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 01:44:37,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:37,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:37,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:44:37,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:37,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:44:37,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:37,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 01:44:37,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:37,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:44:38,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:38,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:44:38,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:38,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 01:44:38,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:38,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:44:38,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:44:38,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:44:38,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:38,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:39,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:44:39,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:39,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:44:39,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:39,391 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 01:44:39,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:44:39,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:44:40,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:44:40,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 01:44:40,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:41,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:41,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 01:44:41,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:41,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:41,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:41,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:44:41,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:44:42,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:42,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 01:44:42,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:42,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 01:44:42,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:42,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:42,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:42,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 01:44:43,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:43,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:44:43,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:43,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:43,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:43,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 01:44:43,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 01:44:43,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:44:43,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:43,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:44:43,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:44:44,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:44,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:44,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:44,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 01:44:44,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 01:44:44,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:44:44,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 01:44:44,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:45,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:45,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:45,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:46,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:46,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:44:46,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:44:46,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:46,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:46,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 01:44:46,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:46,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:46,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:47,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 01:44:47,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 01:44:47,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:44:47,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:44:47,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:48,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:48,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:44:48,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 01:44:48,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:49,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:44:49,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:44:50,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:44:50,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 01:44:50,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:44:50,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:44:50,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:44:50,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:44:51,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:44:51,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:51,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:44:52,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 01:44:52,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:52,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:52,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:44:52,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:44:52,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:52,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:44:52,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:44:52,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:44:52,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:52,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 01:44:53,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:44:53,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:44:53,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:53,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:44:54,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:44:54,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:44:54,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:54,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:54,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:44:55,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:44:55,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 01:44:55,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:55,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:56,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:44:56,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 01:44:56,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 01:44:56,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:56,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:44:56,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:56,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 01:44:57,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 01:44:57,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 01:44:57,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:57,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:44:57,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:44:57,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:44:58,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:44:58,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 01:44:59,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:44:59,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:44:59,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:44:59,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:44:59,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:44:59,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 01:45:00,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:00,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:00,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:00,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:45:00,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:01,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:01,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:01,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:45:01,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:01,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:01,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:01,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:45:01,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 01:45:01,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 01:45:01,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:01,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:01,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:01,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:02,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 01:45:02,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 01:45:02,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:02,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 01:45:02,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 01:45:03,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:03,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:04,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:04,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 01:45:04,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 01:45:04,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:04,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:04,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:45:04,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:45:04,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:04,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:04,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:04,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 01:45:05,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:05,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 01:45:05,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:05,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:05,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:45:05,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:05,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:45:05,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:05,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:05,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:05,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 01:45:05,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:05,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:06,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:45:06,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:45:06,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:06,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:45:06,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:06,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:45:06,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 01:45:06,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:06,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 01:45:06,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:06,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 01:45:07,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 01:45:08,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:08,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:08,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:45:08,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:45:08,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:08,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:08,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:08,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:08,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:09,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:09,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:10,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:45:10,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:45:10,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:11,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:45:11,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 01:45:11,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 01:45:11,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:45:11,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 01:45:11,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:11,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 01:45:12,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:12,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 01:45:12,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:13,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:45:13,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:13,643 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 01:45:13,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:45:13,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 01:45:13,793 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 01:45:13,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:14,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:14,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:45:14,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:14,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 01:45:14,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:14,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:45:14,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:45:14,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:45:14,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:45:15,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:15,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:45:15,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 01:45:16,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 01:45:16,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 01:45:16,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:17,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:45:17,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:45:17,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:45:17,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:17,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:45:17,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 01:45:18,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:18,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:18,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 01:45:19,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:19,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:20,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:20,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:20,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:20,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 01:45:20,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:45:20,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 01:45:20,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:45:20,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 01:45:20,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:20,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:45:20,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:20,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:20,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:20,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:45:21,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:45:21,822 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 01:45:21,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:21,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:22,166 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 01:45:22,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:45:22,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:22,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 01:45:22,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:22,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:22,989 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 01:45:23,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:23,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 01:45:23,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:23,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:23,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:45:23,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:45:23,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:45:23,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:23,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 01:45:23,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:23,890 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 01:45:24,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:45:24,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 01:45:24,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:45:24,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:24,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:24,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:25,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:26,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:45:26,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 01:45:26,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:45:26,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:26,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:45:26,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:45:26,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:26,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:27,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:27,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 01:45:27,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:45:27,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:27,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:27,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:45:28,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 01:45:28,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 01:45:28,210 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 01:45:28,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:28,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 01:45:28,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:29,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:29,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:29,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:29,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:29,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:30,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 01:45:30,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:45:30,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:31,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 01:45:31,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:31,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 01:45:31,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:31,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:45:32,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 01:45:32,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 01:45:32,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:32,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:32,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:32,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:33,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 01:45:33,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 01:45:33,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 01:45:33,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 01:45:33,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:33,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:33,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:33,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:45:33,662 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 01:45:33,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:34,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:45:34,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:34,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:45:34,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:45:34,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:34,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:34,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 01:45:35,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:35,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:45:35,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:35,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:35,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 01:45:35,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:35,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 01:45:35,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:36,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:36,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 01:45:36,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:36,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:45:36,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:45:36,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 01:45:36,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:45:36,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:45:36,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 01:45:36,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:36,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:45:37,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:37,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:37,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:37,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:38,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:38,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:38,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:45:38,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:45:39,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:45:39,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:45:40,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:40,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:45:40,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 01:45:40,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:40,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 01:45:40,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 01:45:40,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 01:45:40,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:40,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:45:40,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:41,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:41,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:41,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:45:41,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:45:41,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:45:41,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:45:42,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:42,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:42,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 01:45:42,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 01:45:42,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:45:42,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 01:45:42,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:45:42,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:42,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:43,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:43,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 01:45:43,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:45:44,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:44,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 01:45:44,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:44,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 01:45:44,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:45:44,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:44,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:45,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 01:45:45,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:45,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 01:45:45,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:45:45,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 01:45:45,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:45,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:45:45,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:45:45,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 01:45:45,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:45,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 01:45:45,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:45,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:45:45,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 01:45:45,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 01:45:46,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:45:46,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 01:45:46,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:45:46,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:46,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:45:46,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:46,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:45:47,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 01:45:47,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 01:45:47,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:47,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:47,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:47,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:45:48,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:48,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:48,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 01:45:48,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:45:48,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:48,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:49,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:45:49,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:45:49,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 01:45:49,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:49,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:45:49,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:45:49,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:45:49,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:50,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:45:50,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 01:45:50,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:50,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:50,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:45:50,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:51,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:45:51,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 01:45:51,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:52,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:45:52,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:52,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 01:45:53,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:45:53,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:45:53,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:45:53,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:54,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:45:54,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 01:45:54,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:54,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:54,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:55,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:55,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:55,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:45:55,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:55,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:55,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:55,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:55,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:45:55,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:56,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 01:45:56,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 01:45:56,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:56,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:45:56,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:45:56,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:45:56,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:56,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:45:57,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:45:57,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:45:57,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:58,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:58,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 01:45:58,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:45:58,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 01:45:58,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:45:58,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:45:58,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:58,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:45:58,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 01:45:58,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:45:58,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:45:59,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:45:59,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:45:59,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:45:59,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:45:59,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:45:59,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:00,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:00,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 01:46:00,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:00,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:00,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:00,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:02,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:02,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 01:46:02,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:46:02,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:46:02,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:46:02,400 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 01:46:02,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:46:02,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:46:03,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 01:46:03,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:46:03,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 01:46:03,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:46:03,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:46:03,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:46:03,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:03,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:46:03,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:46:03,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:03,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 01:46:03,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 01:46:04,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:04,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:04,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:46:04,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:04,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:46:04,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 01:46:04,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 01:46:04,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 01:46:04,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:04,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 01:46:04,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 01:46:04,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:05,087 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 01:46:05,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:46:05,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:05,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:05,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 01:46:05,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:46:05,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:05,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:05,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:05,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:06,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:06,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:06,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:06,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:07,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 01:46:07,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:46:07,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:07,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:08,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:46:08,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:08,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:46:08,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:08,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:46:08,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:09,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:46:09,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:09,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:46:09,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 01:46:09,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:10,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:11,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:11,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 01:46:11,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:11,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:46:12,024 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 01:46:12,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:12,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:46:12,279 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 01:46:12,335 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 01:46:12,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:12,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:12,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:46:12,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:12,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:12,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:12,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 01:46:12,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:12,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:12,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:12,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 01:46:12,955 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 01:46:12,958 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 01:46:12,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 01:46:13,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:13,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:46:13,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:13,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:13,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 01:46:13,678 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 01:46:13,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:14,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:14,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:14,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:14,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 01:46:14,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 01:46:14,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 01:46:14,427 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 01:46:14,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:46:14,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:14,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 01:46:15,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:15,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:15,624 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 01:46:15,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:15,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 01:46:15,934 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 01:46:16,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 01:46:16,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 01:46:16,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 01:46:16,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:16,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:16,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:16,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:16,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 01:46:16,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 01:46:16,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:16,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:46:16,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:16,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:17,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:17,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 01:46:17,128 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 01:46:17,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:17,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:18,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 01:46:18,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:46:18,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:18,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:46:18,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 01:46:18,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:46:18,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:46:18,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:46:18,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:46:19,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 01:46:20,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 01:46:20,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 01:46:20,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:20,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 01:46:20,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:46:20,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:46:20,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 01:46:20,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:21,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:21,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:46:21,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:21,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:21,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:22,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:22,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:22,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:22,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 01:46:22,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:22,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:22,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:22,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:46:22,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:46:23,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:23,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:23,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:23,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:23,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:24,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:46:24,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 01:46:24,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 01:46:24,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:46:24,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:24,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 01:46:25,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:46:25,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:25,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 01:46:25,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:25,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:25,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:25,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:25,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:25,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:46:25,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 01:46:26,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:46:26,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:26,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:26,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:26,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:46:26,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:46:26,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 01:46:27,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:46:27,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:27,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 01:46:27,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 01:46:27,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:27,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:28,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:28,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:46:28,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:28,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:28,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:29,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:29,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:30,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:30,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:46:30,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:46:30,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:46:30,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:46:31,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:46:31,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:46:31,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 01:46:31,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:31,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:31,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:31,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:32,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:32,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 01:46:32,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:46:32,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:32,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:46:32,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:46:33,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:33,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:46:33,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:33,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 01:46:33,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 01:46:33,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 01:46:34,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 01:46:34,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 01:46:34,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:34,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:34,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:34,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:35,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:35,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:35,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:35,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:46:35,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:35,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:36,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:36,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 01:46:36,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 01:46:36,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:46:36,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 01:46:36,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 01:46:36,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:37,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 01:46:37,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:46:38,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:38,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:38,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:46:38,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 01:46:38,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:38,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:38,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:46:38,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:38,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 01:46:39,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 01:46:39,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:39,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 01:46:39,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 01:46:39,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:39,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:39,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:39,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:39,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:46:39,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:46:39,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 01:46:39,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:39,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 01:46:39,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 01:46:39,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:46:40,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 01:46:40,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:46:40,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:40,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:40,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:40,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:46:40,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:40,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:46:41,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:41,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:41,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:46:41,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:46:41,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:42,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 01:46:42,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:46:42,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:46:42,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:42,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:43,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 01:46:43,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:43,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:43,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:43,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:44,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 01:46:44,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:46:44,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:44,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:44,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:46:45,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:46:45,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 01:46:45,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:46:45,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:45,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:46,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:46,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 01:46:46,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:46,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 01:46:46,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:46,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:46,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:47,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:47,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:47,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 01:46:47,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 01:46:47,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 01:46:47,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:47,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:47,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:47,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:48,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:46:48,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:48,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:48,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:48,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:48,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:48,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:46:49,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 01:46:49,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:46:49,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 01:46:49,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:46:49,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 01:46:49,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:46:49,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 01:46:49,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 01:46:49,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:49,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:46:50,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:46:50,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:50,129 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 01:46:50,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:50,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 01:46:51,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:51,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:46:51,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 01:46:51,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:51,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:46:52,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:52,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:52,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:52,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:52,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:46:53,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 01:46:53,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 01:46:53,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 01:46:53,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 01:46:53,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:53,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:46:53,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:53,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 01:46:53,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:53,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:53,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:46:54,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:46:54,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:46:54,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:54,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:46:54,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:54,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:46:54,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:46:54,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 01:46:54,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:46:54,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 01:46:56,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:56,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:46:56,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:46:56,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:56,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 01:46:56,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 01:46:56,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 01:46:57,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:57,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:46:57,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:46:57,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:46:58,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:46:58,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 01:46:58,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:46:58,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 01:46:58,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:46:59,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:46:59,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 01:46:59,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:00,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:00,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 01:47:00,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:00,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:00,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:00,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:00,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 01:47:00,963 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 01:47:01,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:01,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:01,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:01,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 01:47:01,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:01,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 01:47:01,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:01,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:02,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:02,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:02,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 01:47:02,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:02,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 01:47:02,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:02,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:03,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:03,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 01:47:03,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:03,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 01:47:03,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:04,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:04,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:04,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:04,985 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 01:47:04,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 01:47:05,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 01:47:05,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:05,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:05,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:47:05,947 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 01:47:05,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:06,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:47:06,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:47:06,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 01:47:06,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 01:47:06,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:06,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:47:06,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:06,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:47:06,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 01:47:06,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 01:47:07,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:07,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:07,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 01:47:07,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:07,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:07,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:47:07,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:07,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 01:47:08,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:08,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 01:47:08,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 01:47:08,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 01:47:08,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:47:08,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:09,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 01:47:09,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:09,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:10,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:47:10,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:10,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:10,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 01:47:10,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:10,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:47:10,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:47:10,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:10,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:11,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:47:11,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:11,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 01:47:11,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:11,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:11,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:11,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:11,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:11,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:47:11,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 01:47:12,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:12,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 01:47:12,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 01:47:12,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:12,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:12,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:47:12,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:12,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:47:12,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:12,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:12,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:13,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:13,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:14,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 01:47:14,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:14,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:14,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:47:14,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:14,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:14,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:47:14,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:14,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:14,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:47:14,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:47:15,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:15,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:15,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:15,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 01:47:15,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 01:47:15,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:15,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:15,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:15,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:15,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:16,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 01:47:16,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:16,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:47:17,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:47:17,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:17,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:17,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:47:18,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:47:18,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 01:47:18,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:18,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:18,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 01:47:18,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:47:18,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 01:47:18,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:47:18,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:19,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:19,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 01:47:19,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 01:47:19,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:47:20,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:20,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 01:47:20,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:20,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:47:20,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:47:20,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 01:47:20,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:20,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 01:47:20,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:20,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:20,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 01:47:21,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:21,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:47:21,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:22,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 01:47:22,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:23,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:23,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:23,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:47:23,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 01:47:24,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 01:47:24,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 01:47:24,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 01:47:24,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:47:24,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:24,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:24,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:24,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:47:24,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 01:47:24,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 01:47:24,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:47:25,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:47:25,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:47:25,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:25,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:25,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:25,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 01:47:25,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:47:25,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:25,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 01:47:25,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 01:47:26,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 01:47:26,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:47:26,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:26,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:27,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:27,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 01:47:27,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:47:27,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 01:47:27,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:27,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:47:27,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:27,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 01:47:28,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:47:28,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 01:47:28,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 01:47:28,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:28,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:47:28,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 01:47:28,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:47:28,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:47:28,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:29,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:29,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:47:29,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:29,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 01:47:29,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:47:29,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:29,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 01:47:29,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 01:47:30,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 01:47:30,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:47:30,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 01:47:30,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:30,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:30,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:47:30,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:30,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:30,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:30,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:30,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:31,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:31,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 01:47:32,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:32,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:47:32,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:32,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:32,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:32,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 01:47:32,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 01:47:33,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:33,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:47:33,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:33,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:47:33,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:33,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 01:47:33,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:33,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:47:33,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:33,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:34,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:34,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 01:47:34,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:47:34,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:47:34,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:34,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:47:34,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:47:34,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:34,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:47:34,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:47:35,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:35,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:47:35,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:35,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 01:47:36,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:36,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 01:47:36,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:47:37,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:37,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:47:37,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 01:47:37,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 01:47:37,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:37,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 01:47:37,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:47:37,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:47:37,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 01:47:37,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:37,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:47:37,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 01:47:37,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:37,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:38,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 01:47:38,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 01:47:38,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:47:38,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 01:47:38,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:47:38,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:38,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:47:38,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 01:47:38,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 01:47:38,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 01:47:38,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:38,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:38,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 01:47:38,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:47:38,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:38,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:39,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:47:39,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 01:47:39,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:39,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:47:39,649 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 01:47:40,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:47:40,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:40,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:40,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 01:47:40,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:40,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:41,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:41,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 01:47:41,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:41,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:47:41,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:41,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 01:47:42,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:42,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:43,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:43,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:47:43,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:43,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:47:43,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:47:43,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:43,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:43,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 01:47:43,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:47:43,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:43,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:43,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 01:47:44,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:44,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:44,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:47:45,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:47:45,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:47:45,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 01:47:45,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:47:45,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:46,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 01:47:46,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:47:46,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:46,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:46,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:47:46,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 01:47:46,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 01:47:46,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:46,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:46,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:46,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 01:47:47,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:47,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 01:47:47,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:47:47,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:47:47,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:47,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:47,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:47,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:47,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:47,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:47,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:48,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 01:47:48,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:47:48,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:47:48,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:48,559 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 01:47:48,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:47:48,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:48,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:48,784 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 01:47:49,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:49,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 01:47:49,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:49,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:49,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:50,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 01:47:50,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 01:47:50,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:50,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:50,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:47:50,591 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 01:47:50,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:47:50,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 01:47:51,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 01:47:51,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:51,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:47:51,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:47:51,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 01:47:51,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 01:47:51,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:47:51,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:47:52,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:52,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 01:47:52,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:52,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:52,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 01:47:52,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:52,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:52,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 01:47:53,144 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 01:47:53,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:53,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 01:47:53,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 01:47:54,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:47:54,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:54,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 01:47:55,015 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 01:47:55,033 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 01:47:55,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 01:47:55,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:47:55,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 01:47:55,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 01:47:55,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 01:47:55,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 01:47:56,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 01:47:56,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:47:56,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 01:47:56,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:47:56,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:47:56,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:47:56,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:47:56,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 01:47:56,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:47:56,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 01:47:56,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 01:47:57,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 01:47:57,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:47:57,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 01:47:57,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:57,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:47:57,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:57,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:47:57,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:47:58,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 01:47:58,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:47:58,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:47:58,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 01:47:58,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:47:58,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:47:58,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:47:58,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:47:58,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 01:47:58,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:47:59,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:47:59,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 01:47:59,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 01:47:59,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:47:59,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:48:00,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 01:48:00,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:00,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:48:00,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:01,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:01,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:48:01,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 01:48:01,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:01,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 01:48:01,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 01:48:02,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:02,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:48:02,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 01:48:03,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:03,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:48:03,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:48:03,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:48:03,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:48:03,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:48:03,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:48:03,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:03,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:48:04,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:48:04,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:04,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 01:48:04,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:04,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:04,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:04,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:48:05,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:05,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 01:48:05,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:48:05,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:05,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 01:48:05,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:48:05,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:48:05,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 01:48:05,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 01:48:06,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 01:48:06,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:06,170 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 01:48:06,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:06,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:06,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:48:06,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 01:48:06,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:48:06,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:07,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 01:48:07,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 01:48:07,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 01:48:07,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 01:48:07,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:48:08,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:48:08,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:08,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 01:48:08,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:08,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 01:48:08,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:08,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:48:08,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:48:09,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:48:09,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:09,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:48:09,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:09,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:09,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 01:48:09,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:48:09,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:09,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:10,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 01:48:10,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:48:10,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:48:10,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:48:10,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:48:10,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:48:11,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:11,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:48:12,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:48:12,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:48:12,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 01:48:12,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:48:12,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:15,246 INFO [scaling.py:1022] (3/4) Whitening: name=None, num_groups=8, num_channels=256, metric=6.26 vs. limit=3.0 2023-10-02 01:48:15,931 INFO [train.py:1386] (3/4) Maximum memory allocated so far is 20808MB 2023-10-02 01:48:18,276 INFO [train.py:1386] (3/4) Maximum memory allocated so far is 20808MB 2023-10-02 01:48:21,312 INFO [train.py:1386] (3/4) Maximum memory allocated so far is 20808MB 2023-10-02 01:48:23,307 INFO [train.py:1386] (3/4) Maximum memory allocated so far is 20808MB 2023-10-02 01:48:29,317 INFO [train.py:1386] (3/4) Maximum memory allocated so far is 20808MB 2023-10-02 01:48:32,206 INFO [train.py:1386] (3/4) Maximum memory allocated so far is 20808MB 2023-10-02 01:48:32,236 INFO [train.py:1267] (3/4) Loading grad scaler state dict 2023-10-02 01:48:49,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:48:49,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 01:48:49,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 01:48:49,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:49,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:49,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:49,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:49,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:49,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:48:49,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:49,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 01:48:50,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:48:50,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 01:48:50,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 01:48:50,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 01:48:50,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 01:48:50,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 01:48:50,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 01:48:50,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:50,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:50,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:50,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:51,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:48:51,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:48:51,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:48:51,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:51,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:48:51,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:48:51,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:48:51,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:51,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:48:52,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 01:48:52,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:48:52,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:48:52,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 01:48:52,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 01:48:52,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:48:52,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:48:52,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 01:48:53,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 01:48:53,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:48:53,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:48:53,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:48:53,734 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 01:48:53,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 01:48:53,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:48:54,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:48:54,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 01:48:54,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 01:48:54,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 01:48:54,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 01:48:57,877 INFO [train.py:1046] (3/4) Epoch 21, batch 0, loss[loss=0.1635, simple_loss=0.2459, pruned_loss=0.04051, over 24288.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2459, pruned_loss=0.04051, over 24288.00 frames. ], batch size: 61, lr: 4.96e-03, grad_scale: 16.0 2023-10-02 01:48:57,877 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 01:49:10,089 INFO [train.py:1078] (3/4) Epoch 21, validation: loss=0.2779, simple_loss=0.2712, pruned_loss=0.1423, over 1125622.00 frames. 2023-10-02 01:49:10,090 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 20808MB 2023-10-02 01:49:13,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 01:49:13,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:49:13,825 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.46 vs. limit=12.0 2023-10-02 01:49:16,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:49:19,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:49:20,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:49:20,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:20,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 01:49:23,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 01:49:25,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:25,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:27,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=708346.6666666666, ans=0.125 2023-10-02 01:49:28,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:49:28,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:49:28,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 01:49:29,763 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.851e+02 2.013e+02 2.311e+02 4.182e+02, threshold=4.026e+02, percent-clipped=1.0 2023-10-02 01:49:29,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:49:31,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 01:49:34,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:49:41,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:49:41,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:49:43,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 01:49:47,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:49:47,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:49:50,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:49:54,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:49:57,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:49:57,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=708480.0, ans=0.125 2023-10-02 01:50:01,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 01:50:02,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=708480.0, ans=0.07 2023-10-02 01:50:02,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=708480.0, ans=0.125 2023-10-02 01:50:04,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 01:50:04,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:50:04,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:50:06,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:50:06,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:50:07,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 01:50:10,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:50:10,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:50:15,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:50:16,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=708546.6666666666, ans=0.0 2023-10-02 01:50:18,051 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 01:50:18,640 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.84 vs. limit=15.0 2023-10-02 01:50:19,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:50:22,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:50:24,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:50:26,153 INFO [train.py:1046] (3/4) Epoch 21, batch 50, loss[loss=0.1598, simple_loss=0.2356, pruned_loss=0.042, over 23216.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2533, pruned_loss=0.05061, over 1062979.82 frames. ], batch size: 105, lr: 4.96e-03, grad_scale: 16.0 2023-10-02 01:50:26,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 01:50:26,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 01:50:26,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:50:28,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:50:29,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:50:31,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=708613.3333333334, ans=10.0 2023-10-02 01:50:31,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:50:32,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=708613.3333333334, ans=0.0 2023-10-02 01:50:36,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 01:50:36,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:50:42,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:50:43,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 01:50:45,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 01:50:47,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:50:48,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:50:48,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:50:50,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:50:50,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:50:51,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 01:50:51,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:50:56,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=708746.6666666666, ans=0.125 2023-10-02 01:51:02,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:51:03,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:51:03,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 01:51:05,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 01:51:06,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 01:51:08,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:51:08,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 01:51:09,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:51:11,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 01:51:12,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=708813.3333333334, ans=0.1 2023-10-02 01:51:17,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:51:17,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:51:18,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:51:18,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:51:18,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:51:23,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 01:51:23,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 01:51:25,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:51:25,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:51:27,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:51:28,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:51:28,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 01:51:29,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 01:51:30,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 01:51:32,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:51:34,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:51:35,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 01:51:35,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 01:51:36,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:51:38,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:51:39,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:51:39,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:51:40,808 INFO [train.py:1046] (3/4) Epoch 21, batch 100, loss[loss=0.1689, simple_loss=0.2578, pruned_loss=0.03996, over 24637.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2543, pruned_loss=0.05064, over 1864815.94 frames. ], batch size: 68, lr: 4.96e-03, grad_scale: 16.0 2023-10-02 01:51:42,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:51:43,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:51:48,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:51:49,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 01:51:49,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:51:54,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 01:51:54,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:51:54,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 01:51:54,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:51:54,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:51:57,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 01:51:58,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 01:52:00,005 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.895e+02 2.091e+02 2.365e+02 3.412e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 01:52:00,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:52:00,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:00,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:52:02,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 01:52:04,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:52:04,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:04,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 01:52:08,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 01:52:11,427 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 01:52:11,440 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 01:52:12,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:12,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:52:16,263 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.98 vs. limit=15.0 2023-10-02 01:52:17,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 01:52:19,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:52:20,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:25,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:26,700 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 01:52:28,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 01:52:30,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:52:32,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:52:33,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:33,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=709146.6666666666, ans=0.95 2023-10-02 01:52:36,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:38,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:52:40,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:52:40,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=709213.3333333334, ans=0.125 2023-10-02 01:52:41,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:42,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=709213.3333333334, ans=0.0 2023-10-02 01:52:43,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:44,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:44,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:52:44,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:52:44,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 01:52:44,591 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 01:52:44,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:45,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:52:46,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:52:46,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:46,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 01:52:46,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:52:48,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 01:52:48,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:52:49,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:52:50,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:52,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:52:52,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:52:54,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:52:55,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=709280.0, ans=0.5 2023-10-02 01:52:56,948 INFO [train.py:1046] (3/4) Epoch 21, batch 150, loss[loss=0.159, simple_loss=0.2462, pruned_loss=0.03593, over 24466.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2552, pruned_loss=0.05148, over 2491471.52 frames. ], batch size: 66, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 01:52:57,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:52:57,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:52:57,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:52:59,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:52:59,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:01,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 01:53:02,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:07,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 01:53:07,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 01:53:07,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 01:53:12,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.65 vs. limit=15.0 2023-10-02 01:53:12,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:53:12,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 01:53:12,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:53:13,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:53:13,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:53:14,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:15,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:53:18,143 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 01:53:19,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:53:26,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:53:29,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 01:53:30,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 01:53:30,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=709413.3333333334, ans=0.125 2023-10-02 01:53:33,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:53:33,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:53:33,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:53:36,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:53:36,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:53:38,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:53:41,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:53:41,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 01:53:46,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:53:46,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:53:47,565 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.46 vs. limit=15.0 2023-10-02 01:53:48,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:53:48,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:53:49,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:53:51,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 01:53:52,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 01:53:54,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:53:56,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:53:57,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:53:57,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 01:53:59,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:53:59,166 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 01:54:03,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:54:06,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:54:06,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:54:09,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 01:54:09,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:54:11,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:12,628 INFO [train.py:1046] (3/4) Epoch 21, batch 200, loss[loss=0.1812, simple_loss=0.2666, pruned_loss=0.04792, over 24376.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2557, pruned_loss=0.05174, over 2993550.46 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:54:12,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 01:54:12,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 01:54:14,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:14,819 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.12 vs. limit=10.0 2023-10-02 01:54:15,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:54:20,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:54:21,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:54:21,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:32,804 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.906e+02 2.104e+02 2.322e+02 3.848e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-02 01:54:40,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:54:40,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:54:43,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 01:54:43,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:54:45,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 01:54:45,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:54:46,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:54:47,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:54:47,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:54:47,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:54:49,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 01:54:49,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 01:54:50,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:54:53,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:54:55,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=709813.3333333334, ans=0.125 2023-10-02 01:55:01,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:55:09,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:09,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 01:55:18,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:20,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 01:55:20,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:55:22,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:55:22,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:55:23,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 01:55:23,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=709880.0, ans=0.025 2023-10-02 01:55:26,253 INFO [train.py:1046] (3/4) Epoch 21, batch 250, loss[loss=0.1769, simple_loss=0.2515, pruned_loss=0.05111, over 23283.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2548, pruned_loss=0.05143, over 3375753.44 frames. ], batch size: 119, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:55:26,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 01:55:26,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:55:26,391 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 01:55:29,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:29,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 01:55:30,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:31,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:55:35,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:55:35,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:55:36,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:55:41,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:55:47,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=710013.3333333334, ans=0.1 2023-10-02 01:55:50,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:55:52,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:55:53,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 01:55:59,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 01:55:59,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=710080.0, ans=0.1 2023-10-02 01:56:00,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 01:56:00,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:56:00,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:56:02,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 01:56:02,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 01:56:03,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:56:05,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 01:56:06,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 01:56:08,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:56:10,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 01:56:11,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 01:56:11,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:56:11,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:56:14,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 01:56:14,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 01:56:15,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:56:17,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 01:56:18,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:56:22,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=710146.6666666666, ans=0.0 2023-10-02 01:56:23,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 01:56:27,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:56:30,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 01:56:36,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:56:37,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:56:39,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 01:56:40,522 INFO [train.py:1046] (3/4) Epoch 21, batch 300, loss[loss=0.1706, simple_loss=0.2426, pruned_loss=0.04931, over 16779.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2538, pruned_loss=0.05028, over 3680401.11 frames. ], batch size: 36, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:56:41,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:56:41,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 01:56:42,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 01:56:42,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 01:56:44,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:56:44,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 01:56:47,894 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=15.0 2023-10-02 01:56:49,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:56:51,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:56:54,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 01:56:56,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 01:56:56,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:56:57,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 01:56:57,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 01:56:57,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:57:02,047 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.853e+02 2.062e+02 2.397e+02 3.479e+02, threshold=4.124e+02, percent-clipped=0.0 2023-10-02 01:57:02,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 01:57:04,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 01:57:04,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 01:57:09,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 01:57:09,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:12,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:57:13,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:13,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 01:57:13,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 01:57:17,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:57:20,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:57:20,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:57:23,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=710413.3333333334, ans=0.07 2023-10-02 01:57:25,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 01:57:25,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 01:57:26,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 01:57:28,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:29,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 01:57:29,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:57:32,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:57:35,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:57:35,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 01:57:35,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=710480.0, ans=0.0 2023-10-02 01:57:39,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:39,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 01:57:42,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:44,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:57:45,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 01:57:45,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 01:57:46,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:57:48,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 01:57:48,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:57:48,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:57:49,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:57:51,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:57:51,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:57:53,407 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 01:57:53,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=710546.6666666666, ans=0.5 2023-10-02 01:57:56,571 INFO [train.py:1046] (3/4) Epoch 21, batch 350, loss[loss=0.1969, simple_loss=0.2585, pruned_loss=0.06769, over 23595.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2526, pruned_loss=0.05025, over 3915839.22 frames. ], batch size: 256, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 01:57:56,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:57:56,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 01:58:00,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:01,524 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.92 vs. limit=15.0 2023-10-02 01:58:05,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:58:08,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:08,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:08,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=710613.3333333334, ans=0.09899494936611666 2023-10-02 01:58:11,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 01:58:11,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:58:13,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 01:58:16,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:16,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 01:58:16,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:58:20,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 01:58:22,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 01:58:22,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 01:58:24,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:58:26,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:58:26,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:58:26,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:58:27,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:27,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 01:58:30,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:58:30,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:35,167 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.93 vs. limit=15.0 2023-10-02 01:58:35,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:58:35,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 01:58:37,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:58:38,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:39,433 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.84 vs. limit=15.0 2023-10-02 01:58:43,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 01:58:43,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 01:58:44,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.30 vs. limit=6.0 2023-10-02 01:58:47,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:58:47,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:58:47,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:58:49,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 01:58:53,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:58:54,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.02 vs. limit=6.0 2023-10-02 01:58:55,153 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 01:58:56,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 01:58:56,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:58:58,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=710880.0, ans=0.125 2023-10-02 01:59:00,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 01:59:00,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 01:59:02,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:04,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 01:59:07,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:07,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:07,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:59:10,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 01:59:10,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=710946.6666666666, ans=0.125 2023-10-02 01:59:11,356 INFO [train.py:1046] (3/4) Epoch 21, batch 400, loss[loss=0.1797, simple_loss=0.2632, pruned_loss=0.0481, over 24308.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2514, pruned_loss=0.04966, over 4099163.83 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 01:59:12,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 01:59:14,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 01:59:14,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 01:59:15,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:15,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:59:17,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=710946.6666666666, ans=0.2 2023-10-02 01:59:19,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 01:59:19,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:19,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=710946.6666666666, ans=0.2 2023-10-02 01:59:21,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:23,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:24,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 01:59:26,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 01:59:26,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:59:26,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 01:59:26,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:30,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 01:59:30,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:59:30,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 01:59:31,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 01:59:31,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 01:59:32,633 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.816e+02 1.987e+02 2.321e+02 3.446e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 01:59:32,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 01:59:32,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 01:59:34,197 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 01:59:34,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 01:59:38,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 01:59:39,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 01:59:41,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 01:59:41,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 01:59:44,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 01:59:46,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=711080.0, ans=0.125 2023-10-02 01:59:48,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 01:59:54,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 01:59:57,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 01:59:59,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 02:00:00,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:00:03,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:00:03,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 02:00:06,139 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.26 vs. limit=12.0 2023-10-02 02:00:07,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:00:09,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 02:00:10,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:00:13,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:00:13,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 02:00:16,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 02:00:17,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 02:00:19,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:00:19,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:00:21,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 02:00:22,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:00:24,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:00:24,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:00:25,796 INFO [train.py:1046] (3/4) Epoch 21, batch 450, loss[loss=0.1816, simple_loss=0.2669, pruned_loss=0.04814, over 24376.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2514, pruned_loss=0.04972, over 4241288.14 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 02:00:25,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 02:00:25,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:00:26,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:00:26,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=711280.0, ans=0.0 2023-10-02 02:00:27,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:00:27,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 02:00:28,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:00:30,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:00:30,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=711280.0, ans=0.2 2023-10-02 02:00:31,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:00:41,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:00:42,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:00:44,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 02:00:44,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 02:00:48,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:00:50,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:00:51,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:00:54,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:00:55,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=711413.3333333334, ans=0.125 2023-10-02 02:00:55,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=711413.3333333334, ans=0.0 2023-10-02 02:00:56,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:00:58,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 02:00:58,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 02:01:00,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 02:01:00,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:01:01,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:01:03,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:01:04,621 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 02:01:04,630 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 02:01:04,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:01:06,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:01:07,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 02:01:12,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:01:13,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:01:13,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 02:01:13,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 02:01:15,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=711480.0, ans=0.015 2023-10-02 02:01:18,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:01:20,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:01:20,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:01:23,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 02:01:23,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=711480.0, ans=0.125 2023-10-02 02:01:26,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:01:27,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 02:01:29,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 02:01:29,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:01:35,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:01:36,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:01:37,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:01:37,818 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 02:01:40,465 INFO [train.py:1046] (3/4) Epoch 21, batch 500, loss[loss=0.1823, simple_loss=0.2564, pruned_loss=0.05405, over 22755.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.253, pruned_loss=0.05082, over 4345283.73 frames. ], batch size: 322, lr: 4.95e-03, grad_scale: 16.0 2023-10-02 02:01:41,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:01:43,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:01:43,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:01:43,566 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 02:01:45,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 02:01:45,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:01:47,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:01:50,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 02:01:53,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:01:56,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:01:56,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:01:58,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:02,239 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.454e+02 2.030e+02 2.224e+02 2.686e+02 4.005e+02, threshold=4.448e+02, percent-clipped=1.0 2023-10-02 02:02:04,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=711680.0, ans=0.0 2023-10-02 02:02:06,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:06,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:02:06,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:02:06,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:08,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 02:02:08,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:02:09,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:02:11,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:02:12,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:02:12,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:12,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 02:02:15,719 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 02:02:15,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=711746.6666666666, ans=0.1 2023-10-02 02:02:19,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:02:19,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:20,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:21,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:22,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:02:25,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 02:02:28,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:02:29,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:02:32,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:02:34,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=711813.3333333334, ans=0.2 2023-10-02 02:02:35,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:02:39,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:02:41,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 02:02:41,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:02:41,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:02:45,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 02:02:46,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:02:47,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:02:48,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.59 vs. limit=22.5 2023-10-02 02:02:53,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 02:02:55,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 02:02:55,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:02:55,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 02:02:55,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:02:55,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:02:56,836 INFO [train.py:1046] (3/4) Epoch 21, batch 550, loss[loss=0.1757, simple_loss=0.2564, pruned_loss=0.04753, over 24505.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2541, pruned_loss=0.05102, over 4416698.91 frames. ], batch size: 63, lr: 4.95e-03, grad_scale: 8.0 2023-10-02 02:02:56,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:56,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:02:56,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:02:58,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:03:01,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:03:03,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 02:03:03,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:03:07,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:07,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:03:10,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:03:12,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=712013.3333333334, ans=0.125 2023-10-02 02:03:13,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:03:18,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 02:03:18,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 02:03:20,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:03:26,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:03:26,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:03:29,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:03:29,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=712080.0, ans=0.125 2023-10-02 02:03:32,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:32,282 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 02:03:32,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:03:33,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:03:35,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=712080.0, ans=0.0 2023-10-02 02:03:36,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:03:36,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:03:37,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:03:37,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:39,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 02:03:41,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 02:03:43,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:03:43,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:03:44,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:03:44,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:03:48,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:03:50,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:03:52,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:03:52,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:03:53,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 02:03:55,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:03:56,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:03:58,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:04:00,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:04:00,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=712213.3333333334, ans=0.125 2023-10-02 02:04:01,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:04:01,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 02:04:06,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 02:04:07,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 02:04:10,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:04:10,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:04:10,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:04:11,874 INFO [train.py:1046] (3/4) Epoch 21, batch 600, loss[loss=0.1725, simple_loss=0.2552, pruned_loss=0.0449, over 24620.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2536, pruned_loss=0.05065, over 4486544.86 frames. ], batch size: 68, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:04:17,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:04:19,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:04:20,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 02:04:22,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:04:24,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:04:26,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:04:29,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 02:04:29,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:04:35,556 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.808e+02 2.033e+02 2.469e+02 3.913e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-02 02:04:35,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 02:04:37,598 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.43 vs. limit=15.0 2023-10-02 02:04:38,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:04:38,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:04:38,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:04:40,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.63 vs. limit=22.5 2023-10-02 02:04:42,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=712413.3333333334, ans=0.0 2023-10-02 02:04:43,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:04:43,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:04:43,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:04:44,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=712413.3333333334, ans=0.1 2023-10-02 02:04:44,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=712413.3333333334, ans=15.0 2023-10-02 02:04:49,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:04:53,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:04:53,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:04:53,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:05:02,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 02:05:08,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:05:08,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:05:11,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 02:05:12,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:05:15,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 02:05:15,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:05:15,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:05:20,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 02:05:20,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 02:05:23,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:05:24,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:05:26,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:05:27,972 INFO [train.py:1046] (3/4) Epoch 21, batch 650, loss[loss=0.1663, simple_loss=0.2241, pruned_loss=0.05423, over 23658.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2519, pruned_loss=0.05053, over 4538030.74 frames. ], batch size: 232, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:05:29,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 02:05:30,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:05:34,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=712613.3333333334, ans=0.125 2023-10-02 02:05:35,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:05:35,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:05:39,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:05:43,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 02:05:46,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:05:46,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:05:50,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:05:52,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 02:05:54,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=712680.0, ans=0.125 2023-10-02 02:05:55,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:05:55,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:05:56,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:05:58,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:00,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:06:00,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=712746.6666666666, ans=0.125 2023-10-02 02:06:01,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:06:01,690 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 02:06:01,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:06:01,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:06:04,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:06,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:06:07,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:06:07,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:06:09,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 02:06:09,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:06:09,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:06:10,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:06:10,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:06:13,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:06:13,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 02:06:14,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 02:06:14,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:14,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:06:15,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:06:15,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:06:17,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:06:20,779 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:06:24,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:25,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:06:26,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:06:28,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=712880.0, ans=0.0 2023-10-02 02:06:30,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:06:30,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:06:31,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:06:38,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:06:38,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:06:40,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:06:40,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:06:41,697 INFO [train.py:1046] (3/4) Epoch 21, batch 700, loss[loss=0.1731, simple_loss=0.2502, pruned_loss=0.04805, over 23236.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2509, pruned_loss=0.04986, over 4585832.94 frames. ], batch size: 105, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:06:43,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 02:06:44,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 02:06:47,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 02:06:47,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:06:47,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=712946.6666666666, ans=0.0 2023-10-02 02:06:47,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=712946.6666666666, ans=0.125 2023-10-02 02:06:48,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:06:50,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 02:06:54,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:06:57,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:06:59,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:07:00,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:07:01,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:07:03,242 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.955e+02 2.291e+02 2.559e+02 6.231e+02, threshold=4.583e+02, percent-clipped=1.0 2023-10-02 02:07:04,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:07:06,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 02:07:06,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:07:07,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 02:07:11,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 02:07:11,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=713080.0, ans=0.2 2023-10-02 02:07:15,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:07:15,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:07:16,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:07:19,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:07:19,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 02:07:23,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:07:23,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:07:25,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 02:07:27,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:07:29,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:07:30,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:07:35,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:07:35,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 02:07:39,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 02:07:39,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 02:07:42,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:07:43,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:07:45,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:07:47,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:07:47,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 02:07:52,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 02:07:52,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 02:07:52,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 02:07:53,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 02:07:54,810 INFO [train.py:1046] (3/4) Epoch 21, batch 750, loss[loss=0.1751, simple_loss=0.2427, pruned_loss=0.05372, over 23826.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2502, pruned_loss=0.04971, over 4616431.19 frames. ], batch size: 179, lr: 4.94e-03, grad_scale: 8.0 2023-10-02 02:07:54,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 02:07:54,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:07:57,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 02:07:58,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:08:00,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:08:00,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:08:03,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:08:04,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:08:04,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:08:07,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:08:07,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:08:09,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:08:11,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:08:12,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:08:12,471 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.44 vs. limit=12.0 2023-10-02 02:08:13,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 02:08:14,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:08:14,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:08:16,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:08:16,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=713346.6666666666, ans=0.125 2023-10-02 02:08:19,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:08:20,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 02:08:20,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:08:23,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 02:08:23,191 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 02:08:24,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 02:08:24,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:08:24,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:08:27,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:08:35,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:08:35,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:08:35,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:08:36,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:08:36,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:08:36,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 02:08:38,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:08:40,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 02:08:40,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:08:44,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:08:45,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 02:08:46,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:08:51,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:08:52,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:08:52,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:08:54,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:08:58,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 02:08:59,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:08:59,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:09:03,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:09:03,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:09:04,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:09:05,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:09:09,596 INFO [train.py:1046] (3/4) Epoch 21, batch 800, loss[loss=0.1526, simple_loss=0.2207, pruned_loss=0.0423, over 19519.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2507, pruned_loss=0.04979, over 4632722.15 frames. ], batch size: 42, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:09:15,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:09:15,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:18,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:09:18,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:09:18,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:19,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:20,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:24,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:09:24,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:09:27,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 02:09:27,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:29,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:09:29,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:09:29,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:09:30,650 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.851e+02 2.052e+02 2.465e+02 3.868e+02, threshold=4.103e+02, percent-clipped=0.0 2023-10-02 02:09:30,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 02:09:30,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:09:30,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 02:09:35,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:37,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:09:39,509 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.97 vs. limit=10.0 2023-10-02 02:09:40,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:09:40,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:09:42,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:42,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:09:47,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:09:47,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:09:47,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 02:09:49,317 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 02:09:49,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 02:09:49,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:09:49,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:09:52,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:09:52,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:09:56,560 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 02:09:57,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 02:09:59,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:10:00,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:10:05,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:10:08,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:10:10,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 02:10:10,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:10:13,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=713880.0, ans=0.125 2023-10-02 02:10:15,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 02:10:22,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:10:23,520 INFO [train.py:1046] (3/4) Epoch 21, batch 850, loss[loss=0.1914, simple_loss=0.2736, pruned_loss=0.05466, over 24561.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2511, pruned_loss=0.04978, over 4661666.26 frames. ], batch size: 71, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:10:23,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:10:23,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 02:10:25,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:10:25,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:10:26,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 02:10:26,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:10:27,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:10:28,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:10:29,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:10:30,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:10:33,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 02:10:33,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 02:10:33,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 02:10:35,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:10:35,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:10:37,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:10:39,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:10:39,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:10:44,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:10:44,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:10:44,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 02:10:46,057 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:10:48,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 02:10:51,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:10:52,187 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.13 vs. limit=15.0 2023-10-02 02:10:52,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 02:10:55,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 02:10:56,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=714080.0, ans=0.5 2023-10-02 02:10:57,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 02:11:00,018 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 02:11:00,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:11:00,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:11:00,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:11:02,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:11:04,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:11:04,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 02:11:06,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:11:08,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:11:08,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:11:10,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:11:11,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:11:12,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:11:12,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 02:11:17,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:11:17,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:11:18,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:11:18,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:11:18,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:11:20,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:11:21,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:11:23,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:11:24,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:11:26,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:11:30,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=714213.3333333334, ans=0.125 2023-10-02 02:11:31,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=714213.3333333334, ans=0.125 2023-10-02 02:11:31,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=714213.3333333334, ans=0.1 2023-10-02 02:11:33,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:11:34,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:11:34,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 02:11:35,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:11:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:11:37,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 02:11:38,616 INFO [train.py:1046] (3/4) Epoch 21, batch 900, loss[loss=0.1915, simple_loss=0.2641, pruned_loss=0.05943, over 23763.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2518, pruned_loss=0.05007, over 4669337.61 frames. ], batch size: 212, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:11:42,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:11:43,149 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.04 vs. limit=15.0 2023-10-02 02:11:45,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:11:45,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=714280.0, ans=0.0 2023-10-02 02:11:45,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=714280.0, ans=0.0 2023-10-02 02:11:47,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 02:11:51,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:11:51,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 02:11:52,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 02:11:54,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:11:54,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:11:54,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:11:54,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:11:54,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=714346.6666666666, ans=0.0 2023-10-02 02:11:57,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=714346.6666666666, ans=0.0 2023-10-02 02:11:57,321 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:12:00,924 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.833e+02 2.065e+02 2.368e+02 4.209e+02, threshold=4.129e+02, percent-clipped=1.0 2023-10-02 02:12:02,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:02,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:12:02,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:12:04,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:12:06,306 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.80 vs. limit=15.0 2023-10-02 02:12:11,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 02:12:13,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:12:20,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:12:20,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:12:21,440 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 02:12:21,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 02:12:27,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:12:27,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:12:28,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:12:35,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:35,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:12:35,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=714480.0, ans=0.0 2023-10-02 02:12:38,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 02:12:38,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:12:41,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 02:12:43,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:12:43,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:44,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:12:46,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:12:51,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 02:12:51,066 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 02:12:52,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 02:12:52,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 02:12:54,278 INFO [train.py:1046] (3/4) Epoch 21, batch 950, loss[loss=0.1758, simple_loss=0.2626, pruned_loss=0.04447, over 24426.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2524, pruned_loss=0.04978, over 4699121.36 frames. ], batch size: 69, lr: 4.94e-03, grad_scale: 16.0 2023-10-02 02:12:55,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:12:55,970 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:12:59,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 02:13:02,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:13:05,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:06,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:07,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:13:09,894 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 02:13:14,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:14,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:13:14,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:13:14,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:13:16,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 02:13:16,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 02:13:17,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:19,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 02:13:20,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:13:20,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=714680.0, ans=0.125 2023-10-02 02:13:21,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=714680.0, ans=0.1 2023-10-02 02:13:24,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:26,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:13:26,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:13:27,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 02:13:28,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 02:13:30,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:13:31,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:13:35,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:13:35,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:13:38,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 02:13:38,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=714813.3333333334, ans=0.125 2023-10-02 02:13:40,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 02:13:40,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:13:42,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:13:43,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:43,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:13:46,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 02:13:47,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:13:50,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:13:51,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:13:51,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 02:13:51,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:13:51,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:13:51,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 02:13:52,674 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.40 vs. limit=15.0 2023-10-02 02:13:56,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:13:57,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=714880.0, ans=0.125 2023-10-02 02:13:58,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:14:01,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:14:03,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 02:14:03,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 02:14:06,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:14:08,799 INFO [train.py:1046] (3/4) Epoch 21, batch 1000, loss[loss=0.159, simple_loss=0.2061, pruned_loss=0.05601, over 19461.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2516, pruned_loss=0.04997, over 4692526.66 frames. ], batch size: 388, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:14:08,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 02:14:09,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:14:14,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:14:16,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 02:14:16,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 02:14:18,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=714946.6666666666, ans=0.125 2023-10-02 02:14:22,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:14:22,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:14:24,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:14:26,638 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.16 vs. limit=15.0 2023-10-02 02:14:27,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 02:14:30,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 02:14:31,677 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.852e+02 2.059e+02 2.423e+02 3.876e+02, threshold=4.119e+02, percent-clipped=0.0 2023-10-02 02:14:31,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 02:14:31,964 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:14:33,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:14:34,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 02:14:37,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 02:14:37,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 02:14:37,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:14:37,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=715080.0, ans=0.0 2023-10-02 02:14:38,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:14:39,451 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.11 vs. limit=22.5 2023-10-02 02:14:47,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:14:47,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:14:49,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:14:51,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:14:51,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 02:14:51,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:14:52,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:14:54,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:14:54,341 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 02:14:54,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=715146.6666666666, ans=0.1 2023-10-02 02:14:54,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=715146.6666666666, ans=0.0 2023-10-02 02:14:57,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 02:14:58,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 02:15:01,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 02:15:02,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:15:08,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:08,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:15:09,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:09,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:15:11,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 02:15:12,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:15:12,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 02:15:12,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 02:15:13,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:15:13,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:15:16,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:15:20,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:15:21,279 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.38 vs. limit=12.0 2023-10-02 02:15:21,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:15:23,781 INFO [train.py:1046] (3/4) Epoch 21, batch 1050, loss[loss=0.1465, simple_loss=0.2268, pruned_loss=0.03303, over 24563.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2504, pruned_loss=0.04938, over 4701082.08 frames. ], batch size: 60, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:15:23,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:15:25,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:15:26,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 02:15:28,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:29,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:15:32,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:15:33,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:15:36,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:15:36,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:15:36,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:15:37,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:15:39,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 02:15:39,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:15:39,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 02:15:40,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:15:40,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 02:15:42,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:15:48,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:15:48,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:15:48,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:15:53,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 02:15:53,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 02:15:54,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:15:56,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 02:15:59,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 02:16:00,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:16:03,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 02:16:04,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:16:05,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:16:05,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:16:08,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:16:11,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 02:16:13,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 02:16:13,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 02:16:13,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=715480.0, ans=0.125 2023-10-02 02:16:14,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:16:14,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:16:16,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 02:16:18,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=715480.0, ans=0.0 2023-10-02 02:16:20,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:16:23,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:16:23,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:16:24,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:16:24,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:16:28,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:16:28,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 02:16:29,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:16:29,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 02:16:30,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 02:16:31,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:16:33,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:16:34,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=715546.6666666666, ans=0.0 2023-10-02 02:16:36,665 INFO [train.py:1046] (3/4) Epoch 21, batch 1100, loss[loss=0.1866, simple_loss=0.2563, pruned_loss=0.05846, over 23668.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2502, pruned_loss=0.04928, over 4704050.75 frames. ], batch size: 256, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:16:39,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:16:43,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=715613.3333333334, ans=0.125 2023-10-02 02:16:44,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:16:47,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:16:47,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:16:48,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 02:16:50,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:16:53,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:16:54,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:16:57,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:16:57,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 02:16:59,257 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.784e+02 1.995e+02 2.356e+02 3.579e+02, threshold=3.990e+02, percent-clipped=0.0 2023-10-02 02:16:59,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 02:17:00,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:17:00,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:17:03,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:17:04,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:17:05,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.15 vs. limit=12.0 2023-10-02 02:17:09,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:17:13,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 02:17:13,340 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 02:17:13,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=715746.6666666666, ans=0.2 2023-10-02 02:17:14,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:17,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:17,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:17:19,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:17:22,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 02:17:22,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:17:22,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:17:23,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:17:23,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:24,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 02:17:30,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:17:30,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 02:17:33,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:17:36,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:17:37,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=715880.0, ans=0.125 2023-10-02 02:17:39,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 02:17:39,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 02:17:39,917 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.62 vs. limit=15.0 2023-10-02 02:17:40,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:17:42,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:17:42,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:17:44,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 02:17:46,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:17:46,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:17:47,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 02:17:47,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:17:48,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 02:17:48,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:17:48,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:17:50,289 INFO [train.py:1046] (3/4) Epoch 21, batch 1150, loss[loss=0.1777, simple_loss=0.2593, pruned_loss=0.04808, over 24688.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2508, pruned_loss=0.0493, over 4711339.32 frames. ], batch size: 68, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:17:50,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:17:57,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:17:59,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:18:01,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:18:01,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:18:01,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 02:18:03,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:18:05,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 02:18:06,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:18:06,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:18:10,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 02:18:12,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:18:16,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:18:16,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:18:16,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 02:18:16,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:18:16,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:18:18,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=716013.3333333334, ans=0.125 2023-10-02 02:18:22,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 02:18:24,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:18:25,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:18:32,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:18:35,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=716146.6666666666, ans=0.125 2023-10-02 02:18:37,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:18:37,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 02:18:39,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:18:39,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:18:47,928 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 02:18:49,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:18:57,507 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 02:19:00,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:01,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:19:01,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:19:01,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:19:05,845 INFO [train.py:1046] (3/4) Epoch 21, batch 1200, loss[loss=0.1678, simple_loss=0.244, pruned_loss=0.0458, over 23680.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2516, pruned_loss=0.04978, over 4713858.37 frames. ], batch size: 149, lr: 4.93e-03, grad_scale: 32.0 2023-10-02 02:19:05,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:19:09,583 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:19:10,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:19:10,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:19:12,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=716280.0, ans=0.125 2023-10-02 02:19:13,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:19:13,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:13,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:19:14,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:19:16,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:19:19,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:19:19,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:19:21,707 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 02:19:23,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 02:19:23,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=716346.6666666666, ans=0.125 2023-10-02 02:19:27,892 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.837e+02 2.069e+02 2.341e+02 3.988e+02, threshold=4.139e+02, percent-clipped=0.0 2023-10-02 02:19:28,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:19:28,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=716346.6666666666, ans=0.1 2023-10-02 02:19:30,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:19:32,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:19:33,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:19:33,659 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 02:19:35,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:41,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:19:41,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:19:42,168 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.07 vs. limit=22.5 2023-10-02 02:19:42,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 02:19:44,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:19:44,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=716413.3333333334, ans=0.05 2023-10-02 02:19:45,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=716413.3333333334, ans=0.125 2023-10-02 02:19:47,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=716413.3333333334, ans=0.1 2023-10-02 02:19:48,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 02:19:48,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=716480.0, ans=0.0 2023-10-02 02:19:52,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 02:19:52,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:19:53,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:19:55,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:19:57,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:19:58,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:19:58,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:20:00,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:20:00,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 02:20:02,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:20:02,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:20:02,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:20:04,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=716546.6666666666, ans=0.0 2023-10-02 02:20:05,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:20:05,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:20:08,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 02:20:10,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:20:12,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 02:20:16,232 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 02:20:18,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:20:20,217 INFO [train.py:1046] (3/4) Epoch 21, batch 1250, loss[loss=0.179, simple_loss=0.2477, pruned_loss=0.05514, over 23882.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2523, pruned_loss=0.05001, over 4724063.98 frames. ], batch size: 164, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:20:20,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:20:21,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.18 vs. limit=6.0 2023-10-02 02:20:22,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:20:24,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:20:28,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 02:20:30,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:20:32,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:20:32,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 02:20:35,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:20:37,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:20:37,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=716680.0, ans=0.0 2023-10-02 02:20:40,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:20:41,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:20:41,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:20:41,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:20:44,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:20:49,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 02:20:49,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:20:49,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:20:49,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=716746.6666666666, ans=0.0 2023-10-02 02:20:50,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:20:51,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:20:54,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:20:55,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=716746.6666666666, ans=15.0 2023-10-02 02:20:56,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:21:00,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 02:21:02,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:21:05,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:21:06,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 02:21:06,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:21:06,743 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 02:21:06,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:06,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:09,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=716813.3333333334, ans=0.125 2023-10-02 02:21:11,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:21:13,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:21:13,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:21:14,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 02:21:14,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 02:21:16,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 02:21:16,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.66 vs. limit=6.0 2023-10-02 02:21:18,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:21:20,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 02:21:20,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:20,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=716880.0, ans=0.1 2023-10-02 02:21:23,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 02:21:23,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:21:25,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 02:21:25,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:21:26,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:21:26,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:21:26,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:21:29,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 02:21:31,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:21:32,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=716880.0, ans=0.125 2023-10-02 02:21:33,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:21:33,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:21:36,631 INFO [train.py:1046] (3/4) Epoch 21, batch 1300, loss[loss=0.1752, simple_loss=0.2571, pruned_loss=0.04658, over 24637.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2529, pruned_loss=0.05049, over 4719856.87 frames. ], batch size: 65, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:21:36,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:21:39,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:21:40,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 02:21:44,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:21:46,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:21:47,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:21:47,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=716946.6666666666, ans=0.125 2023-10-02 02:21:48,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:21:49,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:21:50,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 02:21:53,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=717013.3333333334, ans=0.125 2023-10-02 02:21:55,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:21:56,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:21:56,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 02:21:59,129 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 1.796e+02 1.993e+02 2.237e+02 3.308e+02, threshold=3.985e+02, percent-clipped=0.0 2023-10-02 02:22:01,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:22:01,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=717013.3333333334, ans=0.1 2023-10-02 02:22:05,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:05,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:22:07,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:22:08,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:10,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:22:11,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 02:22:11,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 02:22:14,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=717080.0, ans=0.125 2023-10-02 02:22:15,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:22:15,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:22:18,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 02:22:18,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:22:20,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:22:21,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:22:23,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 02:22:23,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:22:24,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 02:22:25,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:22:28,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:22:28,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:22:34,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 02:22:34,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 02:22:37,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 02:22:40,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:22:43,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 02:22:44,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:50,633 INFO [train.py:1046] (3/4) Epoch 21, batch 1350, loss[loss=0.172, simple_loss=0.2497, pruned_loss=0.04712, over 24307.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2527, pruned_loss=0.0507, over 4707663.66 frames. ], batch size: 61, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:22:50,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 02:22:53,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:22:55,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:22:56,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:22:58,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:22:59,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:22:59,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=717280.0, ans=0.125 2023-10-02 02:23:01,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:23:04,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:23:06,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 02:23:07,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:23:07,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=717346.6666666666, ans=0.0 2023-10-02 02:23:09,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:23:09,654 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:23:12,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 02:23:13,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:23:13,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:23:13,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 02:23:15,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 02:23:18,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 02:23:20,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:23:20,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 02:23:32,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:23:34,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=717413.3333333334, ans=0.125 2023-10-02 02:23:41,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:23:42,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:23:42,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 02:23:45,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:23:48,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 02:23:48,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:23:48,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:23:48,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=717480.0, ans=0.0 2023-10-02 02:23:51,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:23:52,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 02:23:53,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:23:58,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=717546.6666666666, ans=0.0 2023-10-02 02:23:59,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 02:24:00,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 02:24:04,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=717546.6666666666, ans=0.125 2023-10-02 02:24:06,667 INFO [train.py:1046] (3/4) Epoch 21, batch 1400, loss[loss=0.1831, simple_loss=0.251, pruned_loss=0.05761, over 23510.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2504, pruned_loss=0.05034, over 4670189.65 frames. ], batch size: 120, lr: 4.93e-03, grad_scale: 16.0 2023-10-02 02:24:06,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 02:24:08,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:24:11,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:24:12,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:24:15,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 02:24:16,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=717613.3333333334, ans=0.2 2023-10-02 02:24:17,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 02:24:17,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=717613.3333333334, ans=0.07 2023-10-02 02:24:22,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=717680.0, ans=0.2 2023-10-02 02:24:26,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:24:30,122 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.883e+02 2.074e+02 2.449e+02 3.328e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-02 02:24:30,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:24:32,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:24:32,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:24:37,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:24:38,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 02:24:48,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:24:48,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:24:50,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=717813.3333333334, ans=0.125 2023-10-02 02:24:52,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.99 vs. limit=6.0 2023-10-02 02:24:54,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 02:24:54,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:24:54,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:24:55,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:24:55,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:24:57,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:24:57,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:24:57,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:24:59,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 02:24:59,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:25:05,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:08,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:25:09,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=717880.0, ans=0.125 2023-10-02 02:25:14,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 02:25:15,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 02:25:16,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:25:18,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 02:25:19,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:25:21,075 INFO [train.py:1046] (3/4) Epoch 21, batch 1450, loss[loss=0.1725, simple_loss=0.2573, pruned_loss=0.04383, over 24682.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2503, pruned_loss=0.04958, over 4684537.91 frames. ], batch size: 65, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:25:21,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:25:23,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:25:26,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:25:26,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:26,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 02:25:32,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:25:34,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:25:36,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:25:36,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 02:25:36,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=718013.3333333334, ans=0.125 2023-10-02 02:25:37,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:25:39,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 02:25:40,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:40,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:40,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 02:25:42,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:25:43,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:25:43,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 02:25:43,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:45,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:25:46,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:49,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:53,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:25:53,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:25:55,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:25:55,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:56,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:25:56,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:25:56,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:25:57,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=718080.0, ans=0.1 2023-10-02 02:25:57,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.60 vs. limit=10.0 2023-10-02 02:25:58,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:26:02,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 02:26:03,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:26:09,081 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 02:26:10,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:26:11,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:26:13,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:26:13,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 02:26:18,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:26:18,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 02:26:20,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 02:26:20,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:26:24,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:26:24,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:26:26,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 02:26:28,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 02:26:29,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 02:26:30,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:26:31,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:26:34,531 INFO [train.py:1046] (3/4) Epoch 21, batch 1500, loss[loss=0.1771, simple_loss=0.2473, pruned_loss=0.05342, over 23386.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2504, pruned_loss=0.04973, over 4687614.36 frames. ], batch size: 285, lr: 4.92e-03, grad_scale: 8.0 2023-10-02 02:26:34,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=718280.0, ans=0.125 2023-10-02 02:26:42,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 02:26:42,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:26:42,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:26:44,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:26:44,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:26:44,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=718280.0, ans=0.0 2023-10-02 02:26:45,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:26:45,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 02:26:47,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:26:47,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:26:47,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:26:49,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:26:50,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:26:51,175 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.31 vs. limit=15.0 2023-10-02 02:26:51,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:26:53,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=718346.6666666666, ans=0.2 2023-10-02 02:26:54,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=718346.6666666666, ans=0.1 2023-10-02 02:26:57,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:26:57,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 02:26:58,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:27:00,062 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.376e+02 1.883e+02 2.099e+02 2.535e+02 4.584e+02, threshold=4.198e+02, percent-clipped=1.0 2023-10-02 02:27:00,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:27:01,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:27:04,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 02:27:08,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 02:27:09,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:27:09,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 02:27:12,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 02:27:13,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:27:14,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:27:15,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:27:15,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 02:27:16,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:27:16,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:27:18,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 02:27:18,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:27:22,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:27:22,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 02:27:27,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:27:29,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:27:32,174 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 02:27:32,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:32,215 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 02:27:34,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:27:35,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:27:36,829 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 02:27:37,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=718546.6666666666, ans=0.125 2023-10-02 02:27:39,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:27:40,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 02:27:41,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=718546.6666666666, ans=0.2 2023-10-02 02:27:42,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:44,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:27:46,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:46,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:27:48,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:27:48,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:27:48,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 02:27:49,562 INFO [train.py:1046] (3/4) Epoch 21, batch 1550, loss[loss=0.185, simple_loss=0.2674, pruned_loss=0.05129, over 24042.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2512, pruned_loss=0.04941, over 4707392.62 frames. ], batch size: 80, lr: 4.92e-03, grad_scale: 8.0 2023-10-02 02:27:49,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 02:27:49,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:27:51,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 02:27:52,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 02:27:53,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:27:55,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:27:56,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:27:56,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:27:58,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:27:59,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:28:01,615 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 02:28:01,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:02,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:28:04,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:28:06,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:28:06,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 02:28:07,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:28:09,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 02:28:09,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 02:28:11,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 02:28:11,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:13,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:28:16,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:28:19,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 02:28:19,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 02:28:28,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:28:31,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:28:31,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:28:31,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:28:32,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 02:28:32,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=718813.3333333334, ans=0.0 2023-10-02 02:28:36,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:28:38,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:42,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:28:46,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:28:46,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:28:47,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 02:28:47,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:28:47,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:28:49,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:28:49,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 02:28:49,226 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 02:28:53,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:28:55,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=718880.0, ans=10.0 2023-10-02 02:28:58,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 02:29:02,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:29:03,923 INFO [train.py:1046] (3/4) Epoch 21, batch 1600, loss[loss=0.1514, simple_loss=0.233, pruned_loss=0.0349, over 24502.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2523, pruned_loss=0.04984, over 4694480.25 frames. ], batch size: 63, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:29:03,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:29:04,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 02:29:05,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:29:06,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:29:06,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:29:06,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=718946.6666666666, ans=0.0 2023-10-02 02:29:07,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:29:08,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:29:09,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:11,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 02:29:12,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 02:29:15,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 02:29:19,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:29:20,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 02:29:20,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:29:24,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:29:27,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:29:28,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 02:29:29,738 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.788e+02 1.991e+02 2.210e+02 3.444e+02, threshold=3.981e+02, percent-clipped=0.0 2023-10-02 02:29:31,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:29:32,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 02:29:32,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:32,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 02:29:36,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 02:29:44,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:29:46,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 02:29:46,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:29:48,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:29:48,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:29:49,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 02:29:53,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 02:29:54,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:29:54,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:55,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:29:57,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:29:57,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:29:58,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:30:00,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:30:01,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=719146.6666666666, ans=0.0 2023-10-02 02:30:05,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:30:05,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:30:08,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 02:30:08,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:30:10,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 02:30:14,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:30:17,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:30:17,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:30:17,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 02:30:17,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 02:30:17,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 02:30:17,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 02:30:19,682 INFO [train.py:1046] (3/4) Epoch 21, batch 1650, loss[loss=0.1821, simple_loss=0.2549, pruned_loss=0.05459, over 23452.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2534, pruned_loss=0.05054, over 4689909.79 frames. ], batch size: 134, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:30:22,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:30:22,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:30:22,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:30:23,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:30:26,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:30:28,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 02:30:32,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:30:32,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:30:32,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:30:32,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:30:33,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 02:30:33,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 02:30:39,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:30:42,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:30:51,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 02:30:52,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:30:54,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 02:30:57,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:30:59,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:31:00,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:31:02,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:31:03,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:31:03,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:31:03,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:31:05,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:31:05,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:31:06,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:31:07,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:31:07,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:31:10,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:31:13,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 02:31:15,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:31:15,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 02:31:17,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 02:31:17,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 02:31:17,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:31:18,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:31:18,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:31:19,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:31:19,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 02:31:23,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:31:26,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:31:26,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:31:28,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 02:31:33,172 INFO [train.py:1046] (3/4) Epoch 21, batch 1700, loss[loss=0.1576, simple_loss=0.2342, pruned_loss=0.04049, over 24439.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.252, pruned_loss=0.0504, over 4688645.32 frames. ], batch size: 58, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:31:33,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:31:33,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:31:33,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 02:31:33,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:31:33,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:31:33,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:31:36,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:31:36,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:31:37,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 02:31:39,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:31:48,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:31:49,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=719680.0, ans=0.125 2023-10-02 02:31:51,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:31:56,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:31:56,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:31:56,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:31:58,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:31:59,513 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.914e+02 2.162e+02 2.469e+02 4.106e+02, threshold=4.325e+02, percent-clipped=1.0 2023-10-02 02:32:00,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 02:32:02,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:32:02,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:05,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:32:05,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:32:07,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 02:32:07,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 02:32:09,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:10,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 02:32:12,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:32:22,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:32:24,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:32:24,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:32:26,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 02:32:26,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 02:32:26,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:32:28,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=719813.3333333334, ans=0.125 2023-10-02 02:32:29,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:29,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 02:32:29,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:32:29,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:32:31,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:31,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:32:32,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:32:32,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:32:34,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:32:34,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:32:34,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:32:39,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:32:40,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 02:32:42,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:32:43,047 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.97 vs. limit=15.0 2023-10-02 02:32:43,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:32:43,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=719880.0, ans=0.2 2023-10-02 02:32:47,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 02:32:48,779 INFO [train.py:1046] (3/4) Epoch 21, batch 1750, loss[loss=0.1639, simple_loss=0.2352, pruned_loss=0.04633, over 23445.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2505, pruned_loss=0.05025, over 4688484.17 frames. ], batch size: 134, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:32:51,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:32:53,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:32:54,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:32:54,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 02:32:56,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:32:58,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=719946.6666666666, ans=15.0 2023-10-02 02:32:59,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:32:59,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:06,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 02:33:08,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:33:09,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 02:33:09,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:33:11,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:33:13,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 02:33:14,539 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.56 vs. limit=15.0 2023-10-02 02:33:15,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 02:33:17,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:33:18,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 02:33:26,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:33:28,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:33:28,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:33:28,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=720080.0, ans=0.0 2023-10-02 02:33:32,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:32,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:33:34,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:33:35,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:37,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:33:37,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:33:38,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 02:33:39,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:33:41,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 02:33:41,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:33:42,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:33:44,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:33:48,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:33:48,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 02:33:48,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:33:51,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:33:51,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=9.86 vs. limit=12.0 2023-10-02 02:33:54,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=720213.3333333334, ans=0.125 2023-10-02 02:33:55,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:33:58,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:33:59,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:34:00,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 02:34:00,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:34:01,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:34:01,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:01,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:34:01,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:34:03,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:34:07,462 INFO [train.py:1046] (3/4) Epoch 21, batch 1800, loss[loss=0.1726, simple_loss=0.2444, pruned_loss=0.05044, over 23674.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2504, pruned_loss=0.05014, over 4691233.70 frames. ], batch size: 256, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:34:07,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:34:07,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:34:08,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 02:34:10,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:34:14,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 02:34:14,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:34:17,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:34:20,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:20,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:22,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:34:23,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:34:23,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 02:34:25,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:34:26,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=720346.6666666666, ans=0.0 2023-10-02 02:34:28,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:34:30,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=720346.6666666666, ans=0.125 2023-10-02 02:34:32,581 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.823e+02 2.052e+02 2.264e+02 3.155e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-02 02:34:32,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 02:34:34,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 02:34:35,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 02:34:35,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:34:35,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:34:35,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:34:37,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:34:43,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=720413.3333333334, ans=0.125 2023-10-02 02:34:44,459 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 02:34:47,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:34:48,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:34:50,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 02:34:50,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 02:34:52,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:34:53,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:34:55,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:34:59,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 02:35:00,077 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:35:05,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:35:05,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 02:35:06,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:35:06,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:35:08,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:35:08,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 02:35:11,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:35:11,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:35:14,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 02:35:14,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:35:16,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:35:16,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:35:16,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:35:17,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:35:18,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:35:20,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:35:20,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:35:22,596 INFO [train.py:1046] (3/4) Epoch 21, batch 1850, loss[loss=0.1789, simple_loss=0.2516, pruned_loss=0.05312, over 17247.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2511, pruned_loss=0.05036, over 4684474.09 frames. ], batch size: 37, lr: 4.92e-03, grad_scale: 16.0 2023-10-02 02:35:24,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:35:25,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:35:31,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=720613.3333333334, ans=0.1 2023-10-02 02:35:34,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:35:34,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 02:35:38,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 02:35:42,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 02:35:44,759 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.27 vs. limit=15.0 2023-10-02 02:35:45,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:35:47,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 02:35:47,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 02:35:49,768 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.19 vs. limit=15.0 2023-10-02 02:35:53,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:35:55,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 02:35:57,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:35:57,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:36:02,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 02:36:02,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:02,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:36:03,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:36:05,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:36:05,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=720813.3333333334, ans=0.125 2023-10-02 02:36:05,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=720813.3333333334, ans=0.125 2023-10-02 02:36:05,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=720813.3333333334, ans=0.0 2023-10-02 02:36:08,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:36:10,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:36:10,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:10,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 02:36:10,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:36:13,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:36:15,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:36:15,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=720813.3333333334, ans=0.125 2023-10-02 02:36:18,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 02:36:18,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:36:23,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:36:23,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:36:23,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 02:36:23,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 02:36:26,048 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 02:36:26,117 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 02:36:28,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:36:28,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:36:28,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:36:28,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:30,290 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 02:36:30,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:36:31,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:31,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:36:33,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:36:34,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:36:34,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 02:36:36,473 INFO [train.py:1046] (3/4) Epoch 21, batch 1900, loss[loss=0.1727, simple_loss=0.2585, pruned_loss=0.04349, over 24373.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2517, pruned_loss=0.05044, over 4698832.43 frames. ], batch size: 77, lr: 4.91e-03, grad_scale: 16.0 2023-10-02 02:36:38,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:36:38,071 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 02:36:38,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:36:39,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:36:43,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:36:47,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:36:47,152 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 02:36:48,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 02:36:49,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:36:50,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:36:51,228 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 02:36:51,264 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 02:36:51,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=721013.3333333334, ans=0.04949747468305833 2023-10-02 02:36:55,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 02:36:57,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:37:00,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 02:37:01,129 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.39 vs. limit=22.5 2023-10-02 02:37:01,998 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.796e+02 1.986e+02 2.247e+02 3.290e+02, threshold=3.972e+02, percent-clipped=0.0 2023-10-02 02:37:02,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 02:37:06,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=721080.0, ans=0.125 2023-10-02 02:37:12,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 02:37:16,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 02:37:16,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:37:17,741 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 02:37:17,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 02:37:17,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 02:37:17,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 02:37:17,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:37:18,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=721080.0, ans=0.125 2023-10-02 02:37:22,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 02:37:25,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:37:30,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:37:30,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 02:37:30,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:37:34,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 02:37:34,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:37:42,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:37:42,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:37:42,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:37:43,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:37:43,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 02:37:43,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:37:45,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:37:48,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:37:48,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:37:49,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:37:49,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:37:49,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:37:50,870 INFO [train.py:1046] (3/4) Epoch 21, batch 1950, loss[loss=0.1576, simple_loss=0.2342, pruned_loss=0.04047, over 20360.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2528, pruned_loss=0.05085, over 4700211.06 frames. ], batch size: 44, lr: 4.91e-03, grad_scale: 16.0 2023-10-02 02:37:50,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:37:54,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:37:57,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:37:57,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:37:57,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:38:00,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 02:38:00,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:38:00,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:02,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:05,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:38:05,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:06,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:08,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:38:10,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=721346.6666666666, ans=0.07 2023-10-02 02:38:12,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:38:12,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:38:12,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:38:12,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:15,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:18,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:38:18,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:18,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 02:38:18,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 02:38:19,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:38:20,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:38:20,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:23,199 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.06 vs. limit=15.0 2023-10-02 02:38:23,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:38:25,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:38:30,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:38:33,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:38:33,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:38:33,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 02:38:33,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:38:40,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:38:40,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:38:41,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:38:49,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:50,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:52,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:38:56,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:38:59,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:39:00,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:39:01,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 02:39:01,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:39:03,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:39:04,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 02:39:06,063 INFO [train.py:1046] (3/4) Epoch 21, batch 2000, loss[loss=0.1785, simple_loss=0.2466, pruned_loss=0.05521, over 23605.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2529, pruned_loss=0.05093, over 4711295.46 frames. ], batch size: 149, lr: 4.91e-03, grad_scale: 32.0 2023-10-02 02:39:06,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:39:10,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:39:10,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:39:10,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:39:11,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:39:14,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:16,449 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.75 vs. limit=15.0 2023-10-02 02:39:17,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 02:39:17,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:39:20,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:39:22,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 02:39:23,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 02:39:23,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:39:25,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:39:25,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=721680.0, ans=0.0 2023-10-02 02:39:26,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 02:39:26,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:28,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:28,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:30,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 02:39:31,582 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.854e+02 2.083e+02 2.376e+02 3.627e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-02 02:39:31,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:39:34,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 02:39:34,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:39:37,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:39:38,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 02:39:38,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:39:39,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:39:40,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:39:40,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 02:39:42,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=721746.6666666666, ans=0.1 2023-10-02 02:39:43,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 02:39:43,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:39:43,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:39:48,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:49,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:39:51,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:39:51,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:39:53,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:39:53,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:55,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:39:55,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:39:57,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:00,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:40:00,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 02:40:05,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:40:06,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:10,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:10,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:40:13,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:14,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:40:14,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:16,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:40:16,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:40:19,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:20,614 INFO [train.py:1046] (3/4) Epoch 21, batch 2050, loss[loss=0.1631, simple_loss=0.2522, pruned_loss=0.03695, over 24353.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2523, pruned_loss=0.05049, over 4715139.41 frames. ], batch size: 77, lr: 4.91e-03, grad_scale: 32.0 2023-10-02 02:40:20,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:21,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.whiten.whitening_limit, batch_count=721946.6666666666, ans=15.0 2023-10-02 02:40:23,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:40:24,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:28,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:40:30,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:40:31,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:40:33,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:40:34,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 02:40:34,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:40:37,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:40:37,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:40:39,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=722013.3333333334, ans=0.1 2023-10-02 02:40:41,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.42 vs. limit=15.0 2023-10-02 02:40:42,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=722013.3333333334, ans=0.1 2023-10-02 02:40:46,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:40:46,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:49,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 02:40:50,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:40:51,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=722080.0, ans=0.1 2023-10-02 02:40:52,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 02:40:53,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:40:56,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:40:58,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:40:58,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:41:00,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:41:01,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:41:03,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:41:04,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:41:05,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:41:07,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:41:09,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:41:10,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:41:14,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:41:15,257 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.97 vs. limit=10.0 2023-10-02 02:41:20,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:41:22,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 02:41:27,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:41:27,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:41:28,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=722213.3333333334, ans=0.2 2023-10-02 02:41:29,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:41:32,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 02:41:35,360 INFO [train.py:1046] (3/4) Epoch 21, batch 2100, loss[loss=0.1622, simple_loss=0.2058, pruned_loss=0.05925, over 18876.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2511, pruned_loss=0.05042, over 4707128.85 frames. ], batch size: 388, lr: 4.91e-03, grad_scale: 32.0 2023-10-02 02:41:35,452 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 02:41:35,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:41:35,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:41:36,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:41:36,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:41:36,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 02:41:38,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 02:41:38,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:41:41,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:41:42,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:41:43,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:41:43,674 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=14.08 vs. limit=15.0 2023-10-02 02:41:44,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:41:44,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 02:41:45,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:41:47,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 02:41:47,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 02:41:49,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:41:49,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:41:49,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 02:41:51,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 02:41:54,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 02:41:54,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:41:58,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:41:59,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:42:00,601 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.890e+02 2.103e+02 2.367e+02 4.500e+02, threshold=4.205e+02, percent-clipped=1.0 2023-10-02 02:42:02,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:42:02,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=722346.6666666666, ans=0.0 2023-10-02 02:42:04,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 02:42:05,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:05,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 02:42:05,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=722413.3333333334, ans=0.0 2023-10-02 02:42:06,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 02:42:08,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:08,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 02:42:08,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 02:42:10,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 02:42:11,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:42:13,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:42:15,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:42:16,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 02:42:17,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:18,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:18,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 02:42:18,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:18,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:20,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:20,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 02:42:22,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 02:42:22,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 02:42:28,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:42:31,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:42:32,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 02:42:37,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:40,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:42:40,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:42:40,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:42:40,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 02:42:41,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:42:42,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:42:42,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:42:43,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:42:43,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:45,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 02:42:46,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 02:42:46,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:42:49,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:42:49,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:42:50,495 INFO [train.py:1046] (3/4) Epoch 21, batch 2150, loss[loss=0.1471, simple_loss=0.2223, pruned_loss=0.036, over 21466.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2503, pruned_loss=0.05017, over 4702202.37 frames. ], batch size: 47, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:42:50,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:42:50,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:42:55,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 02:42:56,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:42:57,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:42:59,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:42:59,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:42:59,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:43:04,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:05,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:43:05,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:43:08,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:08,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 02:43:13,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:43:13,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:43:14,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:14,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:43:15,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.98 vs. limit=15.0 2023-10-02 02:43:16,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:16,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:43:16,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:43:17,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:43:17,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:43:19,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 02:43:20,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:43:20,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:20,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:43:23,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:43:24,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:43:25,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:43:26,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:43:28,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:43:28,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 02:43:28,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 02:43:32,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:43:32,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:33,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:43:36,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:43:37,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:38,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:38,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 02:43:40,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 02:43:40,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:43:40,459 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 02:43:41,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:42,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:43:42,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 02:43:42,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:43:42,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 02:43:42,351 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 02:43:42,351 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 02:43:43,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 02:43:46,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:47,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:43:47,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:43:47,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:49,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 02:43:50,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:43:50,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:43:57,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=722880.0, ans=0.125 2023-10-02 02:44:00,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:44:00,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 02:44:05,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:44:06,415 INFO [train.py:1046] (3/4) Epoch 21, batch 2200, loss[loss=0.1878, simple_loss=0.2559, pruned_loss=0.05985, over 23568.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.25, pruned_loss=0.04983, over 4711796.40 frames. ], batch size: 256, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:44:09,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:10,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:44:10,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:11,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 02:44:15,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:44:15,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:44:15,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 02:44:19,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 02:44:21,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:44:25,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 02:44:28,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:29,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:44:29,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:44:30,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=723013.3333333334, ans=0.125 2023-10-02 02:44:30,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=723013.3333333334, ans=0.125 2023-10-02 02:44:32,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:44:33,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 02:44:35,307 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.860e+02 2.025e+02 2.303e+02 3.643e+02, threshold=4.050e+02, percent-clipped=0.0 2023-10-02 02:44:36,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:44:38,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:44:39,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 02:44:41,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=723080.0, ans=0.09899494936611666 2023-10-02 02:44:43,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:44:44,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:44:47,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:44:48,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:49,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 02:44:51,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:44:52,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 02:44:54,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:54,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 02:44:55,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:44:56,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:44:56,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:44:56,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:44:58,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:45:00,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:45:00,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:45:01,019 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.90 vs. limit=15.0 2023-10-02 02:45:01,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 02:45:06,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 02:45:06,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:45:08,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:45:08,404 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 02:45:11,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:45:11,228 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 02:45:14,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:45:14,500 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 02:45:15,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:45:15,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:45:17,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:45:19,891 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 02:45:21,297 INFO [train.py:1046] (3/4) Epoch 21, batch 2250, loss[loss=0.1888, simple_loss=0.2555, pruned_loss=0.06109, over 23600.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2507, pruned_loss=0.04975, over 4720146.80 frames. ], batch size: 256, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:45:21,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:45:22,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:45:27,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=723280.0, ans=0.0 2023-10-02 02:45:28,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:45:28,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:45:32,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:45:33,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:45:34,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:45:38,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 02:45:38,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:45:38,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:45:39,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 02:45:41,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:45:41,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:45:43,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 02:45:48,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:45:50,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 02:45:50,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:45:51,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 02:45:52,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:45:54,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:45:59,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:45:59,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=723413.3333333334, ans=0.0 2023-10-02 02:46:02,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:46:03,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:46:03,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:46:06,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:46:08,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:46:08,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=723480.0, ans=0.1 2023-10-02 02:46:12,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:46:15,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 02:46:21,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:46:21,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:46:21,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:46:25,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 02:46:28,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:46:28,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 02:46:29,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:46:29,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:46:32,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 02:46:35,181 INFO [train.py:1046] (3/4) Epoch 21, batch 2300, loss[loss=0.2023, simple_loss=0.2655, pruned_loss=0.06958, over 23561.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2516, pruned_loss=0.05051, over 4720595.94 frames. ], batch size: 120, lr: 4.91e-03, grad_scale: 8.0 2023-10-02 02:46:35,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:46:35,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:46:41,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:46:41,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:46:41,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=723613.3333333334, ans=0.2 2023-10-02 02:46:45,233 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 02:46:46,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:46:47,112 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.13 vs. limit=12.0 2023-10-02 02:46:54,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:46:54,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:46:55,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:46:56,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:46:56,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 02:46:56,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:46:58,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:46:59,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:47:03,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:47:04,817 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.907e+02 2.156e+02 2.525e+02 3.499e+02, threshold=4.312e+02, percent-clipped=0.0 2023-10-02 02:47:06,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:47:10,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:47:16,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:47:16,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:47:20,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:47:22,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:47:27,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:47:27,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:47:27,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=723813.3333333334, ans=0.2 2023-10-02 02:47:28,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:47:28,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 02:47:33,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 02:47:33,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:47:33,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:47:33,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:47:34,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:47:34,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 02:47:34,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 02:47:35,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 02:47:35,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:47:35,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:47:35,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 02:47:41,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:47:44,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:47:47,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=723946.6666666666, ans=0.125 2023-10-02 02:47:49,012 INFO [train.py:1046] (3/4) Epoch 21, batch 2350, loss[loss=0.2591, simple_loss=0.316, pruned_loss=0.1011, over 19811.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2518, pruned_loss=0.05015, over 4726006.20 frames. ], batch size: 389, lr: 4.90e-03, grad_scale: 8.0 2023-10-02 02:47:50,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:47:50,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:47:50,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 02:47:50,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=723946.6666666666, ans=0.125 2023-10-02 02:47:52,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:47:52,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:47:53,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:47:53,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 02:47:58,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=723946.6666666666, ans=0.125 2023-10-02 02:48:01,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:48:01,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 02:48:05,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 02:48:07,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=724013.3333333334, ans=10.0 2023-10-02 02:48:09,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:48:10,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:48:10,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:48:10,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:48:10,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:48:12,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 02:48:13,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:48:16,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=724013.3333333334, ans=0.125 2023-10-02 02:48:16,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=724013.3333333334, ans=0.125 2023-10-02 02:48:20,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 02:48:21,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:48:22,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=724080.0, ans=0.125 2023-10-02 02:48:25,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:48:25,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:48:26,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:48:28,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 02:48:29,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:48:31,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:48:31,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:48:32,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:48:33,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:48:36,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 02:48:36,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:48:40,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:48:40,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:48:42,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 02:48:42,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:48:45,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 02:48:45,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:48:49,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 02:48:52,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=724213.3333333334, ans=0.025 2023-10-02 02:48:53,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 02:48:53,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:48:53,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 02:48:54,555 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 02:48:55,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 02:48:57,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 02:49:00,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=724213.3333333334, ans=0.0 2023-10-02 02:49:01,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:49:04,171 INFO [train.py:1046] (3/4) Epoch 21, batch 2400, loss[loss=0.165, simple_loss=0.2573, pruned_loss=0.0363, over 24431.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2519, pruned_loss=0.05009, over 4719564.66 frames. ], batch size: 69, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:49:04,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:49:07,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:49:10,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:49:11,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 02:49:11,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 02:49:16,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=724280.0, ans=0.0 2023-10-02 02:49:19,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 02:49:19,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:49:21,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 02:49:21,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:49:21,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:49:23,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 02:49:27,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:49:30,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 02:49:35,035 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.859e+02 2.075e+02 2.399e+02 3.778e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-02 02:49:36,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:49:39,570 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:49:40,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 02:49:43,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:49:43,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:49:47,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=724413.3333333334, ans=0.125 2023-10-02 02:49:47,812 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.58 vs. limit=15.0 2023-10-02 02:49:48,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:49:48,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 02:49:49,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 02:49:56,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:49:58,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:50:01,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:01,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=724480.0, ans=0.125 2023-10-02 02:50:02,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:50:02,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 02:50:02,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:50:02,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:50:02,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:50:02,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 02:50:06,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=724546.6666666666, ans=0.5 2023-10-02 02:50:07,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:50:07,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 02:50:07,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 02:50:09,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 02:50:12,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:50:12,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:50:12,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 02:50:12,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=724546.6666666666, ans=0.0 2023-10-02 02:50:14,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 02:50:14,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 02:50:14,066 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 02:50:15,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 02:50:15,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:50:18,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:50:18,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:50:19,697 INFO [train.py:1046] (3/4) Epoch 21, batch 2450, loss[loss=0.1684, simple_loss=0.2478, pruned_loss=0.04449, over 24478.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2505, pruned_loss=0.04984, over 4710981.07 frames. ], batch size: 58, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:50:19,798 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 02:50:19,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:50:21,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:50:23,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:50:24,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:50:25,383 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.50 vs. limit=15.0 2023-10-02 02:50:28,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:28,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:50:29,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 02:50:30,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=724613.3333333334, ans=0.0 2023-10-02 02:50:31,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=724613.3333333334, ans=0.2 2023-10-02 02:50:35,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:50:35,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:38,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:50:38,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:50:38,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:50:39,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 02:50:44,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:50:45,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:50:46,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:50:49,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 02:50:49,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:50:53,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:50:53,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:50:55,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 02:50:56,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:51:04,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:51:04,603 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:51:05,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:51:05,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:51:07,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:51:07,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:51:08,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:51:09,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 02:51:10,477 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.04 vs. limit=12.0 2023-10-02 02:51:11,858 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.26 vs. limit=22.5 2023-10-02 02:51:12,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 02:51:12,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:51:17,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:51:17,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:51:22,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:51:22,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 02:51:24,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:51:24,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:51:24,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 02:51:24,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:51:27,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:51:29,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:51:30,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=724880.0, ans=0.125 2023-10-02 02:51:31,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:51:33,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:51:34,475 INFO [train.py:1046] (3/4) Epoch 21, batch 2500, loss[loss=0.1709, simple_loss=0.2436, pruned_loss=0.04911, over 24457.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2497, pruned_loss=0.04928, over 4714708.84 frames. ], batch size: 58, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:51:36,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 02:51:36,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 02:51:42,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:51:49,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:51:51,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:51:52,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:51:52,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 02:51:55,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=725013.3333333334, ans=0.0 2023-10-02 02:51:59,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:51:59,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:52:00,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:52:00,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 02:52:01,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.25 vs. limit=10.0 2023-10-02 02:52:01,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 02:52:03,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:04,790 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.878e+02 2.187e+02 2.689e+02 3.332e+02, threshold=4.374e+02, percent-clipped=0.0 2023-10-02 02:52:04,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:52:04,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 02:52:04,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:06,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 02:52:06,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:52:11,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:52:12,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:52:15,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 02:52:15,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=725080.0, ans=0.125 2023-10-02 02:52:17,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 02:52:17,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:52:17,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:21,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:52:25,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:52:28,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:52:28,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=725146.6666666666, ans=0.125 2023-10-02 02:52:34,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 02:52:35,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 02:52:35,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:52:37,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:52:39,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 02:52:39,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 02:52:40,660 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 02:52:40,660 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 02:52:40,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 02:52:40,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=725213.3333333334, ans=0.0 2023-10-02 02:52:43,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:52:46,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 02:52:46,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 02:52:46,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:52:49,409 INFO [train.py:1046] (3/4) Epoch 21, batch 2550, loss[loss=0.1903, simple_loss=0.2571, pruned_loss=0.06169, over 23715.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2501, pruned_loss=0.04929, over 4720434.22 frames. ], batch size: 179, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:52:49,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 02:52:51,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 02:52:53,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:52:55,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:52:55,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:52:57,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:52:59,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 02:52:59,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:53:03,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 02:53:04,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=725346.6666666666, ans=0.015 2023-10-02 02:53:06,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:53:07,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:08,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:53:09,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 02:53:09,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:53:11,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:53:11,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:53:13,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:53:13,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 02:53:13,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 02:53:13,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:13,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 02:53:25,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 02:53:30,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=725413.3333333334, ans=0.2 2023-10-02 02:53:32,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:53:32,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:32,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:53:32,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 02:53:34,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=725480.0, ans=0.1 2023-10-02 02:53:39,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:53:40,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 02:53:41,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:53:42,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 02:53:42,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 02:53:42,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 02:53:45,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:53:46,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:47,502 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.72 vs. limit=10.0 2023-10-02 02:53:50,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:53:50,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 02:53:50,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:53:50,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:53:51,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 02:53:52,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 02:53:54,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:01,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:54:04,389 INFO [train.py:1046] (3/4) Epoch 21, batch 2600, loss[loss=0.1929, simple_loss=0.269, pruned_loss=0.05837, over 24061.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2512, pruned_loss=0.04965, over 4714564.77 frames. ], batch size: 86, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:54:04,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:05,942 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 02:54:08,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=725613.3333333334, ans=0.0 2023-10-02 02:54:10,008 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 02:54:10,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:54:10,052 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 02:54:10,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 02:54:11,470 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 02:54:12,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:54:12,997 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 02:54:14,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 02:54:16,255 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 02:54:17,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:54:20,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 02:54:21,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 02:54:21,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 02:54:23,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 02:54:24,723 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 02:54:24,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 02:54:29,039 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:54:32,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:54:32,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:32,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:54:32,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 02:54:34,848 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.872e+02 2.087e+02 2.390e+02 4.163e+02, threshold=4.174e+02, percent-clipped=0.0 2023-10-02 02:54:36,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 02:54:41,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.76 vs. limit=12.0 2023-10-02 02:54:42,947 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 02:54:43,930 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.02 vs. limit=15.0 2023-10-02 02:54:47,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:54:48,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:54:48,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 02:54:49,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:54:49,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:54:50,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 02:54:50,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=725813.3333333334, ans=0.1 2023-10-02 02:54:53,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:54:53,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:54:54,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:54:58,601 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 02:54:58,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:54:58,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 02:54:58,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=725813.3333333334, ans=0.125 2023-10-02 02:55:06,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:55:06,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 02:55:06,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 02:55:07,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:55:09,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:55:10,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:55:16,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 02:55:16,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:55:17,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:55:19,854 INFO [train.py:1046] (3/4) Epoch 21, batch 2650, loss[loss=0.1893, simple_loss=0.269, pruned_loss=0.05482, over 23998.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2522, pruned_loss=0.04959, over 4725272.69 frames. ], batch size: 86, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:55:21,634 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 02:55:22,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 02:55:22,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:55:22,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 02:55:24,122 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 02:55:25,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:55:28,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:55:28,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=725946.6666666666, ans=0.2 2023-10-02 02:55:29,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 02:55:32,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:55:34,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:55:35,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 02:55:35,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 02:55:35,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:55:37,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 02:55:38,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=726013.3333333334, ans=0.125 2023-10-02 02:55:40,114 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 02:55:41,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:55:45,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 02:55:45,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:55:45,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=726013.3333333334, ans=0.07 2023-10-02 02:55:46,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 02:55:50,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:55:50,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 02:55:52,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:55:52,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:55:53,786 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.77 vs. limit=10.0 2023-10-02 02:55:58,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 02:55:58,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 02:55:59,127 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.89 vs. limit=12.0 2023-10-02 02:56:01,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:56:04,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 02:56:04,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:56:06,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:06,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:56:07,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:56:07,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:56:08,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:56:10,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:56:11,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:56:12,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:56:13,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 02:56:13,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=726146.6666666666, ans=0.2 2023-10-02 02:56:14,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:16,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:56:17,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:20,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:56:20,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 02:56:23,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:23,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 02:56:23,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:23,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 02:56:28,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:56:28,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:29,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:31,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:32,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 02:56:32,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:33,939 INFO [train.py:1046] (3/4) Epoch 21, batch 2700, loss[loss=0.1642, simple_loss=0.2462, pruned_loss=0.04111, over 24340.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2534, pruned_loss=0.05032, over 4720668.51 frames. ], batch size: 61, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:56:36,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:56:36,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 02:56:37,460 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.36 vs. limit=15.0 2023-10-02 02:56:40,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:56:41,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 02:56:44,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:56:44,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:44,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:56:45,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:56:45,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:56:45,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 02:56:45,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 02:56:45,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 02:56:47,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 02:56:49,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=726346.6666666666, ans=0.125 2023-10-02 02:56:51,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:56:51,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:56:51,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:56:51,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=726346.6666666666, ans=0.0 2023-10-02 02:56:52,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=726346.6666666666, ans=0.07 2023-10-02 02:56:55,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 02:56:56,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 02:56:56,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:57:01,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 02:57:01,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:57:04,043 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.850e+02 2.012e+02 2.260e+02 2.930e+02, threshold=4.023e+02, percent-clipped=0.0 2023-10-02 02:57:05,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:57:05,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:57:05,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 02:57:05,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 02:57:05,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=726413.3333333334, ans=0.0 2023-10-02 02:57:08,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:57:11,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:57:11,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 02:57:11,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:57:16,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:57:16,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 02:57:25,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:57:25,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:57:28,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 02:57:28,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:57:32,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:57:34,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:57:34,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:57:34,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:57:35,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:57:35,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:57:38,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 02:57:40,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:57:40,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:57:45,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 02:57:46,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:57:48,967 INFO [train.py:1046] (3/4) Epoch 21, batch 2750, loss[loss=0.1675, simple_loss=0.2318, pruned_loss=0.05154, over 23459.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2529, pruned_loss=0.05052, over 4713626.54 frames. ], batch size: 285, lr: 4.90e-03, grad_scale: 16.0 2023-10-02 02:57:49,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 02:57:49,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 02:57:49,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 02:57:49,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:57:52,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=726613.3333333334, ans=0.0 2023-10-02 02:57:54,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:57:55,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:57:55,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=726613.3333333334, ans=0.125 2023-10-02 02:57:56,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:57:56,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 02:57:56,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:02,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:58:02,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 02:58:02,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 02:58:02,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=726613.3333333334, ans=0.02 2023-10-02 02:58:02,616 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.14 vs. limit=10.0 2023-10-02 02:58:03,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:03,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 02:58:03,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:58:03,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 02:58:05,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=726680.0, ans=0.125 2023-10-02 02:58:07,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 02:58:09,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:58:09,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:11,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:58:11,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 02:58:12,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:58:14,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:58:14,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:58:14,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:58:18,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 02:58:18,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 02:58:20,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 02:58:21,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:23,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 02:58:30,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:58:33,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 02:58:33,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:58:37,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:58:37,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 02:58:37,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 02:58:43,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 02:58:43,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 02:58:43,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 02:58:49,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:58:51,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 02:58:54,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 02:58:57,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 02:58:57,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 02:58:58,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:59:00,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 02:59:01,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 02:59:01,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 02:59:04,148 INFO [train.py:1046] (3/4) Epoch 21, batch 2800, loss[loss=0.1563, simple_loss=0.2312, pruned_loss=0.04076, over 24446.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2518, pruned_loss=0.04996, over 4711745.14 frames. ], batch size: 58, lr: 4.89e-03, grad_scale: 32.0 2023-10-02 02:59:04,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 02:59:06,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:59:06,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:59:07,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 02:59:07,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:08,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:59:10,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:10,453 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 02:59:10,453 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 02:59:13,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:59:15,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 02:59:15,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 02:59:17,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 02:59:20,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 02:59:22,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 02:59:24,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 02:59:24,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=727013.3333333334, ans=0.125 2023-10-02 02:59:25,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:59:25,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 02:59:26,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:59:28,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=727013.3333333334, ans=0.0 2023-10-02 02:59:29,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:59:29,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 02:59:29,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 02:59:31,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 02:59:34,441 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.948e+02 2.235e+02 2.690e+02 3.655e+02, threshold=4.470e+02, percent-clipped=0.0 2023-10-02 02:59:36,562 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.27 vs. limit=22.5 2023-10-02 02:59:37,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=727080.0, ans=0.125 2023-10-02 02:59:38,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 02:59:40,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 02:59:42,896 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.94 vs. limit=15.0 2023-10-02 02:59:43,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:44,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 02:59:46,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 02:59:47,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=727146.6666666666, ans=0.125 2023-10-02 02:59:49,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 02:59:49,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 02:59:49,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:59:51,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 02:59:51,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 02:59:51,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=727146.6666666666, ans=0.0 2023-10-02 02:59:55,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 02:59:56,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 02:59:58,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:00:01,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:00:01,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:00:01,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:00:02,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:00:02,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:00:04,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:00:04,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 03:00:04,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:07,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:00:07,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:07,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 03:00:07,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:00:07,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:00:09,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:00:09,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 03:00:14,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:00:14,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:00:16,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:00:18,737 INFO [train.py:1046] (3/4) Epoch 21, batch 2850, loss[loss=0.1703, simple_loss=0.2406, pruned_loss=0.04999, over 23584.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2513, pruned_loss=0.04959, over 4711545.69 frames. ], batch size: 135, lr: 4.89e-03, grad_scale: 16.0 2023-10-02 03:00:18,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:00:23,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:00:23,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:00:23,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:00:26,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:00:26,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:00:28,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:00:29,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 03:00:37,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 03:00:37,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:00:38,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 03:00:40,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:41,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 03:00:43,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 03:00:44,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:00:54,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:00:56,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:00:56,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:00:58,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:00:58,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:00:58,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:00:59,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:00:59,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 03:01:01,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:01:01,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:01:03,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:01:03,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:05,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:01:05,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:01:07,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:08,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:01:10,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:01:11,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:13,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:16,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:01:20,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:01:22,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 03:01:22,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 03:01:23,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:01:23,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=727546.6666666666, ans=0.1 2023-10-02 03:01:25,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:01:25,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 03:01:25,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:01:26,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:01:26,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:01:26,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:01:26,664 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 03:01:27,969 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 03:01:27,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:01:28,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:30,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=727546.6666666666, ans=0.0 2023-10-02 03:01:32,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 03:01:32,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:01:32,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=727613.3333333334, ans=0.125 2023-10-02 03:01:33,961 INFO [train.py:1046] (3/4) Epoch 21, batch 2900, loss[loss=0.1368, simple_loss=0.2162, pruned_loss=0.0287, over 24281.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2515, pruned_loss=0.0499, over 4700872.95 frames. ], batch size: 56, lr: 4.89e-03, grad_scale: 16.0 2023-10-02 03:01:34,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:01:35,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 03:01:39,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:40,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 03:01:40,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=727613.3333333334, ans=0.0 2023-10-02 03:01:41,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 03:01:43,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:01:43,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:01:44,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:01:46,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:01:46,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.04 vs. limit=10.0 2023-10-02 03:01:49,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:01:49,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:01:52,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:01:52,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 03:01:53,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:01:54,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=727680.0, ans=0.125 2023-10-02 03:01:55,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:01:56,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 03:01:56,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 03:01:57,503 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.56 vs. limit=22.5 2023-10-02 03:01:59,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:01:59,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 03:01:59,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:02:03,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:02:03,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 03:02:06,016 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 1.860e+02 2.101e+02 2.423e+02 4.328e+02, threshold=4.202e+02, percent-clipped=0.0 2023-10-02 03:02:06,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:02:07,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:02:11,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:02:15,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:02:16,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 03:02:16,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 03:02:16,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:02:19,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:02:21,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 03:02:22,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:02:26,346 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.13 vs. limit=15.0 2023-10-02 03:02:26,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:02:37,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:02:37,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:02:37,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 03:02:41,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:02:43,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 03:02:43,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:02:43,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:02:49,261 INFO [train.py:1046] (3/4) Epoch 21, batch 2950, loss[loss=0.2487, simple_loss=0.3027, pruned_loss=0.09739, over 19620.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2522, pruned_loss=0.05009, over 4702753.34 frames. ], batch size: 388, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:02:49,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:02:50,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 03:02:50,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:02:50,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:02:52,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:02:55,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:02:55,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 03:02:56,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 03:02:58,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:02:58,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:02:59,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=727946.6666666666, ans=0.125 2023-10-02 03:03:01,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=727946.6666666666, ans=0.125 2023-10-02 03:03:04,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:03:05,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:03:07,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:03:09,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:03:13,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:03:13,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:03:14,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:03:16,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:03:16,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:03:17,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 03:03:23,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 03:03:24,010 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 03:03:25,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:03:26,711 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 03:03:28,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 03:03:28,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:03:28,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:03:28,662 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 03:03:28,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:03:31,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 03:03:32,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:03:32,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:03:34,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:03:37,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:03:37,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:03:37,153 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 03:03:37,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:03:39,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 03:03:43,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:03:44,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:03:44,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 03:03:44,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:03:46,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 03:03:49,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:03:49,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:03:50,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:03:52,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:03:52,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 03:03:53,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:03:54,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:03:54,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:03:54,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:03:55,242 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:03:56,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:03:56,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:03:58,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:03:58,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 03:03:59,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:04:01,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:04:01,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:04:04,054 INFO [train.py:1046] (3/4) Epoch 21, batch 3000, loss[loss=0.1885, simple_loss=0.2575, pruned_loss=0.05973, over 22806.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2531, pruned_loss=0.05032, over 4714738.21 frames. ], batch size: 322, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:04:04,054 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 03:04:18,545 INFO [train.py:1078] (3/4) Epoch 21, validation: loss=0.3071, simple_loss=0.2764, pruned_loss=0.1689, over 1125622.00 frames. 2023-10-02 03:04:18,546 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 20808MB 2023-10-02 03:04:20,014 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 03:04:20,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 03:04:24,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:04:24,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:04:24,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 03:04:25,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:04:31,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:04:36,665 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.65 vs. limit=6.0 2023-10-02 03:04:40,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:04:45,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=728346.6666666666, ans=0.125 2023-10-02 03:04:46,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 03:04:46,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:04:51,184 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.800e+02 1.986e+02 2.235e+02 3.212e+02, threshold=3.971e+02, percent-clipped=0.0 2023-10-02 03:04:51,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:04:51,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:04:51,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:04:54,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:04:54,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 03:04:55,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 03:04:58,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:04:58,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:05:01,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:05:01,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:05:01,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:01,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:05:04,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=728480.0, ans=0.1 2023-10-02 03:05:05,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=728480.0, ans=0.2 2023-10-02 03:05:07,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:05:07,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:05:07,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:05:09,134 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.36 vs. limit=15.0 2023-10-02 03:05:09,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:05:12,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 03:05:14,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:05:14,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:14,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:05:15,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:17,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:18,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 03:05:18,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 03:05:19,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:05:20,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 03:05:20,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:05:22,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 03:05:25,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:05:27,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:05:27,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 03:05:27,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 03:05:27,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 03:05:27,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:05:28,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:05:28,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:05:28,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:30,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:05:32,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 03:05:33,387 INFO [train.py:1046] (3/4) Epoch 21, batch 3050, loss[loss=0.1812, simple_loss=0.251, pruned_loss=0.05573, over 23748.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2543, pruned_loss=0.05093, over 4710885.67 frames. ], batch size: 150, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:05:33,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:05:36,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:05:37,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:05:39,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=728613.3333333334, ans=0.1 2023-10-02 03:05:42,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:45,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 03:05:48,880 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.84 vs. limit=6.0 2023-10-02 03:05:49,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 03:05:51,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 03:05:51,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:05:52,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.27 vs. limit=15.0 2023-10-02 03:05:55,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:05:59,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:05:59,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:05:59,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:06:03,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:06:04,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:06:04,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:06:05,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:06:05,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:06:07,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:06:08,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:06:10,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:06:10,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 03:06:11,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:06:11,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:06:16,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:06:17,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:06:17,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:06:17,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:17,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=728813.3333333334, ans=0.125 2023-10-02 03:06:22,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:06:23,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:26,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=728813.3333333334, ans=0.125 2023-10-02 03:06:29,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:06:29,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:06:29,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:06:31,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:06:32,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:06:32,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:06:34,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 03:06:36,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:06:36,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:06:37,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 03:06:38,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:46,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:06:47,534 INFO [train.py:1046] (3/4) Epoch 21, batch 3100, loss[loss=0.1828, simple_loss=0.2523, pruned_loss=0.05662, over 23445.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2535, pruned_loss=0.05056, over 4718887.53 frames. ], batch size: 119, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:06:49,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:06:51,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:06:52,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 03:06:53,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 03:06:56,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 03:06:56,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:06:59,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:06:59,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:03,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 03:07:07,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:11,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 03:07:16,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:07:16,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:16,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:07:16,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:07:17,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=729080.0, ans=0.2 2023-10-02 03:07:18,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 03:07:21,274 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.950e+02 2.237e+02 2.681e+02 4.003e+02, threshold=4.474e+02, percent-clipped=0.0 2023-10-02 03:07:21,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:07:21,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 03:07:21,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:07:22,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:24,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 03:07:25,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:07:25,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=729080.0, ans=0.125 2023-10-02 03:07:28,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:07:29,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 03:07:31,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 03:07:31,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:33,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:07:35,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:07:37,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:37,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:07:38,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:07:38,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:07:40,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:07:40,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:07:40,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:40,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:07:42,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=729146.6666666666, ans=0.0 2023-10-02 03:07:46,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:07:46,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 03:07:48,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=729213.3333333334, ans=0.125 2023-10-02 03:07:49,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:07:49,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 03:07:49,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:07:50,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:07:51,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 03:08:01,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 03:08:02,735 INFO [train.py:1046] (3/4) Epoch 21, batch 3150, loss[loss=0.191, simple_loss=0.2712, pruned_loss=0.05542, over 23947.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2527, pruned_loss=0.04995, over 4728728.02 frames. ], batch size: 86, lr: 4.89e-03, grad_scale: 8.0 2023-10-02 03:08:02,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:04,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:08:06,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:08:07,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:08:07,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 03:08:08,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:08,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 03:08:10,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 03:08:11,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:08:13,692 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 03:08:15,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 03:08:15,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:08:17,180 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 03:08:17,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 03:08:18,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 03:08:20,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 03:08:20,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 03:08:20,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:08:20,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:08:21,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:08:21,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 03:08:23,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:24,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:08:24,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:08:27,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:08:28,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.10 vs. limit=10.0 2023-10-02 03:08:31,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 03:08:31,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:08:34,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:08:34,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:08:35,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 03:08:37,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 03:08:39,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:08:39,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:08:39,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:08:40,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:08:40,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:08:42,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:08:42,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:08:43,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 03:08:45,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:08:45,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:08:47,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:08:47,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:08:48,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 03:08:50,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:08:51,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 03:08:51,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:08:53,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 03:08:53,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 03:08:54,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:08:55,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:08:57,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 03:08:57,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 03:08:58,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:09:01,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:09:02,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:02,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:09:07,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:09:07,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:10,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 03:09:16,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:09:16,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 03:09:17,967 INFO [train.py:1046] (3/4) Epoch 21, batch 3200, loss[loss=0.1915, simple_loss=0.2688, pruned_loss=0.05712, over 23761.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2518, pruned_loss=0.04974, over 4726629.26 frames. ], batch size: 85, lr: 4.89e-03, grad_scale: 16.0 2023-10-02 03:09:20,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:22,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:09:22,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 03:09:24,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:09:25,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=729613.3333333334, ans=0.125 2023-10-02 03:09:29,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:09:32,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:09:32,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=729680.0, ans=0.0 2023-10-02 03:09:38,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:09:48,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=729746.6666666666, ans=0.125 2023-10-02 03:09:49,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 03:09:50,801 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 1.933e+02 2.083e+02 2.478e+02 3.380e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-02 03:09:50,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:09:53,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=729746.6666666666, ans=0.2 2023-10-02 03:09:54,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 03:09:55,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:09:58,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=729746.6666666666, ans=0.125 2023-10-02 03:09:59,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:09:59,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:10:00,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:10:05,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 03:10:06,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=729813.3333333334, ans=0.125 2023-10-02 03:10:07,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 03:10:08,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 03:10:08,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=729813.3333333334, ans=0.1 2023-10-02 03:10:10,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=729813.3333333334, ans=0.1 2023-10-02 03:10:11,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 03:10:13,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:10:18,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:10:20,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:10:20,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:10:20,799 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 03:10:20,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:10:23,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:10:25,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 03:10:26,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 03:10:27,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 03:10:29,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 03:10:30,403 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.59 vs. limit=15.0 2023-10-02 03:10:31,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:10:32,611 INFO [train.py:1046] (3/4) Epoch 21, batch 3250, loss[loss=0.1745, simple_loss=0.2517, pruned_loss=0.04864, over 23609.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2511, pruned_loss=0.04955, over 4722608.62 frames. ], batch size: 149, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:10:33,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:10:34,000 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 03:10:34,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:10:35,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:35,427 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 03:10:38,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:10:41,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:10:48,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:10:48,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 03:10:48,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=730013.3333333334, ans=0.0 2023-10-02 03:10:51,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:10:51,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:10:51,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:10:52,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=730013.3333333334, ans=0.2 2023-10-02 03:10:53,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:10:53,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:10:56,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:56,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:10:56,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:10:57,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:57,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:10:57,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:11:01,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:02,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:11:04,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:11:04,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:11:05,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:11:07,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:11:07,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:11:11,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 03:11:12,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:11:12,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:11:13,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:11:14,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:11:19,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:11:25,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:11:26,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:26,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 03:11:26,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:11:26,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 03:11:26,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:29,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 03:11:29,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 03:11:30,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=730213.3333333334, ans=0.125 2023-10-02 03:11:31,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:11:31,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:11:33,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:11:33,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 03:11:35,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:11:37,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:11:37,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:11:39,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 03:11:39,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:11:41,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:11:41,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 03:11:44,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:11:44,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 03:11:46,120 INFO [train.py:1046] (3/4) Epoch 21, batch 3300, loss[loss=0.1708, simple_loss=0.2614, pruned_loss=0.04012, over 24478.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2524, pruned_loss=0.04982, over 4731006.79 frames. ], batch size: 69, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:11:46,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 03:11:46,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=730280.0, ans=0.1 2023-10-02 03:11:47,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 03:11:49,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:11:54,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:11:55,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:11:55,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:11:58,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 03:11:58,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:12:01,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:03,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:12:08,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 03:12:08,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:12:08,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:09,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:09,803 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 03:12:11,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:12:12,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:12:13,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:12:13,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:12:15,106 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 03:12:17,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:12:17,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:12:19,241 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.870e+02 2.047e+02 2.284e+02 2.829e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-02 03:12:20,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:20,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 03:12:22,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 03:12:22,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:23,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:12:24,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=730413.3333333334, ans=0.125 2023-10-02 03:12:26,614 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 03:12:27,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 03:12:30,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:12:31,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 03:12:32,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:12:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:12:37,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:12:39,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:12:40,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:40,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:12:41,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:12:43,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:12:43,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:45,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:12:46,429 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 03:12:47,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 03:12:49,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:12:49,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:12:49,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:12:50,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:12:50,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:12:52,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:12:52,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:12:53,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:12:54,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:12:54,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:12:55,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=730546.6666666666, ans=0.1 2023-10-02 03:12:55,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.35 vs. limit=22.5 2023-10-02 03:12:58,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 03:12:58,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:00,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:01,442 INFO [train.py:1046] (3/4) Epoch 21, batch 3350, loss[loss=0.182, simple_loss=0.2547, pruned_loss=0.05467, over 23853.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.253, pruned_loss=0.04979, over 4734810.59 frames. ], batch size: 195, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:13:02,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:13:02,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:13:05,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:13:05,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:13:05,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:10,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:13:11,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:12,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:13:13,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:16,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:13:17,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:13:18,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:13:20,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 03:13:21,694 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 03:13:21,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:13:24,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 03:13:24,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 03:13:26,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:13:26,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:13:28,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:13:28,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 03:13:28,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:30,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:13:32,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:33,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:35,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:35,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:13:38,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:13:39,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:39,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:13:43,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:13:44,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:13:47,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:13:47,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:13:49,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:13:51,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 03:13:51,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:13:51,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 03:13:51,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:13:53,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 03:13:53,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:13:54,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:14:01,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=730880.0, ans=0.125 2023-10-02 03:14:02,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:14:04,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 03:14:05,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:14:06,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:14:06,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:14:11,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:14:14,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 03:14:14,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:14:14,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:14:15,766 INFO [train.py:1046] (3/4) Epoch 21, batch 3400, loss[loss=0.1827, simple_loss=0.2419, pruned_loss=0.06173, over 22604.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2542, pruned_loss=0.05046, over 4736122.06 frames. ], batch size: 322, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:14:16,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=730946.6666666666, ans=0.1 2023-10-02 03:14:17,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:14:18,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 03:14:19,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:14:19,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 03:14:21,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:14:21,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:14:21,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:14:22,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:14:22,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 03:14:26,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 03:14:26,841 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 03:14:26,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:14:27,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=730946.6666666666, ans=0.1 2023-10-02 03:14:32,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:14:32,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:14:33,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:14:34,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:14:39,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:14:40,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 03:14:48,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:14:48,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:14:49,938 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.878e+02 1.982e+02 2.218e+02 2.946e+02, threshold=3.964e+02, percent-clipped=0.0 2023-10-02 03:14:50,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:14:50,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 03:14:56,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:15:01,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 03:15:05,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:15:06,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:15:06,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 03:15:06,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:15:07,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:15:08,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:15:08,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:15:11,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:15:15,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:15:15,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:15:18,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:15:21,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 03:15:26,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:15:30,851 INFO [train.py:1046] (3/4) Epoch 21, batch 3450, loss[loss=0.1774, simple_loss=0.2671, pruned_loss=0.04382, over 24020.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.253, pruned_loss=0.05073, over 4712699.67 frames. ], batch size: 80, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:15:30,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 03:15:36,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 03:15:36,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:15:37,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:15:37,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 03:15:39,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:15:40,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:15:46,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:15:46,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:15:48,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:15:48,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:15:49,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:15:55,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 03:16:01,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 03:16:02,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:16:02,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:16:04,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:16:09,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 03:16:09,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:16:14,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:16:15,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:16:15,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:16:17,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:16:20,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 03:16:20,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:16:22,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:16:25,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:16:26,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 03:16:29,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:16:35,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:16:36,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:16:38,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:16:42,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:16:42,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:16:44,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:16:44,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:16:46,173 INFO [train.py:1046] (3/4) Epoch 21, batch 3500, loss[loss=0.1583, simple_loss=0.2453, pruned_loss=0.03564, over 24440.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.251, pruned_loss=0.05023, over 4700813.40 frames. ], batch size: 69, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:16:48,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:16:52,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:16:52,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 03:16:53,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:16:56,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:16:59,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:16:59,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 03:17:02,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=731680.0, ans=0.035 2023-10-02 03:17:05,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:17:06,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:17:06,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:17:06,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:17:06,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:17:08,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:08,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:17:08,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 03:17:11,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:11,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=731680.0, ans=0.125 2023-10-02 03:17:12,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 03:17:14,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:17:17,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:18,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 03:17:18,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:17:20,267 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.924e+02 2.111e+02 2.557e+02 4.190e+02, threshold=4.222e+02, percent-clipped=1.0 2023-10-02 03:17:21,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:17:23,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:17:23,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:23,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=731746.6666666666, ans=0.1 2023-10-02 03:17:25,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:17:26,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:17:26,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 03:17:28,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 03:17:29,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 03:17:30,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:17:31,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:32,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:17:32,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:17:34,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 03:17:34,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=731813.3333333334, ans=0.125 2023-10-02 03:17:35,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:17:35,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=731813.3333333334, ans=0.125 2023-10-02 03:17:40,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.56 vs. limit=15.0 2023-10-02 03:17:41,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:17:41,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 03:17:41,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 03:17:41,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:17:46,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:17:46,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:17:47,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:50,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 03:17:51,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:17:53,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:17:53,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 03:17:54,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-10-02 03:17:56,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 03:17:58,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:17:58,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:18:00,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:00,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:01,523 INFO [train.py:1046] (3/4) Epoch 21, batch 3550, loss[loss=0.1612, simple_loss=0.2343, pruned_loss=0.04403, over 24426.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2494, pruned_loss=0.05018, over 4696759.96 frames. ], batch size: 58, lr: 4.88e-03, grad_scale: 8.0 2023-10-02 03:18:03,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:18:13,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:14,397 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.73 vs. limit=22.5 2023-10-02 03:18:15,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 03:18:16,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:18:20,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:18:21,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:21,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:18:21,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:18:24,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:18:26,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:18:27,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:27,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 03:18:29,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:18:33,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:18:33,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:18:36,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:18:36,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:18:36,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:18:37,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 03:18:37,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:39,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:18:39,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=732080.0, ans=0.025 2023-10-02 03:18:40,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 03:18:45,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:18:46,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:18:47,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:18:49,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 03:18:49,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:18:50,205 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:18:51,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 03:18:53,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:18:54,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:18:54,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:18:57,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 03:18:58,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:19:02,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=732213.3333333334, ans=0.04949747468305833 2023-10-02 03:19:05,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:19:06,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 03:19:06,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:19:07,367 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.15 vs. limit=22.5 2023-10-02 03:19:10,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:19:12,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 03:19:13,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=732213.3333333334, ans=0.1 2023-10-02 03:19:17,030 INFO [train.py:1046] (3/4) Epoch 21, batch 3600, loss[loss=0.1611, simple_loss=0.238, pruned_loss=0.04207, over 24631.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2499, pruned_loss=0.04983, over 4711454.89 frames. ], batch size: 60, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:19:17,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 03:19:18,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:19:19,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:19:19,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=732280.0, ans=0.125 2023-10-02 03:19:21,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:19:22,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:19:22,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:19:26,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:19:27,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:27,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:19:28,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:19:30,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:30,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 03:19:33,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:19:34,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:39,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:19:43,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:19:44,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:19:44,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:19:44,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 03:19:45,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:19:47,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:19:48,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:19:50,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:19:51,257 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.800e+02 2.043e+02 2.517e+02 4.317e+02, threshold=4.086e+02, percent-clipped=1.0 2023-10-02 03:19:52,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:19:54,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:19:54,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 03:20:00,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:20:01,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:20:03,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 03:20:08,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:20:14,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:20:16,213 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.32 vs. limit=15.0 2023-10-02 03:20:18,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:20:24,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:20:24,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=732546.6666666666, ans=0.125 2023-10-02 03:20:25,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:20:25,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 03:20:25,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 03:20:27,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 03:20:28,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:20:28,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:20:30,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 03:20:31,684 INFO [train.py:1046] (3/4) Epoch 21, batch 3650, loss[loss=0.1895, simple_loss=0.255, pruned_loss=0.06205, over 22879.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2501, pruned_loss=0.04926, over 4714076.88 frames. ], batch size: 322, lr: 4.88e-03, grad_scale: 16.0 2023-10-02 03:20:31,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:20:31,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:20:31,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:20:31,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 03:20:32,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=732613.3333333334, ans=0.125 2023-10-02 03:20:33,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 03:20:36,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:20:37,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 03:20:39,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.86 vs. limit=8.0 2023-10-02 03:20:42,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 03:20:44,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:20:47,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 03:20:49,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 03:20:53,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:20:53,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:20:53,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:20:54,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 03:20:54,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:20:56,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 03:20:57,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:20:57,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:20:59,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 03:21:00,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:21:00,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:21:00,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:21:03,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:21:04,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 03:21:05,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=732746.6666666666, ans=0.125 2023-10-02 03:21:05,489 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.60 vs. limit=22.5 2023-10-02 03:21:06,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 03:21:06,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:21:08,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 03:21:08,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=732746.6666666666, ans=0.1 2023-10-02 03:21:09,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=732746.6666666666, ans=0.125 2023-10-02 03:21:10,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:21:10,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:21:15,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:21:17,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:21:17,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:21:20,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:21:20,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:21:23,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:21:25,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=732813.3333333334, ans=0.2 2023-10-02 03:21:26,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:21:26,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:21:26,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:21:28,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:21:29,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:21:29,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:21:35,180 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 03:21:35,913 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.12 vs. limit=6.0 2023-10-02 03:21:38,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:21:38,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:21:39,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:21:40,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:21:41,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:21:43,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:21:45,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 03:21:45,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:21:47,728 INFO [train.py:1046] (3/4) Epoch 21, batch 3700, loss[loss=0.1796, simple_loss=0.249, pruned_loss=0.05509, over 23622.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2516, pruned_loss=0.04972, over 4705931.56 frames. ], batch size: 256, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:21:49,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:21:52,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:21:52,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:21:55,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:21:55,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 03:21:55,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:21:56,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:21:56,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:21:59,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:22:02,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=733013.3333333334, ans=0.125 2023-10-02 03:22:03,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:22:05,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:22:05,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:22:05,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:22:06,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:22:06,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=733013.3333333334, ans=0.125 2023-10-02 03:22:09,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:22:09,365 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 03:22:17,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:22:17,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:22:19,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:22:19,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 03:22:19,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:22:22,359 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.917e+02 2.171e+02 2.489e+02 3.807e+02, threshold=4.342e+02, percent-clipped=0.0 2023-10-02 03:22:23,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:22:25,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 03:22:26,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:22:26,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:22:29,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:22:29,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:22:32,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:22:35,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:22:35,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 03:22:35,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:22:35,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 03:22:40,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:22:42,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:22:44,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:22:44,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 03:22:44,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=733146.6666666666, ans=0.04949747468305833 2023-10-02 03:22:47,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:22:47,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:22:48,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:22:48,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:22:51,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:22:51,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 03:22:53,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 03:22:54,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:22:54,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:22:56,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:22:56,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:23:00,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:23:01,485 INFO [train.py:1046] (3/4) Epoch 21, batch 3750, loss[loss=0.1676, simple_loss=0.2505, pruned_loss=0.04236, over 24653.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2523, pruned_loss=0.05039, over 4702381.21 frames. ], batch size: 65, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:23:01,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:23:01,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:23:04,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 03:23:06,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 03:23:09,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 03:23:09,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 03:23:10,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:23:11,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:23:12,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:23:14,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:23:17,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:23:20,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:23:20,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:23:23,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:23:26,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:23:26,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 03:23:26,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:23:27,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:23:27,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:23:29,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=733346.6666666666, ans=0.5 2023-10-02 03:23:31,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 03:23:35,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 03:23:36,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:23:37,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:23:39,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:23:43,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:23:46,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 03:23:51,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 03:23:53,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=733480.0, ans=0.07 2023-10-02 03:23:54,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:23:57,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:23:57,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:24:01,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:24:04,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 03:24:05,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:24:07,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:24:09,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:24:13,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:24:16,556 INFO [train.py:1046] (3/4) Epoch 21, batch 3800, loss[loss=0.1505, simple_loss=0.2255, pruned_loss=0.03772, over 24363.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2518, pruned_loss=0.04998, over 4704725.71 frames. ], batch size: 56, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:24:19,078 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.02 vs. limit=15.0 2023-10-02 03:24:21,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:24:24,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:24:26,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 03:24:27,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 03:24:28,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:24:31,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:24:31,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 03:24:33,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=733680.0, ans=0.1 2023-10-02 03:24:34,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 03:24:34,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:24:34,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:24:35,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:24:36,846 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.70 vs. limit=10.0 2023-10-02 03:24:37,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:24:37,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:24:37,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 03:24:40,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 03:24:42,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:24:46,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:24:48,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:24:49,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:24:51,635 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.912e+02 2.251e+02 2.632e+02 4.326e+02, threshold=4.502e+02, percent-clipped=0.0 2023-10-02 03:24:51,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:24:51,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:24:53,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:24:55,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:24:55,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=733746.6666666666, ans=0.0 2023-10-02 03:24:59,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:24:59,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 03:25:00,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:25:00,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=733813.3333333334, ans=0.1 2023-10-02 03:25:00,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=733813.3333333334, ans=0.125 2023-10-02 03:25:02,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.71 vs. limit=12.0 2023-10-02 03:25:06,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:25:06,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=733813.3333333334, ans=0.0 2023-10-02 03:25:06,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=733813.3333333334, ans=0.0 2023-10-02 03:25:10,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=733813.3333333334, ans=15.0 2023-10-02 03:25:11,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:25:12,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 03:25:14,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 03:25:15,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:25:17,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=733880.0, ans=0.0 2023-10-02 03:25:18,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:25:19,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:25:21,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 03:25:24,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 03:25:24,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 03:25:24,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:25:26,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:25:30,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:25:30,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:25:32,226 INFO [train.py:1046] (3/4) Epoch 21, batch 3850, loss[loss=0.1525, simple_loss=0.2333, pruned_loss=0.03581, over 24322.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2509, pruned_loss=0.04989, over 4706117.22 frames. ], batch size: 56, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:25:32,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=733946.6666666666, ans=0.2 2023-10-02 03:25:36,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:25:37,230 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.25 vs. limit=22.5 2023-10-02 03:25:38,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 03:25:38,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:25:38,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=733946.6666666666, ans=0.125 2023-10-02 03:25:39,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:25:42,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:25:47,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:25:48,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:25:48,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 03:25:49,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=734013.3333333334, ans=0.125 2023-10-02 03:25:56,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:25:57,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:25:58,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=734013.3333333334, ans=0.125 2023-10-02 03:26:00,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=734080.0, ans=0.125 2023-10-02 03:26:01,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:26:01,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:26:03,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=734080.0, ans=0.125 2023-10-02 03:26:04,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:04,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:26:06,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:06,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:26:07,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:08,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:09,597 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.54 vs. limit=10.0 2023-10-02 03:26:10,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:10,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:26:10,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 03:26:10,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 03:26:12,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:26:12,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:15,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:15,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:15,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 03:26:16,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=734146.6666666666, ans=0.015 2023-10-02 03:26:18,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 03:26:19,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:23,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 03:26:24,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 03:26:30,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:30,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:26:33,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:34,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 03:26:36,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 03:26:40,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:40,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:43,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:26:43,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:26:43,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:44,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:44,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:26:44,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 03:26:45,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:26:46,325 INFO [train.py:1046] (3/4) Epoch 21, batch 3900, loss[loss=0.1836, simple_loss=0.2604, pruned_loss=0.05338, over 23446.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2495, pruned_loss=0.0494, over 4709864.52 frames. ], batch size: 105, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:26:48,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 03:26:48,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:48,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:49,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:26:49,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:51,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:26:51,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:26:51,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:26:53,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:26:53,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 03:26:53,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:26:54,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:26:56,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:26:56,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:26:56,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=734280.0, ans=0.1 2023-10-02 03:26:57,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:27:02,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:27:02,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:27:03,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:27:04,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 03:27:06,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:27:06,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 03:27:06,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:27:06,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=734346.6666666666, ans=0.1 2023-10-02 03:27:09,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 03:27:09,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 03:27:15,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:27:16,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:27:16,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:27:18,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:27:21,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:27:22,591 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.899e+02 2.124e+02 2.634e+02 1.113e+03, threshold=4.247e+02, percent-clipped=1.0 2023-10-02 03:27:24,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:27:25,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=734413.3333333334, ans=0.125 2023-10-02 03:27:26,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:27:26,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:27:27,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:27:27,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=734413.3333333334, ans=0.125 2023-10-02 03:27:29,499 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.25 vs. limit=15.0 2023-10-02 03:27:33,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:27:33,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:27:40,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:27:40,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:27:51,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:27:53,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:27:53,786 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.60 vs. limit=22.5 2023-10-02 03:27:54,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 03:27:54,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 03:27:56,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:27:56,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 03:27:58,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:27:59,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 03:28:02,207 INFO [train.py:1046] (3/4) Epoch 21, batch 3950, loss[loss=0.1676, simple_loss=0.2389, pruned_loss=0.04818, over 23270.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2487, pruned_loss=0.04926, over 4706964.63 frames. ], batch size: 105, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:28:05,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:28:07,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 03:28:07,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:28:10,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:28:12,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:28:12,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=734613.3333333334, ans=0.125 2023-10-02 03:28:16,179 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 03:28:17,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:28:17,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 03:28:19,290 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 03:28:19,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:28:21,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:28:21,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:28:21,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:28:25,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 03:28:26,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:28:28,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:28:28,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:28:28,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:28:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:28:38,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=734746.6666666666, ans=0.125 2023-10-02 03:28:39,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:28:40,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:28:44,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 03:28:44,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=734746.6666666666, ans=0.125 2023-10-02 03:28:51,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 03:28:51,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 03:28:51,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:28:55,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:29:00,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=734813.3333333334, ans=0.0 2023-10-02 03:29:02,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:29:02,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:29:03,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:29:03,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:29:03,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 03:29:04,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=734880.0, ans=0.02 2023-10-02 03:29:07,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:29:08,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:29:13,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 03:29:17,014 INFO [train.py:1046] (3/4) Epoch 21, batch 4000, loss[loss=0.1817, simple_loss=0.2529, pruned_loss=0.05519, over 23788.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2495, pruned_loss=0.04893, over 4718812.28 frames. ], batch size: 212, lr: 4.87e-03, grad_scale: 16.0 2023-10-02 03:29:17,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=734946.6666666666, ans=0.2 2023-10-02 03:29:19,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=734946.6666666666, ans=0.125 2023-10-02 03:29:20,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=734946.6666666666, ans=0.1 2023-10-02 03:29:23,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:29:29,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:29:30,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=734946.6666666666, ans=0.125 2023-10-02 03:29:32,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=735013.3333333334, ans=0.125 2023-10-02 03:29:35,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:29:35,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:29:36,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:29:36,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 03:29:38,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:29:38,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 03:29:38,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:29:38,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 03:29:41,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:29:43,860 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.55 vs. limit=10.0 2023-10-02 03:29:44,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:29:44,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:29:44,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:29:44,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:29:44,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 03:29:47,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:29:48,565 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 03:29:48,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=735080.0, ans=0.1 2023-10-02 03:29:49,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:29:50,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:29:53,492 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 03:29:54,649 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.771e+02 2.002e+02 2.200e+02 3.082e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-02 03:29:54,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:29:54,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:29:59,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 03:30:01,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:30:03,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:30:04,629 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 03:30:05,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:30:06,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 03:30:06,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:30:07,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:30:09,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:30:10,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:30:10,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:30:10,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:30:13,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 03:30:13,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:30:16,142 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 03:30:18,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=735213.3333333334, ans=0.1 2023-10-02 03:30:19,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:30:19,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=735213.3333333334, ans=0.125 2023-10-02 03:30:20,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=735213.3333333334, ans=0.125 2023-10-02 03:30:22,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 03:30:25,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:30:25,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:30:26,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:30:26,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=735213.3333333334, ans=0.125 2023-10-02 03:30:28,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:30:33,398 INFO [train.py:1046] (3/4) Epoch 21, batch 4050, loss[loss=0.1871, simple_loss=0.2579, pruned_loss=0.05809, over 22745.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2501, pruned_loss=0.04918, over 4722519.51 frames. ], batch size: 322, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:30:34,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:30:36,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:30:37,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 03:30:39,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:30:39,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=735280.0, ans=0.1 2023-10-02 03:30:40,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:30:40,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:30:41,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:30:43,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:30:46,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:30:49,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:30:50,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 03:30:52,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:30:52,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:30:57,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:30:57,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=735346.6666666666, ans=0.0 2023-10-02 03:30:58,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:31:01,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 03:31:02,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 03:31:02,778 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 03:31:05,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:31:06,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=735413.3333333334, ans=0.0 2023-10-02 03:31:12,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 03:31:12,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:31:15,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:31:18,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:31:19,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:31:19,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:31:21,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=735480.0, ans=0.0 2023-10-02 03:31:22,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:31:27,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 03:31:27,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:31:28,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:31:29,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 03:31:32,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:31:37,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=735546.6666666666, ans=0.125 2023-10-02 03:31:37,535 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.28 vs. limit=12.0 2023-10-02 03:31:37,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.71 vs. limit=15.0 2023-10-02 03:31:40,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 03:31:42,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:31:42,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:31:43,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 03:31:43,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 03:31:43,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:31:46,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=735546.6666666666, ans=0.0 2023-10-02 03:31:47,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:31:48,519 INFO [train.py:1046] (3/4) Epoch 21, batch 4100, loss[loss=0.2393, simple_loss=0.2951, pruned_loss=0.09171, over 19747.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2514, pruned_loss=0.04975, over 4723378.59 frames. ], batch size: 389, lr: 4.87e-03, grad_scale: 8.0 2023-10-02 03:31:48,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:31:48,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:31:54,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 03:31:54,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=735613.3333333334, ans=0.0 2023-10-02 03:31:57,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 03:31:57,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 03:31:59,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 03:31:59,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:32:00,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:00,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:02,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:32:02,105 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 03:32:06,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:32:06,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:32:06,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:32:06,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:32:10,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:32:10,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=735680.0, ans=0.125 2023-10-02 03:32:11,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:32:12,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:32:12,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 03:32:13,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:13,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:32:14,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:32:14,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:32:15,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 03:32:17,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:32:17,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=735746.6666666666, ans=0.125 2023-10-02 03:32:18,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 03:32:20,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:32:23,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:32:23,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 03:32:25,651 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.845e+02 2.063e+02 2.238e+02 3.608e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-02 03:32:25,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:32:25,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:32:27,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:32:28,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 03:32:30,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:32:32,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:32:34,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 03:32:35,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:32:35,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:32:39,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:32:45,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:32:47,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=735880.0, ans=0.0 2023-10-02 03:32:48,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:32:50,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:32:55,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:32:55,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:32:58,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:33:00,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:33:03,329 INFO [train.py:1046] (3/4) Epoch 21, batch 4150, loss[loss=0.1603, simple_loss=0.24, pruned_loss=0.04033, over 24373.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2525, pruned_loss=0.05018, over 4716244.65 frames. ], batch size: 61, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:33:04,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:33:06,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:33:08,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:33:08,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:33:11,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 03:33:11,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:33:12,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 03:33:12,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 03:33:14,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 03:33:14,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:33:18,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:33:18,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:33:23,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:33:23,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:33:24,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:33:26,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:33:26,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:33:27,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:33:27,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=736013.3333333334, ans=0.125 2023-10-02 03:33:31,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:33:34,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:33:37,304 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.44 vs. limit=6.0 2023-10-02 03:33:37,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 03:33:40,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 03:33:40,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:33:43,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 03:33:43,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:33:43,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:33:44,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:33:45,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:33:50,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 03:33:53,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:33:54,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:33:56,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 03:33:56,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:33:57,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 03:33:57,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=736146.6666666666, ans=0.0 2023-10-02 03:34:00,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:34:00,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:34:01,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:34:03,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 03:34:03,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:03,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 03:34:05,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:34:07,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 03:34:07,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:34:07,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:34:07,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:34:09,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 03:34:09,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:34:09,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 03:34:10,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=736213.3333333334, ans=0.0 2023-10-02 03:34:11,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:34:11,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:34:11,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 03:34:12,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 03:34:14,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=736213.3333333334, ans=0.125 2023-10-02 03:34:16,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:34:18,534 INFO [train.py:1046] (3/4) Epoch 21, batch 4200, loss[loss=0.181, simple_loss=0.2642, pruned_loss=0.0489, over 24671.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2516, pruned_loss=0.05004, over 4719867.32 frames. ], batch size: 73, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:34:18,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 03:34:20,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:34:22,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:34:24,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:34:24,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:34:24,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:34:27,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 03:34:28,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=736280.0, ans=0.1 2023-10-02 03:34:29,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 03:34:31,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:32,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:34:35,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:34:39,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:34:40,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:34:40,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:42,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 03:34:42,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:34:43,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:44,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:34:44,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:34:46,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:34:47,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 03:34:47,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:34:52,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 03:34:52,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:34:54,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:34:55,340 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.914e+02 2.073e+02 2.283e+02 3.336e+02, threshold=4.146e+02, percent-clipped=0.0 2023-10-02 03:34:57,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:34:58,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:34:58,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 03:34:58,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:34:59,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:35:02,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:35:04,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:35:13,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:35:14,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 03:35:17,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:35:22,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=736546.6666666666, ans=0.125 2023-10-02 03:35:23,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:35:23,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:35:25,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 03:35:30,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 03:35:33,678 INFO [train.py:1046] (3/4) Epoch 21, batch 4250, loss[loss=0.1915, simple_loss=0.2729, pruned_loss=0.05504, over 24560.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2503, pruned_loss=0.04956, over 4722615.29 frames. ], batch size: 71, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:35:35,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:35:35,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 03:35:39,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:35:44,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:35:44,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 03:35:44,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:35:49,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:35:53,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:35:56,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:35:56,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:35:59,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:35:59,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:36:00,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:36:02,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:36:03,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:36:05,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:36:05,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:36:06,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 03:36:08,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=736746.6666666666, ans=0.0 2023-10-02 03:36:09,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 03:36:09,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:36:09,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:36:11,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:36:11,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:36:11,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:36:11,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:36:16,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 03:36:17,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:36:21,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:36:21,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=736813.3333333334, ans=0.125 2023-10-02 03:36:23,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:36:25,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 03:36:25,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:36:25,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 03:36:26,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:36:27,427 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.32 vs. limit=15.0 2023-10-02 03:36:28,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:36:29,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:36:29,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:36:32,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 03:36:33,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:36:35,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:36:39,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:36:42,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:36:44,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:36:44,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:36:45,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:36:48,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:36:49,218 INFO [train.py:1046] (3/4) Epoch 21, batch 4300, loss[loss=0.1816, simple_loss=0.2517, pruned_loss=0.0557, over 23420.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2505, pruned_loss=0.0495, over 4716737.03 frames. ], batch size: 285, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:36:49,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:36:49,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 03:36:51,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:36:54,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:36:56,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:36:59,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=736946.6666666666, ans=0.1 2023-10-02 03:37:00,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:37:09,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:37:09,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 03:37:09,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:37:12,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:37:12,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:37:12,151 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 03:37:15,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:37:16,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:37:20,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 03:37:20,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:37:21,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 03:37:22,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:37:24,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:37:25,613 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.828e+02 2.109e+02 2.471e+02 4.007e+02, threshold=4.219e+02, percent-clipped=0.0 2023-10-02 03:37:26,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=737080.0, ans=0.2 2023-10-02 03:37:27,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:37:27,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:37:29,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:37:30,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:37:30,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:37:31,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 03:37:32,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 03:37:34,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:37:37,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:37:37,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:37:37,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:37:37,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:37:37,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 03:37:37,844 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=737146.6666666666, ans=0.125 2023-10-02 03:37:38,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 03:37:38,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 03:37:40,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:37:40,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 03:37:41,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 03:37:44,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:37:46,145 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 03:37:46,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:37:49,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:37:49,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:37:51,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 03:37:52,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:37:52,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:37:53,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:37:54,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:37:55,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:37:56,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:37:59,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:38:00,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:38:00,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:38:02,696 INFO [train.py:1046] (3/4) Epoch 21, batch 4350, loss[loss=0.1521, simple_loss=0.233, pruned_loss=0.03561, over 24486.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2501, pruned_loss=0.04899, over 4724179.18 frames. ], batch size: 66, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:38:05,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 03:38:05,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 03:38:07,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=737280.0, ans=0.1 2023-10-02 03:38:09,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:38:10,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=737280.0, ans=0.125 2023-10-02 03:38:13,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:38:15,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=737280.0, ans=0.125 2023-10-02 03:38:16,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:38:16,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:38:21,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:38:23,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=737346.6666666666, ans=0.2 2023-10-02 03:38:25,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:38:27,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:38:28,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:38:31,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:38:33,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:38:33,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:38:37,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 03:38:38,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:38:38,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:38:44,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:38:47,175 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.93 vs. limit=15.0 2023-10-02 03:38:47,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 03:38:49,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.04 vs. limit=12.0 2023-10-02 03:38:51,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:38:52,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:38:57,165 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 03:38:59,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:38:59,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:39:01,196 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 03:39:01,261 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 03:39:01,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:39:02,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:02,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:39:04,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:39:05,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:39:05,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:39:08,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 03:39:08,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:08,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:39:08,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:10,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 03:39:11,037 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.00 vs. limit=22.5 2023-10-02 03:39:11,421 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 03:39:11,426 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 03:39:11,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 03:39:11,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=737546.6666666666, ans=0.125 2023-10-02 03:39:12,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:39:14,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:39:14,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:14,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:39:17,364 INFO [train.py:1046] (3/4) Epoch 21, batch 4400, loss[loss=0.1647, simple_loss=0.2408, pruned_loss=0.04433, over 23661.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2518, pruned_loss=0.04998, over 4727994.32 frames. ], batch size: 135, lr: 4.86e-03, grad_scale: 16.0 2023-10-02 03:39:17,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 03:39:18,716 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 03:39:18,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:23,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:39:23,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:25,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:39:26,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 03:39:26,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 03:39:27,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 03:39:27,989 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 03:39:28,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:39:29,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:39:29,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=737613.3333333334, ans=0.1 2023-10-02 03:39:30,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 03:39:32,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:32,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=737680.0, ans=0.125 2023-10-02 03:39:34,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:34,154 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 03:39:38,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:38,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 03:39:38,310 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 03:39:40,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=737680.0, ans=0.1 2023-10-02 03:39:41,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 03:39:42,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 03:39:42,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 03:39:42,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:42,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:39:43,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:39:43,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:39:45,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 03:39:45,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 03:39:47,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:48,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:39:48,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:39:51,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:51,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:39:52,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 03:39:53,362 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 03:39:54,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=737746.6666666666, ans=0.0 2023-10-02 03:39:55,152 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.905e+02 2.159e+02 2.393e+02 3.383e+02, threshold=4.317e+02, percent-clipped=0.0 2023-10-02 03:39:55,989 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.88 vs. limit=10.0 2023-10-02 03:39:56,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:39:56,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=737746.6666666666, ans=0.125 2023-10-02 03:40:00,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=737813.3333333334, ans=0.0 2023-10-02 03:40:04,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:40:05,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 03:40:09,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:40:11,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:40:15,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:40:15,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 03:40:16,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:40:16,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:40:16,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:40:16,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:40:20,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 03:40:22,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.27 vs. limit=22.5 2023-10-02 03:40:23,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 03:40:26,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 03:40:26,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:40:26,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 03:40:26,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:40:29,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:40:30,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 03:40:30,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=737946.6666666666, ans=0.0 2023-10-02 03:40:31,977 INFO [train.py:1046] (3/4) Epoch 21, batch 4450, loss[loss=0.1779, simple_loss=0.2598, pruned_loss=0.04803, over 24046.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.252, pruned_loss=0.0501, over 4710010.90 frames. ], batch size: 80, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:40:33,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:40:36,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:40:36,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:40:41,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:40:41,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:40:41,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=737946.6666666666, ans=0.125 2023-10-02 03:40:43,319 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.47 vs. limit=10.0 2023-10-02 03:40:45,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:40:47,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:40:49,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:40:49,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:40:52,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 03:40:52,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:40:53,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:40:54,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:40:54,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:40:56,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:41:01,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:01,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:03,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:41:04,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:41:06,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:41:09,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=738080.0, ans=0.125 2023-10-02 03:41:10,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 03:41:11,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 03:41:13,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 03:41:13,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:41:13,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=738080.0, ans=0.125 2023-10-02 03:41:13,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=738080.0, ans=0.125 2023-10-02 03:41:14,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:41:15,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 03:41:17,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=738146.6666666666, ans=0.0 2023-10-02 03:41:18,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:41:21,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:23,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 03:41:25,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:41:25,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:41:25,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:41:25,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:41:26,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:41:31,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 03:41:31,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 03:41:32,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:41:34,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:41:34,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=738213.3333333334, ans=0.125 2023-10-02 03:41:36,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:41:37,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:41:37,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:41:40,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:41:41,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 03:41:43,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:41:46,100 INFO [train.py:1046] (3/4) Epoch 21, batch 4500, loss[loss=0.1562, simple_loss=0.2374, pruned_loss=0.03746, over 24502.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.252, pruned_loss=0.0502, over 4715214.08 frames. ], batch size: 63, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:41:46,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=738280.0, ans=0.1 2023-10-02 03:41:47,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:41:48,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 03:41:48,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 03:41:51,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:41:56,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:41:56,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=738280.0, ans=0.07 2023-10-02 03:41:57,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:41:59,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:42:00,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:42:01,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:42:01,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:42:01,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=738346.6666666666, ans=0.125 2023-10-02 03:42:07,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=738346.6666666666, ans=0.1 2023-10-02 03:42:11,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:42:13,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:42:15,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:42:17,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:42:17,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:42:21,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:42:24,906 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.913e+02 2.114e+02 2.419e+02 4.024e+02, threshold=4.228e+02, percent-clipped=0.0 2023-10-02 03:42:25,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:42:30,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:42:33,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:42:33,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=738480.0, ans=0.125 2023-10-02 03:42:34,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 03:42:35,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:42:35,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:42:39,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:42:39,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:42:40,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:42:40,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 03:42:40,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 03:42:40,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:42:44,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:42:44,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:42:47,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:42:47,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=738546.6666666666, ans=0.125 2023-10-02 03:42:48,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:42:49,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:42:51,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 03:42:53,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 03:42:53,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 03:42:57,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 03:42:59,676 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.25 vs. limit=10.0 2023-10-02 03:43:00,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 03:43:01,960 INFO [train.py:1046] (3/4) Epoch 21, batch 4550, loss[loss=0.1969, simple_loss=0.2541, pruned_loss=0.06989, over 23818.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.25, pruned_loss=0.04986, over 4705627.89 frames. ], batch size: 179, lr: 4.86e-03, grad_scale: 8.0 2023-10-02 03:43:02,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:43:06,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:43:06,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:43:06,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=738613.3333333334, ans=0.1 2023-10-02 03:43:07,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=738613.3333333334, ans=0.1 2023-10-02 03:43:08,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:43:13,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:43:15,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:43:16,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:43:16,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:43:16,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:20,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:43:20,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:43:24,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:43:26,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 03:43:26,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 03:43:28,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:43:30,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 03:43:31,059 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.09 vs. limit=15.0 2023-10-02 03:43:34,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 03:43:35,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:43:36,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 03:43:36,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=738746.6666666666, ans=0.125 2023-10-02 03:43:37,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:43:40,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:40,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:40,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:43:43,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 03:43:45,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:43:46,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:46,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=738813.3333333334, ans=0.125 2023-10-02 03:43:48,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:43:49,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:43:52,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 03:43:52,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 03:43:52,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:43:53,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 03:43:55,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 03:43:56,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:43:57,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:43:57,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:43:58,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:43:58,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:44:00,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:44:02,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 03:44:03,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:44:03,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 03:44:03,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 03:44:03,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:44:04,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 03:44:07,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:44:07,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:44:10,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:44:11,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:44:11,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 03:44:13,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:44:15,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:44:16,621 INFO [train.py:1046] (3/4) Epoch 21, batch 4600, loss[loss=0.1756, simple_loss=0.2544, pruned_loss=0.04837, over 24441.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2491, pruned_loss=0.04963, over 4711165.17 frames. ], batch size: 63, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:44:18,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:18,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:44:18,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=738946.6666666666, ans=0.0 2023-10-02 03:44:20,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:44:20,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:44:21,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=738946.6666666666, ans=0.125 2023-10-02 03:44:22,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:44:23,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 03:44:26,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:44:30,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:44:30,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:44:33,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:35,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=739013.3333333334, ans=0.125 2023-10-02 03:44:39,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 03:44:40,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:42,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:46,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:44:46,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:44:50,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.06 vs. limit=10.0 2023-10-02 03:44:51,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 03:44:51,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:44:52,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:44:54,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=739080.0, ans=0.125 2023-10-02 03:44:55,817 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.844e+02 1.997e+02 2.333e+02 3.874e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-02 03:44:56,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=739080.0, ans=0.125 2023-10-02 03:44:56,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=739080.0, ans=0.125 2023-10-02 03:44:58,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:44:59,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:45:00,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:45:06,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 03:45:07,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 03:45:09,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=739146.6666666666, ans=0.0 2023-10-02 03:45:10,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:12,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:45:12,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=739146.6666666666, ans=0.125 2023-10-02 03:45:14,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=739146.6666666666, ans=0.125 2023-10-02 03:45:14,280 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.18 vs. limit=15.0 2023-10-02 03:45:15,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:15,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 03:45:15,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:45:16,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 03:45:16,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:16,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:45:18,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:19,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:45:19,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:45:21,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 03:45:21,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 03:45:22,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 03:45:22,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:45:22,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:45:23,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:45:24,246 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:45:25,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:45:26,258 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.25 vs. limit=6.0 2023-10-02 03:45:32,098 INFO [train.py:1046] (3/4) Epoch 21, batch 4650, loss[loss=0.1823, simple_loss=0.2689, pruned_loss=0.04781, over 24468.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.249, pruned_loss=0.04924, over 4715675.79 frames. ], batch size: 69, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:45:35,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:45:38,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:45:38,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:45:39,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:45:39,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:45:39,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:45:39,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:45:43,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 03:45:46,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:45:47,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 03:45:47,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:45:49,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 03:45:49,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:45:51,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 03:45:51,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 03:45:51,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:45:51,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:45:55,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:45:56,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:45:56,847 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 03:46:00,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:46:00,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 03:46:03,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:46:03,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:46:03,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 03:46:06,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:46:09,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:46:12,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:46:15,629 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.22 vs. limit=15.0 2023-10-02 03:46:17,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:46:20,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:46:22,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:46:22,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=739480.0, ans=0.2 2023-10-02 03:46:23,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:46:26,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 03:46:26,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 03:46:28,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 03:46:28,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 03:46:29,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:46:36,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:46:36,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:46:38,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 03:46:38,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:46:39,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:46:39,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:46:41,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:46:43,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:46:43,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:46:45,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:46:46,499 INFO [train.py:1046] (3/4) Epoch 21, batch 4700, loss[loss=0.1741, simple_loss=0.2551, pruned_loss=0.04659, over 24033.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2492, pruned_loss=0.04922, over 4716339.21 frames. ], batch size: 86, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:46:46,894 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:46:48,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:46:49,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:46:49,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:46:49,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 03:46:50,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 03:46:52,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 03:46:58,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:46:59,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:46:59,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:47:01,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:47:02,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 03:47:05,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=739680.0, ans=0.04949747468305833 2023-10-02 03:47:08,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 03:47:08,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 03:47:12,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:47:13,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:47:13,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:47:16,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:47:22,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:47:22,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 03:47:25,317 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.835e+02 1.990e+02 2.333e+02 4.154e+02, threshold=3.980e+02, percent-clipped=1.0 2023-10-02 03:47:25,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:47:31,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 03:47:31,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:47:33,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.87 vs. limit=15.0 2023-10-02 03:47:33,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:38,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 03:47:40,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:47:45,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:47:46,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 03:47:47,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:47,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:47:49,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:47:50,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:47:50,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 03:47:52,157 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 03:47:53,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:47:54,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:54,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:54,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 03:47:56,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:47:59,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=739946.6666666666, ans=0.1 2023-10-02 03:48:00,882 INFO [train.py:1046] (3/4) Epoch 21, batch 4750, loss[loss=0.1497, simple_loss=0.2272, pruned_loss=0.03612, over 24335.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2498, pruned_loss=0.0491, over 4726639.56 frames. ], batch size: 61, lr: 4.85e-03, grad_scale: 8.0 2023-10-02 03:48:00,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 03:48:02,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:48:03,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:04,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=739946.6666666666, ans=0.1 2023-10-02 03:48:07,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:07,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:48:08,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 03:48:08,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:48:13,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 03:48:16,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:48:16,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:48:17,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:48:21,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 03:48:22,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=740013.3333333334, ans=0.125 2023-10-02 03:48:25,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=740013.3333333334, ans=0.125 2023-10-02 03:48:26,450 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:48:27,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:48:28,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 03:48:28,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:48:29,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=740080.0, ans=0.125 2023-10-02 03:48:31,259 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.78 vs. limit=5.0 2023-10-02 03:48:33,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:48:33,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:48:33,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:35,070 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 03:48:35,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 03:48:35,678 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.53 vs. limit=15.0 2023-10-02 03:48:38,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=740080.0, ans=0.2 2023-10-02 03:48:41,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 03:48:43,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:48:43,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=740080.0, ans=0.1 2023-10-02 03:48:45,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:48:47,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:48:47,894 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 03:48:47,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:48:50,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:48:52,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:48:53,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 03:48:53,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 03:48:54,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:48:54,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:48:56,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:48:56,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:48:56,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 03:48:59,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 03:49:01,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:04,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:49:05,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 03:49:05,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:49:05,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:07,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 03:49:08,318 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.22 vs. limit=6.0 2023-10-02 03:49:08,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:10,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 03:49:12,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:49:12,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 03:49:12,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=740213.3333333334, ans=0.0 2023-10-02 03:49:13,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 03:49:15,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 03:49:17,015 INFO [train.py:1046] (3/4) Epoch 21, batch 4800, loss[loss=0.1936, simple_loss=0.2589, pruned_loss=0.06417, over 23417.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2513, pruned_loss=0.04966, over 4734780.49 frames. ], batch size: 285, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:49:18,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:49:18,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:49:19,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 03:49:21,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=740280.0, ans=0.125 2023-10-02 03:49:24,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:25,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:29,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 03:49:31,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:49:31,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:31,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 03:49:31,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=740346.6666666666, ans=0.2 2023-10-02 03:49:32,380 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=13.92 vs. limit=15.0 2023-10-02 03:49:34,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:49:34,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:49:34,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:49:38,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:49:39,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:39,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:49:41,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:41,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 03:49:41,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:43,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:49:45,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:49:47,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=740413.3333333334, ans=0.125 2023-10-02 03:49:49,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:50,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:49:50,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:49:51,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 03:49:53,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:55,819 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.875e+02 2.035e+02 2.300e+02 3.149e+02, threshold=4.071e+02, percent-clipped=0.0 2023-10-02 03:49:55,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 03:49:55,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 03:49:57,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:49:57,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:49:57,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:49:57,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:49:57,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:49:58,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:50:00,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:50:03,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:50:07,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:08,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:15,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 03:50:15,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:50:15,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:15,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:50:16,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:50:17,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=740546.6666666666, ans=0.125 2023-10-02 03:50:20,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:50:21,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:50:21,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:21,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:50:22,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:50:23,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=740546.6666666666, ans=0.0 2023-10-02 03:50:23,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=740546.6666666666, ans=0.1 2023-10-02 03:50:23,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.78 vs. limit=15.0 2023-10-02 03:50:24,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:50:27,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:27,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:27,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:50:28,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 03:50:31,113 INFO [train.py:1046] (3/4) Epoch 21, batch 4850, loss[loss=0.187, simple_loss=0.2697, pruned_loss=0.05217, over 24431.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2516, pruned_loss=0.04946, over 4738832.62 frames. ], batch size: 77, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:50:31,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 03:50:31,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:50:31,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:50:31,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:50:31,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:34,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:50:37,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=740613.3333333334, ans=0.125 2023-10-02 03:50:43,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 03:50:45,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:49,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:50:49,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 03:50:50,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:50:54,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:50:55,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:50:56,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:50:56,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 03:50:58,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=740680.0, ans=0.125 2023-10-02 03:50:59,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=740746.6666666666, ans=0.1 2023-10-02 03:51:01,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:51:02,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:51:02,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 03:51:03,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 03:51:03,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 03:51:07,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:51:07,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:10,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:10,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 03:51:10,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 03:51:11,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:51:19,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:51:21,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 03:51:21,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:51:21,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:51:22,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:51:22,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=740813.3333333334, ans=0.125 2023-10-02 03:51:25,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 03:51:25,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:27,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 03:51:27,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:51:28,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:51:28,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 03:51:30,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=740880.0, ans=0.125 2023-10-02 03:51:38,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:51:43,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:51:43,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:51:46,621 INFO [train.py:1046] (3/4) Epoch 21, batch 4900, loss[loss=0.1624, simple_loss=0.2459, pruned_loss=0.03947, over 24481.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2513, pruned_loss=0.04948, over 4736828.33 frames. ], batch size: 63, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:51:48,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 03:51:48,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:51:52,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:51:52,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=740946.6666666666, ans=0.1 2023-10-02 03:51:53,229 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.99 vs. limit=10.0 2023-10-02 03:51:53,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:51:53,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:51:57,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 03:52:01,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 03:52:04,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 03:52:05,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 03:52:05,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:52:05,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:52:05,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:52:05,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:52:05,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 03:52:07,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 03:52:12,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 03:52:14,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:52:15,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 03:52:17,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 03:52:18,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:52:20,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:52:20,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:52:20,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 03:52:22,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:52:23,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:52:23,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 03:52:23,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 03:52:25,621 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.942e+02 2.183e+02 2.601e+02 5.042e+02, threshold=4.365e+02, percent-clipped=7.0 2023-10-02 03:52:27,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 03:52:29,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:52:31,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:52:31,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:52:33,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:52:33,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 03:52:33,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:52:33,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 03:52:36,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:52:37,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 03:52:39,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:52:42,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 03:52:44,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:52:44,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 03:52:44,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=741146.6666666666, ans=0.5 2023-10-02 03:52:45,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 03:52:51,383 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.93 vs. limit=15.0 2023-10-02 03:52:53,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:52:54,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:52:56,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 03:52:56,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:52:56,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:52:57,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:53:01,704 INFO [train.py:1046] (3/4) Epoch 21, batch 4950, loss[loss=0.1687, simple_loss=0.2431, pruned_loss=0.04714, over 19061.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2498, pruned_loss=0.04943, over 4717430.48 frames. ], batch size: 42, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:53:01,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:53:01,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:53:03,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:53:03,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 03:53:05,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:53:07,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:53:07,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 03:53:11,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 03:53:11,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 03:53:11,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:53:12,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 03:53:12,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:12,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:53:14,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 03:53:14,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:15,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=741280.0, ans=0.125 2023-10-02 03:53:16,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:53:17,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:53:19,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:53:20,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:53:22,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:22,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:53:25,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 03:53:29,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:30,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:53:32,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:33,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:35,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:53:35,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 03:53:35,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 03:53:38,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:38,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=741413.3333333334, ans=0.125 2023-10-02 03:53:39,235 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.98 vs. limit=22.5 2023-10-02 03:53:39,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=741413.3333333334, ans=0.125 2023-10-02 03:53:41,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:53:41,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:53:43,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:53:43,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:53:44,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 03:53:45,723 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.64 vs. limit=15.0 2023-10-02 03:53:47,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:53:49,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:53:51,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:53:52,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:53:53,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:53:54,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 03:53:54,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:53:55,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 03:53:58,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:54:00,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:54:00,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:54:00,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:54:00,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 03:54:02,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:54:03,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:54:05,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 03:54:05,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:54:06,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 03:54:11,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:54:13,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=741546.6666666666, ans=0.125 2023-10-02 03:54:15,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 03:54:15,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 03:54:18,270 INFO [train.py:1046] (3/4) Epoch 21, batch 5000, loss[loss=0.1599, simple_loss=0.2158, pruned_loss=0.05204, over 19129.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2491, pruned_loss=0.04897, over 4707285.33 frames. ], batch size: 388, lr: 4.85e-03, grad_scale: 16.0 2023-10-02 03:54:22,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:54:22,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:54:23,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 03:54:25,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 03:54:26,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:54:29,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 03:54:29,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 03:54:30,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 03:54:30,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 03:54:31,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=741680.0, ans=0.0 2023-10-02 03:54:32,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:54:32,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:54:33,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 03:54:33,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:54:33,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:54:36,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 03:54:38,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 03:54:38,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:54:38,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 03:54:38,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:54:38,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=741680.0, ans=0.125 2023-10-02 03:54:39,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:54:40,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:54:40,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 03:54:40,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 03:54:43,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 03:54:43,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:54:45,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:54:45,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 03:54:45,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:54:47,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:54:47,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=741746.6666666666, ans=0.1 2023-10-02 03:54:48,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:54:50,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 03:54:51,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 03:54:53,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:54:53,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:54:55,823 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.891e+02 2.105e+02 2.416e+02 3.579e+02, threshold=4.211e+02, percent-clipped=0.0 2023-10-02 03:54:57,368 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 03:54:57,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=741746.6666666666, ans=0.0 2023-10-02 03:55:00,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 03:55:01,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:55:01,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:04,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 03:55:05,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:55:05,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:55:05,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:55:08,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 03:55:08,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:55:10,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 03:55:11,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:55:17,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 03:55:21,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:25,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=741880.0, ans=0.1 2023-10-02 03:55:30,914 INFO [train.py:1046] (3/4) Epoch 21, batch 5050, loss[loss=0.1816, simple_loss=0.2519, pruned_loss=0.05571, over 23745.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2495, pruned_loss=0.04869, over 4715591.84 frames. ], batch size: 212, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 03:55:31,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:55:32,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:32,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 03:55:32,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:55:32,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 03:55:33,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 03:55:33,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:37,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=741946.6666666666, ans=0.1 2023-10-02 03:55:38,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:55:38,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 03:55:40,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 03:55:43,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:55:44,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:55:44,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 03:55:45,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:55:47,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:55:48,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 03:55:48,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 03:55:50,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:55:59,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 03:55:59,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=742080.0, ans=0.1 2023-10-02 03:56:01,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 03:56:01,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:56:02,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 03:56:03,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:56:03,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:56:03,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:56:04,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:56:04,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 03:56:05,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 03:56:06,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:56:09,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:56:12,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:56:12,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 03:56:15,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:56:15,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=742146.6666666666, ans=0.2 2023-10-02 03:56:18,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 03:56:19,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:56:19,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 03:56:19,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:56:19,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:56:21,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:56:25,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:56:25,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:25,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:56:25,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:56:27,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 03:56:28,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 03:56:29,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 03:56:33,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:56:33,885 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 03:56:33,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:56:34,796 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.23 vs. limit=22.5 2023-10-02 03:56:35,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:56:35,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:35,300 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 03:56:36,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:56:36,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 03:56:36,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:39,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=742213.3333333334, ans=0.125 2023-10-02 03:56:40,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:56:42,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:56:42,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 03:56:44,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 03:56:44,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=742280.0, ans=0.125 2023-10-02 03:56:45,287 INFO [train.py:1046] (3/4) Epoch 21, batch 5100, loss[loss=0.1808, simple_loss=0.254, pruned_loss=0.05386, over 23664.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2508, pruned_loss=0.04919, over 4719735.90 frames. ], batch size: 232, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 03:56:46,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:56:46,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:56:46,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 03:56:49,671 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 03:56:52,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 03:56:54,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.37 vs. limit=22.5 2023-10-02 03:56:54,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 03:56:54,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 03:56:54,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:56:57,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:56:59,470 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:57:00,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:57:00,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 03:57:01,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 03:57:04,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 03:57:06,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 03:57:08,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:57:12,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 03:57:13,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:57:14,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:57:14,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 03:57:16,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:16,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=742413.3333333334, ans=0.07 2023-10-02 03:57:19,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:19,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 03:57:20,599 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 03:57:20,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:21,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 03:57:21,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 03:57:23,361 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.859e+02 2.079e+02 2.348e+02 3.705e+02, threshold=4.159e+02, percent-clipped=0.0 2023-10-02 03:57:25,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:57:35,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:57:38,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 03:57:38,326 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 03:57:38,334 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 03:57:39,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 03:57:39,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:57:41,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 03:57:45,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 03:57:47,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 03:57:48,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 03:57:51,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 03:57:52,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 03:57:54,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 03:57:58,596 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.48 vs. limit=6.0 2023-10-02 03:57:59,334 INFO [train.py:1046] (3/4) Epoch 21, batch 5150, loss[loss=0.1624, simple_loss=0.2454, pruned_loss=0.0397, over 24405.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2517, pruned_loss=0.04977, over 4725885.38 frames. ], batch size: 63, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 03:57:59,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:57:59,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:57:59,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 03:58:00,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:58:02,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 03:58:03,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 03:58:04,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 03:58:04,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 03:58:04,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 03:58:04,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 03:58:06,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 03:58:06,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:58:06,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 03:58:07,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:58:09,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:58:11,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=742613.3333333334, ans=0.125 2023-10-02 03:58:14,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 03:58:14,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 03:58:16,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:58:16,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 03:58:19,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 03:58:19,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:58:19,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:58:19,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 03:58:19,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 03:58:19,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 03:58:22,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 03:58:22,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:58:25,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 03:58:27,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 03:58:27,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 03:58:33,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 03:58:33,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 03:58:38,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:58:46,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:58:46,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:58:49,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:58:49,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:58:51,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 03:58:57,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 03:58:58,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 03:58:58,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 03:59:00,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:59:00,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=742880.0, ans=0.1 2023-10-02 03:59:00,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=742880.0, ans=0.1 2023-10-02 03:59:01,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:59:03,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 03:59:08,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:59:09,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 03:59:10,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 03:59:10,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 03:59:11,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 03:59:11,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 03:59:11,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 03:59:13,709 INFO [train.py:1046] (3/4) Epoch 21, batch 5200, loss[loss=0.1569, simple_loss=0.2356, pruned_loss=0.03905, over 24628.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2521, pruned_loss=0.05004, over 4728374.11 frames. ], batch size: 60, lr: 4.84e-03, grad_scale: 32.0 2023-10-02 03:59:13,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 03:59:15,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=742946.6666666666, ans=0.1 2023-10-02 03:59:16,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 03:59:18,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 03:59:20,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:59:22,289 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 03:59:23,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 03:59:25,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 03:59:26,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:59:28,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:59:29,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 03:59:29,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:59:30,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 03:59:35,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 03:59:35,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:59:38,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 03:59:40,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 03:59:41,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 03:59:42,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 03:59:42,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 03:59:44,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 03:59:46,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 03:59:46,110 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 03:59:46,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 03:59:47,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 03:59:47,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 03:59:48,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 03:59:48,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 03:59:51,336 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.817e+02 2.050e+02 2.419e+02 3.713e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-02 03:59:51,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 03:59:51,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=743080.0, ans=0.0 2023-10-02 03:59:53,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 03:59:53,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 03:59:55,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 04:00:00,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 04:00:00,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:00:07,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:00:07,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:00:08,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 04:00:08,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:00:09,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:00:09,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:09,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:00:12,255 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=6.82 vs. limit=15.0 2023-10-02 04:00:12,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:00:14,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:00:15,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=743213.3333333334, ans=0.1 2023-10-02 04:00:17,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=743213.3333333334, ans=0.125 2023-10-02 04:00:18,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:00:20,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:00:20,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:24,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:00:24,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 04:00:25,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:00:25,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:00:27,639 INFO [train.py:1046] (3/4) Epoch 21, batch 5250, loss[loss=0.1604, simple_loss=0.2329, pruned_loss=0.0439, over 23523.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2513, pruned_loss=0.0499, over 4721767.28 frames. ], batch size: 134, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 04:00:27,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:00:29,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:00:29,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:00:31,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:00:34,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:00:35,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:00:36,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:00:41,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:00:42,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:00:44,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:00:47,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:00:48,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 04:00:48,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:00:50,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:01:15,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=743480.0, ans=0.125 2023-10-02 04:01:17,777 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.13 vs. limit=15.0 2023-10-02 04:01:37,138 INFO [train.py:1046] (3/4) Epoch 21, batch 5300, loss[loss=0.1762, simple_loss=0.2517, pruned_loss=0.05039, over 23302.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2495, pruned_loss=0.04947, over 4696461.72 frames. ], batch size: 119, lr: 4.84e-03, grad_scale: 16.0 2023-10-02 04:01:37,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.20 vs. limit=15.0 2023-10-02 04:01:51,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:01:51,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 04:01:51,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 04:01:51,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:01:51,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:51,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:51,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:51,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:01:51,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:01:51,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:01:51,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:01:52,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:01:52,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 04:01:52,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 04:01:52,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 04:01:52,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:01:52,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 04:01:52,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 04:01:52,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:53,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:01:53,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:01:53,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:01:53,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:01:53,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:01:53,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:01:53,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:54,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:01:54,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:01:54,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:01:54,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:54,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:01:54,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 04:01:54,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:01:54,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:01:55,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 04:01:55,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 04:01:55,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:01:55,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:01:55,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 04:01:55,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 04:01:55,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:01:55,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:01:55,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:01:55,967 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 04:01:56,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 04:01:56,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:01:56,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:01:56,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 04:01:56,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 04:01:56,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 04:01:57,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:02:03,124 INFO [train.py:1046] (3/4) Epoch 22, batch 0, loss[loss=0.1859, simple_loss=0.2635, pruned_loss=0.05411, over 23233.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2635, pruned_loss=0.05411, over 23233.00 frames. ], batch size: 93, lr: 4.73e-03, grad_scale: 32.0 2023-10-02 04:02:03,124 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 04:02:16,021 INFO [train.py:1078] (3/4) Epoch 22, validation: loss=0.3002, simple_loss=0.2661, pruned_loss=0.1671, over 1125622.00 frames. 2023-10-02 04:02:16,022 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 04:02:17,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 04:02:19,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:02:20,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:02:20,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=743693.3333333334, ans=0.125 2023-10-02 04:02:25,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:02:25,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:02:25,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=743693.3333333334, ans=0.2 2023-10-02 04:02:26,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:26,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 04:02:27,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 04:02:31,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:31,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:34,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:02:34,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:02:35,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:02:35,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:02:38,931 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 2.142e+02 2.497e+02 3.157e+02 5.918e+02, threshold=4.995e+02, percent-clipped=12.0 2023-10-02 04:02:39,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 04:02:40,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:02:48,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:02:48,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:02:48,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=743826.6666666666, ans=0.125 2023-10-02 04:02:49,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 04:02:55,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:02:55,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:02:58,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:03:01,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:03:05,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=743893.3333333334, ans=0.2 2023-10-02 04:03:06,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:03:06,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=743893.3333333334, ans=0.125 2023-10-02 04:03:10,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 04:03:14,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 04:03:15,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:03:15,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:03:15,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:03:16,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:03:18,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 04:03:19,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:03:22,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:03:24,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=743960.0, ans=0.125 2023-10-02 04:03:27,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:03:29,132 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 04:03:30,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:03:31,148 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.95 vs. limit=15.0 2023-10-02 04:03:31,755 INFO [train.py:1046] (3/4) Epoch 22, batch 50, loss[loss=0.1777, simple_loss=0.2509, pruned_loss=0.05229, over 23466.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.25, pruned_loss=0.04831, over 1063141.13 frames. ], batch size: 134, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:03:35,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:03:36,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:03:36,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 04:03:36,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:03:36,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:03:39,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:03:39,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:03:40,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.40 vs. limit=15.0 2023-10-02 04:03:42,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:03:46,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 04:03:46,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:03:52,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:03:54,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 04:03:55,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 04:03:56,543 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.80 vs. limit=15.0 2023-10-02 04:03:57,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:04:00,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:04:00,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:04:00,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:04:01,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:04:03,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 04:04:03,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:04:09,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:04:09,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=744160.0, ans=0.0 2023-10-02 04:04:12,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:04:12,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:04:12,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 04:04:14,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:04:16,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:04:16,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 04:04:16,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:04:18,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 04:04:18,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.23 vs. limit=22.5 2023-10-02 04:04:25,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:04:25,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:04:26,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:04:30,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:04:30,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:04:31,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 04:04:31,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 04:04:33,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:04:33,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:04:34,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:04:36,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:04:37,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 04:04:37,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 04:04:38,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 04:04:41,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:04:41,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:04:41,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 04:04:41,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 04:04:43,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:04:43,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:04:44,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:04:45,989 INFO [train.py:1046] (3/4) Epoch 22, batch 100, loss[loss=0.1713, simple_loss=0.2476, pruned_loss=0.0475, over 23231.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2523, pruned_loss=0.04946, over 1877898.13 frames. ], batch size: 105, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:04:46,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:04:47,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=744360.0, ans=0.125 2023-10-02 04:04:49,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:04:52,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:04:54,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:04:55,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 04:04:55,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:04:59,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:04:59,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:04:59,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:04:59,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:04:59,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:05:01,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 04:05:01,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=744426.6666666666, ans=0.0 2023-10-02 04:05:03,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:05:03,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=744426.6666666666, ans=0.125 2023-10-02 04:05:03,684 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.34 vs. limit=22.5 2023-10-02 04:05:04,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:05:04,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:04,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:05:08,460 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.848e+02 2.160e+02 2.576e+02 4.696e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-02 04:05:08,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 04:05:08,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:05:10,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:11,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:05:13,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:05:15,325 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.01 vs. limit=15.0 2023-10-02 04:05:16,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=744493.3333333334, ans=0.05 2023-10-02 04:05:17,216 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 04:05:17,241 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 04:05:18,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:05:18,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:05:22,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:05:23,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:05:23,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:23,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=744493.3333333334, ans=0.125 2023-10-02 04:05:25,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=744493.3333333334, ans=0.125 2023-10-02 04:05:29,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:31,019 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 04:05:33,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 04:05:37,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:05:37,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:05:39,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:40,397 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.02 vs. limit=15.0 2023-10-02 04:05:43,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:05:46,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:05:46,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=744626.6666666666, ans=0.1 2023-10-02 04:05:47,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:05:47,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=744626.6666666666, ans=0.125 2023-10-02 04:05:50,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:50,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:53,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:05:53,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:05:53,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:05:54,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 04:05:54,946 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 04:05:54,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:05:56,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:05:56,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:05:56,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:05:56,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 04:05:56,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 04:05:56,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 04:05:58,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:05:59,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:05:59,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:00,985 INFO [train.py:1046] (3/4) Epoch 22, batch 150, loss[loss=0.1762, simple_loss=0.2443, pruned_loss=0.05404, over 23696.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2518, pruned_loss=0.04916, over 2510814.48 frames. ], batch size: 232, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:06:01,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:06:02,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:06:05,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:08,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:06:08,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:06:08,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:11,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:06:11,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:14,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:06:15,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:20,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 04:06:20,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 04:06:20,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 04:06:23,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:06:23,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:06:24,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:06:26,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:06:26,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:06:26,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:26,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:06:27,738 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 04:06:30,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:06:35,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:06:38,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:06:39,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 04:06:43,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:06:44,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:06:45,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:06:46,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:06:48,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:06:50,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:06:50,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:50,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 04:06:54,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:06:55,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:06:55,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:06:55,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:06:59,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:07:00,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 04:07:01,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:07:03,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:07:05,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:07:07,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:07:08,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 04:07:08,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:07:08,508 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 04:07:12,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:07:14,764 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.94 vs. limit=15.0 2023-10-02 04:07:15,456 INFO [train.py:1046] (3/4) Epoch 22, batch 200, loss[loss=0.195, simple_loss=0.2712, pruned_loss=0.05942, over 24356.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2532, pruned_loss=0.0503, over 3001150.70 frames. ], batch size: 77, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:07:15,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:07:15,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:07:20,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 04:07:20,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:07:21,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:07:23,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 04:07:24,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:07:25,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:07:27,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:07:31,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:07:31,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:07:31,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:07:40,071 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.425e+02 1.856e+02 2.062e+02 2.415e+02 5.556e+02, threshold=4.124e+02, percent-clipped=1.0 2023-10-02 04:07:47,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=745160.0, ans=0.04949747468305833 2023-10-02 04:07:47,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=745160.0, ans=0.0 2023-10-02 04:07:50,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:07:51,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:07:51,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:07:53,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:07:54,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:07:54,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:07:54,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:07:55,535 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.52 vs. limit=15.0 2023-10-02 04:07:56,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:07:57,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:07:57,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:07:59,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 04:07:59,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 04:07:59,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:08:05,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:08:10,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:08:15,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=745293.3333333334, ans=0.125 2023-10-02 04:08:16,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:16,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:08:25,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:28,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 04:08:28,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:08:28,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:08:29,832 INFO [train.py:1046] (3/4) Epoch 22, batch 250, loss[loss=0.1729, simple_loss=0.2518, pruned_loss=0.04699, over 23423.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2524, pruned_loss=0.04954, over 3385420.68 frames. ], batch size: 105, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:08:29,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:08:29,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:08:30,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=745360.0, ans=0.04949747468305833 2023-10-02 04:08:31,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 04:08:31,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:08:31,444 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 04:08:33,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:36,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:08:36,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:38,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:08:39,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:08:40,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:08:42,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:08:45,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:08:57,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:08:58,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=745493.3333333334, ans=0.1 2023-10-02 04:09:00,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:09:00,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:09:06,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:09:06,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:09:08,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:09:08,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:09:09,547 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.46 vs. limit=15.0 2023-10-02 04:09:10,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:09:10,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:09:10,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:09:13,078 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.74 vs. limit=22.5 2023-10-02 04:09:15,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:09:18,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 04:09:18,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:09:19,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:09:19,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:09:19,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:09:19,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:09:20,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:09:21,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:09:22,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:09:24,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:09:25,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:09:27,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:09:31,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:09:34,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:09:37,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=745626.6666666666, ans=0.0 2023-10-02 04:09:39,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:09:41,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:09:45,234 INFO [train.py:1046] (3/4) Epoch 22, batch 300, loss[loss=0.1749, simple_loss=0.2675, pruned_loss=0.04119, over 24314.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2506, pruned_loss=0.04918, over 3682145.53 frames. ], batch size: 74, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:09:45,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 04:09:45,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:09:45,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:09:48,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 04:09:48,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:09:50,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:09:50,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 04:09:54,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:09:55,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:09:59,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:09:59,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=745760.0, ans=0.04949747468305833 2023-10-02 04:10:00,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 04:10:01,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:10:03,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:10:03,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 04:10:03,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:10:08,535 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.896e+02 2.196e+02 2.526e+02 3.714e+02, threshold=4.393e+02, percent-clipped=0.0 2023-10-02 04:10:08,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:10:08,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=745760.0, ans=0.0 2023-10-02 04:10:11,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:10:11,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 04:10:15,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 04:10:15,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:18,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:10:20,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:20,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 04:10:20,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:10:21,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=745826.6666666666, ans=0.125 2023-10-02 04:10:22,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:10:24,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:10:24,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:10:27,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 04:10:27,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 04:10:28,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:10:31,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:33,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 04:10:33,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:10:37,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:10:40,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:10:40,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 04:10:45,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:45,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:10:47,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:48,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:10:48,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=745960.0, ans=0.125 2023-10-02 04:10:50,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 04:10:50,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:10:50,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:10:51,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 04:10:52,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:10:52,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:10:54,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:10:56,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:10:56,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:00,120 INFO [train.py:1046] (3/4) Epoch 22, batch 350, loss[loss=0.1566, simple_loss=0.2289, pruned_loss=0.04211, over 23540.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2482, pruned_loss=0.04908, over 3895520.58 frames. ], batch size: 120, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:11:01,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:11:01,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 04:11:05,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:08,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:11:10,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=746026.6666666666, ans=0.1 2023-10-02 04:11:13,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:13,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:16,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 04:11:18,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:11:18,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 04:11:21,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:21,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 04:11:22,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:11:25,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 04:11:28,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:11:30,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:11:31,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:11:32,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.26 vs. limit=15.0 2023-10-02 04:11:32,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:11:32,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:11:32,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:11:32,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:32,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:11:35,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:11:35,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:42,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:11:42,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:11:42,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:11:42,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:45,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=746226.6666666666, ans=0.2 2023-10-02 04:11:48,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 04:11:48,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:11:49,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=746226.6666666666, ans=0.125 2023-10-02 04:11:52,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:11:52,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:11:52,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:11:53,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 04:11:55,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:11:55,151 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 04:11:57,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 04:11:58,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:01,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:12:01,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 04:12:01,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=746293.3333333334, ans=0.0 2023-10-02 04:12:03,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:05,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:12:08,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:09,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:09,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:12:11,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:12:11,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=746293.3333333334, ans=0.125 2023-10-02 04:12:13,738 INFO [train.py:1046] (3/4) Epoch 22, batch 400, loss[loss=0.1779, simple_loss=0.2651, pruned_loss=0.04536, over 24394.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2481, pruned_loss=0.04835, over 4093182.20 frames. ], batch size: 77, lr: 4.72e-03, grad_scale: 32.0 2023-10-02 04:12:13,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:12:15,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:12:17,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 04:12:17,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:17,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.13 vs. limit=22.5 2023-10-02 04:12:18,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:12:18,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:12:20,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:20,285 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:12:22,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:24,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:25,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 04:12:28,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 04:12:28,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:12:30,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 04:12:31,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:34,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:12:34,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:12:34,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 04:12:35,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:12:35,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:12:37,214 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.706e+02 1.907e+02 2.140e+02 3.847e+02, threshold=3.815e+02, percent-clipped=0.0 2023-10-02 04:12:37,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:12:37,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:12:38,762 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 04:12:40,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 04:12:44,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:12:46,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:12:46,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 04:12:48,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=746493.3333333334, ans=0.1 2023-10-02 04:12:49,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 04:12:51,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=746493.3333333334, ans=0.125 2023-10-02 04:12:52,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:12:53,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:12:54,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=746493.3333333334, ans=0.0 2023-10-02 04:12:59,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 04:13:02,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:13:02,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=746560.0, ans=0.1 2023-10-02 04:13:05,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 04:13:06,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:13:08,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:13:08,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 04:13:10,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:13:13,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:13:15,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:13:18,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:13:18,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 04:13:18,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=746626.6666666666, ans=0.125 2023-10-02 04:13:20,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:13:20,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=746626.6666666666, ans=0.2 2023-10-02 04:13:24,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 04:13:25,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:13:25,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:13:27,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 04:13:27,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:13:28,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:13:30,146 INFO [train.py:1046] (3/4) Epoch 22, batch 450, loss[loss=0.1735, simple_loss=0.2623, pruned_loss=0.04234, over 24628.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.249, pruned_loss=0.04857, over 4224675.54 frames. ], batch size: 68, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:13:30,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:13:31,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 04:13:32,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:13:32,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:13:34,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:13:34,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 04:13:34,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:13:36,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:13:38,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:13:45,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:13:47,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:13:49,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 04:13:49,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 04:13:52,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:13:55,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:13:56,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:13:59,761 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:14:02,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:14:03,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:14:04,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 04:14:06,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 04:14:07,354 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.25 vs. limit=22.5 2023-10-02 04:14:08,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 04:14:08,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:14:09,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:14:09,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=746826.6666666666, ans=0.125 2023-10-02 04:14:10,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:14:11,110 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 04:14:11,118 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 04:14:12,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:14:13,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:14:13,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 04:14:18,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:14:18,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:14:19,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 04:14:19,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 04:14:21,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:14:23,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:14:23,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:14:25,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 04:14:29,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:14:30,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 04:14:30,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 04:14:31,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:14:34,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=746960.0, ans=0.0 2023-10-02 04:14:36,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:14:39,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:14:40,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:14:40,772 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 04:14:40,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=746960.0, ans=0.0 2023-10-02 04:14:43,475 INFO [train.py:1046] (3/4) Epoch 22, batch 500, loss[loss=0.1861, simple_loss=0.2614, pruned_loss=0.05544, over 23295.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2503, pruned_loss=0.0495, over 4343301.76 frames. ], batch size: 105, lr: 4.72e-03, grad_scale: 16.0 2023-10-02 04:14:43,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:14:44,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:14:45,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:14:45,013 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 04:14:46,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 04:14:46,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:14:48,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=747026.6666666666, ans=0.125 2023-10-02 04:14:51,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 04:14:54,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 04:14:57,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:14:58,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:14:58,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:15:00,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:09,058 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.808e+02 1.987e+02 2.210e+02 3.214e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 04:15:09,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:15:10,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:15:10,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:15:10,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:15:11,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 04:15:11,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:15:12,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=747160.0, ans=0.1 2023-10-02 04:15:16,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:15:17,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:15:17,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:15:17,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:15:18,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 04:15:21,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=747160.0, ans=0.125 2023-10-02 04:15:23,925 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 04:15:25,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:15:26,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:28,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:28,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:30,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:15:32,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 04:15:35,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:15:35,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=747226.6666666666, ans=0.125 2023-10-02 04:15:37,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:15:40,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:15:41,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=747293.3333333334, ans=0.0 2023-10-02 04:15:44,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:15:50,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:15:51,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 04:15:51,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:15:51,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:15:55,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 04:15:57,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:15:58,492 INFO [train.py:1046] (3/4) Epoch 22, batch 550, loss[loss=0.1748, simple_loss=0.2453, pruned_loss=0.05215, over 23366.00 frames. ], tot_loss[loss=0.175, simple_loss=0.251, pruned_loss=0.04954, over 4428979.36 frames. ], batch size: 93, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:15:58,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:16:01,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 04:16:04,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 04:16:04,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:16:04,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 04:16:04,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:16:05,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:16:05,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:06,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:06,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:16:06,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=747360.0, ans=0.125 2023-10-02 04:16:07,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:16:10,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:16:12,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 04:16:12,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:16:17,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:17,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:17,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=747426.6666666666, ans=0.0 2023-10-02 04:16:19,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:16:20,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:25,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 04:16:25,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 04:16:27,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:16:33,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:16:33,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:16:34,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:16:37,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=747493.3333333334, ans=0.0 2023-10-02 04:16:38,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:38,838 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 04:16:40,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:16:41,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:16:43,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:16:44,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:16:44,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:16:46,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:46,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 04:16:47,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 04:16:49,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:16:49,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:16:49,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:16:49,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:16:51,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:16:53,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:16:56,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:16:56,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:16:58,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 04:16:59,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:17:00,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:17:02,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:17:02,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:17:05,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 04:17:05,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 04:17:05,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=747626.6666666666, ans=0.0 2023-10-02 04:17:10,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 04:17:13,295 INFO [train.py:1046] (3/4) Epoch 22, batch 600, loss[loss=0.1805, simple_loss=0.2467, pruned_loss=0.05713, over 23679.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2511, pruned_loss=0.04966, over 4495762.33 frames. ], batch size: 232, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:17:14,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 04:17:15,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:17:15,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:17:16,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:17:21,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:17:24,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:17:26,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 04:17:28,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:17:31,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:17:32,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:17:35,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 04:17:35,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:17:38,705 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.855e+02 2.187e+02 2.560e+02 3.889e+02, threshold=4.374e+02, percent-clipped=0.0 2023-10-02 04:17:41,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 04:17:44,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:17:44,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:17:46,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:17:50,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:17:50,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:17:51,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:17:58,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:18:00,165 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:18:01,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:18:01,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:18:01,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:18:08,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 04:18:10,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=747893.3333333334, ans=0.5 2023-10-02 04:18:13,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:18:15,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:18:18,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 04:18:19,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:18:19,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=747960.0, ans=0.0 2023-10-02 04:18:22,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 04:18:22,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:18:22,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:18:23,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.00 vs. limit=15.0 2023-10-02 04:18:28,669 INFO [train.py:1046] (3/4) Epoch 22, batch 650, loss[loss=0.1607, simple_loss=0.2082, pruned_loss=0.05656, over 19457.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2502, pruned_loss=0.04967, over 4531146.71 frames. ], batch size: 389, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:18:28,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 04:18:29,583 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.04 vs. limit=22.5 2023-10-02 04:18:30,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:18:30,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=748026.6666666666, ans=0.125 2023-10-02 04:18:31,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=748026.6666666666, ans=0.125 2023-10-02 04:18:32,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:18:34,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:18:35,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:18:37,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 04:18:37,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=748026.6666666666, ans=0.125 2023-10-02 04:18:39,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:18:45,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:18:45,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:18:49,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:18:52,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 04:18:53,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:18:55,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:18:58,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:18:58,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 04:19:01,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:19:01,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:01,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:19:04,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:04,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:19:07,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:19:07,506 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 04:19:07,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:19:07,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:19:11,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.13 vs. limit=15.0 2023-10-02 04:19:11,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:12,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:19:12,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=748226.6666666666, ans=0.125 2023-10-02 04:19:13,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:19:13,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:19:13,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 04:19:16,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:19:16,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:19:18,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:19:18,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:19:19,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:19:20,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 04:19:22,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 04:19:22,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:22,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:19:23,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:19:23,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:19:24,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:19:31,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:31,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:19:33,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:19:34,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:19:34,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:19:35,700 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.55 vs. limit=15.0 2023-10-02 04:19:36,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:19:42,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:19:42,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:19:42,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:19:42,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:19:43,629 INFO [train.py:1046] (3/4) Epoch 22, batch 700, loss[loss=0.165, simple_loss=0.2395, pruned_loss=0.04524, over 24330.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2485, pruned_loss=0.04935, over 4568699.10 frames. ], batch size: 56, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:19:44,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=748360.0, ans=0.125 2023-10-02 04:19:47,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 04:19:47,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 04:19:49,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 04:19:50,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:19:51,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:19:54,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 04:19:58,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:20:02,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:20:03,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:20:06,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:20:06,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:20:08,873 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 1.977e+02 2.354e+02 2.667e+02 4.737e+02, threshold=4.709e+02, percent-clipped=1.0 2023-10-02 04:20:09,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:20:09,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=748426.6666666666, ans=0.125 2023-10-02 04:20:13,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 04:20:13,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:20:13,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 04:20:17,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 04:20:21,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:20:21,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:20:24,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:20:26,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:20:28,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 04:20:31,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:20:31,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:20:33,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 04:20:35,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:20:36,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:20:39,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:20:44,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:20:45,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 04:20:46,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=748626.6666666666, ans=0.0 2023-10-02 04:20:48,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 04:20:48,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 04:20:51,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:20:54,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:20:54,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=748626.6666666666, ans=0.0 2023-10-02 04:20:55,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:20:55,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:20:55,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 04:20:58,670 INFO [train.py:1046] (3/4) Epoch 22, batch 750, loss[loss=0.1853, simple_loss=0.2623, pruned_loss=0.05412, over 23937.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2481, pruned_loss=0.04908, over 4585873.28 frames. ], batch size: 86, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:20:58,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=748693.3333333334, ans=0.0 2023-10-02 04:21:00,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 04:21:00,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 04:21:00,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 04:21:02,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 04:21:02,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 04:21:02,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:21:03,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 04:21:04,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:21:04,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:21:06,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:21:07,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:21:09,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:21:09,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:21:13,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:21:15,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:21:16,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:21:19,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:21:19,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:21:19,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 04:21:22,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:21:22,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:21:24,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:21:25,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:21:27,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 04:21:27,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:21:29,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 04:21:29,176 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 04:21:30,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 04:21:30,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:21:31,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:21:32,238 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:21:33,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:21:39,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:21:39,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:21:39,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:21:39,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:21:40,197 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.99 vs. limit=22.5 2023-10-02 04:21:41,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.08 vs. limit=15.0 2023-10-02 04:21:42,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:21:42,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 04:21:42,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=748893.3333333334, ans=0.05 2023-10-02 04:21:43,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:21:43,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 04:21:45,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:21:49,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:21:51,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 04:21:51,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:21:55,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:21:56,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:21:58,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:01,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:22:06,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 04:22:06,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:22:06,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:22:07,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:22:09,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:22:10,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:22:11,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:22:12,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=749026.6666666666, ans=0.125 2023-10-02 04:22:13,160 INFO [train.py:1046] (3/4) Epoch 22, batch 800, loss[loss=0.1822, simple_loss=0.2535, pruned_loss=0.05549, over 22893.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2494, pruned_loss=0.04967, over 4614921.27 frames. ], batch size: 322, lr: 4.71e-03, grad_scale: 32.0 2023-10-02 04:22:18,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:22:18,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:20,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:22:20,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:22:22,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:22,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:25,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:27,264 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.38 vs. limit=22.5 2023-10-02 04:22:27,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:22:29,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:22:30,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=749093.3333333334, ans=0.05 2023-10-02 04:22:31,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 04:22:32,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:32,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:22:32,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:22:34,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:22:34,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 04:22:35,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:22:35,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 04:22:38,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:40,035 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.751e+02 1.954e+02 2.249e+02 3.221e+02, threshold=3.908e+02, percent-clipped=0.0 2023-10-02 04:22:41,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:22:44,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:22:45,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:22:48,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:48,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:22:48,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=749160.0, ans=0.2 2023-10-02 04:22:49,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:22:51,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:22:51,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 04:22:55,156 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 04:22:55,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 04:22:55,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:22:56,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:22:57,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:22:58,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:22:58,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.30 vs. limit=15.0 2023-10-02 04:22:59,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=749226.6666666666, ans=0.0 2023-10-02 04:23:02,673 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 04:23:02,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 04:23:04,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:23:05,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:23:08,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:23:11,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:23:13,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 04:23:14,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:23:15,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 04:23:16,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=749293.3333333334, ans=0.2 2023-10-02 04:23:23,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:23:26,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:23:26,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 04:23:26,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:23:28,155 INFO [train.py:1046] (3/4) Epoch 22, batch 850, loss[loss=0.2227, simple_loss=0.2845, pruned_loss=0.0804, over 18934.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2504, pruned_loss=0.04966, over 4632743.50 frames. ], batch size: 388, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:23:28,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:23:29,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 04:23:29,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:23:31,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:23:33,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:23:34,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:23:35,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:23:35,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 04:23:37,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 04:23:37,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 04:23:39,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:23:39,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:23:39,462 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.504e-03 2023-10-02 04:23:41,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:23:41,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:23:41,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:23:46,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:23:46,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:23:46,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 04:23:48,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 04:23:53,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:23:53,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 04:23:55,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 04:23:57,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 04:24:00,388 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 04:24:00,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:24:00,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:24:00,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:24:03,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=749493.3333333334, ans=0.1 2023-10-02 04:24:04,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:24:04,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:24:06,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 04:24:06,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:24:06,918 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.74 vs. limit=15.0 2023-10-02 04:24:07,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:24:07,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:24:08,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:24:10,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:24:11,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 04:24:11,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 04:24:12,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.55 vs. limit=12.0 2023-10-02 04:24:13,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=749560.0, ans=0.125 2023-10-02 04:24:14,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:24:14,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:24:16,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:24:16,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:24:17,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.44 vs. limit=15.0 2023-10-02 04:24:18,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:24:21,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:24:22,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:24:26,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:24:26,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:24:28,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:24:37,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:24:39,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:24:39,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 04:24:39,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:24:39,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:24:42,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 04:24:43,315 INFO [train.py:1046] (3/4) Epoch 22, batch 900, loss[loss=0.1983, simple_loss=0.2675, pruned_loss=0.06457, over 23586.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2513, pruned_loss=0.0502, over 4650253.42 frames. ], batch size: 256, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:24:48,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:24:52,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:24:52,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 04:24:53,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=749693.3333333334, ans=0.1 2023-10-02 04:24:55,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:24:55,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 04:24:57,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 04:24:58,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:24:58,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:24:58,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:24:59,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:25:00,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=749760.0, ans=0.125 2023-10-02 04:25:10,080 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.859e+02 2.170e+02 2.493e+02 3.460e+02, threshold=4.340e+02, percent-clipped=0.0 2023-10-02 04:25:10,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:10,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:25:10,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:25:11,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:25:15,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 04:25:17,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:25:20,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:25:21,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:25:21,590 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 04:25:21,782 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:25:22,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 04:25:30,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:25:30,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:25:30,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:25:38,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:38,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:25:39,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 04:25:40,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:25:42,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 04:25:45,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:25:45,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:46,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:25:46,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:25:52,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 04:25:53,411 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 04:25:53,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 04:25:53,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 04:25:56,135 INFO [train.py:1046] (3/4) Epoch 22, batch 950, loss[loss=0.1585, simple_loss=0.2269, pruned_loss=0.04502, over 23473.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2514, pruned_loss=0.04992, over 4676247.51 frames. ], batch size: 134, lr: 4.71e-03, grad_scale: 16.0 2023-10-02 04:25:56,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:25:59,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=750026.6666666666, ans=0.0 2023-10-02 04:26:00,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 04:26:05,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:26:08,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:08,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:08,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=750026.6666666666, ans=0.125 2023-10-02 04:26:10,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 04:26:12,957 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 04:26:14,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:15,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:26:17,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:26:17,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:26:17,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 04:26:17,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:26:18,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=750093.3333333334, ans=0.2 2023-10-02 04:26:19,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:21,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 04:26:21,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:26:24,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:24,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:26:24,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:26:24,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 04:26:27,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 04:26:28,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:26:29,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:26:35,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:26:35,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:26:40,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 04:26:43,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 04:26:43,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:26:43,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:26:44,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:44,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:26:48,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 04:26:48,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:26:51,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:26:52,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:26:52,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 04:26:52,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:26:52,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:26:53,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 04:26:55,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=750293.3333333334, ans=0.0 2023-10-02 04:26:56,452 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.30 vs. limit=15.0 2023-10-02 04:26:56,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:26:59,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:27:04,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:27:05,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 04:27:05,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 04:27:10,304 INFO [train.py:1046] (3/4) Epoch 22, batch 1000, loss[loss=0.1803, simple_loss=0.2535, pruned_loss=0.05359, over 23649.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2506, pruned_loss=0.05019, over 4664175.80 frames. ], batch size: 149, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:27:13,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:27:16,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 04:27:16,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:27:16,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=750360.0, ans=0.2 2023-10-02 04:27:20,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:27:22,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 04:27:22,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 04:27:26,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:27:26,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:27:29,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:27:30,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 04:27:33,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 04:27:36,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 04:27:37,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:27:38,286 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.816e+02 2.103e+02 2.458e+02 3.993e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-02 04:27:40,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 04:27:40,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 04:27:41,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 04:27:43,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:27:43,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:27:45,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=750493.3333333334, ans=0.0 2023-10-02 04:27:51,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:27:53,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:27:53,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:27:53,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:27:53,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 04:27:54,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:27:55,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:27:55,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:27:56,025 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 04:28:00,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 04:28:01,047 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.35 vs. limit=15.0 2023-10-02 04:28:01,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 04:28:02,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 04:28:04,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:28:09,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:09,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:28:10,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:11,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:28:13,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 04:28:13,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:28:13,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 04:28:14,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 04:28:14,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:28:14,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:28:17,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=750626.6666666666, ans=0.0 2023-10-02 04:28:19,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:28:20,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:28:21,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=750626.6666666666, ans=0.1 2023-10-02 04:28:23,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:28:24,506 INFO [train.py:1046] (3/4) Epoch 22, batch 1050, loss[loss=0.1388, simple_loss=0.219, pruned_loss=0.02928, over 24368.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.249, pruned_loss=0.04957, over 4658923.99 frames. ], batch size: 56, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:28:24,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=750693.3333333334, ans=0.125 2023-10-02 04:28:25,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:28:28,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:28:30,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:28:31,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:31,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:28:36,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:28:37,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:28:39,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:28:41,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:28:41,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:28:41,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:28:43,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 04:28:43,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:28:44,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 04:28:47,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:28:47,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 04:28:47,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:28:54,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:28:56,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:28:56,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:28:57,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 04:28:59,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 04:28:59,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:29:01,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 04:29:02,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 04:29:03,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:29:05,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=750826.6666666666, ans=0.0 2023-10-02 04:29:06,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 04:29:08,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 04:29:09,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:29:10,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:29:15,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:29:18,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 04:29:19,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 04:29:19,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 04:29:19,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:29:20,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:29:22,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 04:29:25,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:29:26,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:29:26,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:29:26,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:29:26,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:29:30,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:29:30,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 04:29:33,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:29:33,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 04:29:33,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 04:29:35,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:29:38,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:29:39,814 INFO [train.py:1046] (3/4) Epoch 22, batch 1100, loss[loss=0.1906, simple_loss=0.2546, pruned_loss=0.06331, over 22778.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2483, pruned_loss=0.04907, over 4670830.21 frames. ], batch size: 322, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:29:45,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:29:46,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=751026.6666666666, ans=0.125 2023-10-02 04:29:47,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=751026.6666666666, ans=0.1 2023-10-02 04:29:50,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:29:52,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:29:52,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:29:52,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 04:29:53,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=751093.3333333334, ans=0.0 2023-10-02 04:29:54,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:29:55,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 04:29:57,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:29:59,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:29:59,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 04:30:02,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:30:03,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:30:03,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:30:06,505 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.819e+02 2.068e+02 2.421e+02 4.208e+02, threshold=4.136e+02, percent-clipped=1.0 2023-10-02 04:30:06,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:30:06,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:30:13,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:30:16,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 04:30:17,064 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 04:30:18,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:19,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:21,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:30:21,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:30:22,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 04:30:23,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:30:23,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:30:24,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:30:24,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=751226.6666666666, ans=0.0 2023-10-02 04:30:25,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:25,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 04:30:29,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:30:29,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 04:30:32,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:30:35,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:30:38,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 04:30:38,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 04:30:40,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:30:41,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:30:43,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:30:44,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 04:30:44,890 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.99 vs. limit=12.0 2023-10-02 04:30:46,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:30:46,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:30:46,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 04:30:47,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:30:47,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 04:30:49,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:30:49,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:30:50,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:30:53,561 INFO [train.py:1046] (3/4) Epoch 22, batch 1150, loss[loss=0.1594, simple_loss=0.2458, pruned_loss=0.03652, over 24686.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2492, pruned_loss=0.04923, over 4678045.87 frames. ], batch size: 65, lr: 4.70e-03, grad_scale: 8.0 2023-10-02 04:30:53,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:30:56,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:30:59,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:30:59,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:30:59,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 04:31:00,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:31:03,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 04:31:06,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:31:06,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:31:11,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=751426.6666666666, ans=0.0 2023-10-02 04:31:12,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 04:31:14,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:31:18,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:31:19,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:31:19,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 04:31:19,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:31:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:31:24,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 04:31:25,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:31:26,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:31:27,339 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:31:27,647 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.35 vs. limit=15.0 2023-10-02 04:31:31,482 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 04:31:31,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=751493.3333333334, ans=0.125 2023-10-02 04:31:35,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:31:39,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=751560.0, ans=0.2 2023-10-02 04:31:41,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:31:42,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 04:31:43,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:31:43,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:31:47,848 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 04:31:49,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:31:55,276 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 04:32:00,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:02,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:32:02,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:32:02,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:32:04,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:32:06,280 INFO [train.py:1046] (3/4) Epoch 22, batch 1200, loss[loss=0.1934, simple_loss=0.2645, pruned_loss=0.06111, over 23366.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2497, pruned_loss=0.04886, over 4681370.34 frames. ], batch size: 93, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:32:09,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:32:09,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:32:10,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:32:10,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:12,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:32:16,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:32:17,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:32:17,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:32:17,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:32:20,439 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 04:32:23,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 04:32:25,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:32:26,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:32:27,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=751760.0, ans=0.1 2023-10-02 04:32:28,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:32:31,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:32:31,037 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 04:32:32,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:32,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=751760.0, ans=0.95 2023-10-02 04:32:34,978 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.785e+02 1.972e+02 2.130e+02 2.698e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-02 04:32:39,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:32:39,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:32:39,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 04:32:39,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:32:44,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 04:32:48,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 04:32:48,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:32:49,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:32:51,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:32:51,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:32:53,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:32:53,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:32:53,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:32:54,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 04:32:56,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:32:56,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:32:56,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:32:58,032 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.99 vs. limit=15.0 2023-10-02 04:32:58,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:32:58,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:33:03,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:33:04,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:33:07,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 04:33:12,029 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 04:33:12,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=751960.0, ans=0.1 2023-10-02 04:33:15,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:33:16,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:33:16,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=751960.0, ans=0.0 2023-10-02 04:33:17,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:33:19,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:33:20,649 INFO [train.py:1046] (3/4) Epoch 22, batch 1250, loss[loss=0.1471, simple_loss=0.2207, pruned_loss=0.03674, over 24322.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2505, pruned_loss=0.04929, over 4685613.14 frames. ], batch size: 56, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:33:22,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 04:33:26,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:33:26,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:33:28,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 04:33:28,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:33:29,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:33:32,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 04:33:33,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:33:33,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:33:33,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:33:35,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=752093.3333333334, ans=0.0 2023-10-02 04:33:37,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:33:42,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 04:33:42,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:33:42,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:33:44,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:33:46,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:33:49,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:33:49,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:33:54,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 04:33:54,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:33:56,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:33:56,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 04:33:58,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:33:58,395 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 04:33:58,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:33:58,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:34:01,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:34:03,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:34:05,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:34:06,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 04:34:06,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 04:34:07,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 04:34:11,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:34:11,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 04:34:11,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:34:16,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 04:34:16,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:34:17,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 04:34:17,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 04:34:19,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:34:19,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 04:34:20,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:34:21,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 04:34:24,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:34:24,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:34:26,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:34:27,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:34:32,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:34:33,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 04:34:34,839 INFO [train.py:1046] (3/4) Epoch 22, batch 1300, loss[loss=0.1651, simple_loss=0.2542, pruned_loss=0.03806, over 24455.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2511, pruned_loss=0.0496, over 4695493.69 frames. ], batch size: 69, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:34:36,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:34:36,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 04:34:37,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:34:39,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:34:39,880 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.76 vs. limit=10.0 2023-10-02 04:34:40,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:34:42,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 04:34:44,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=752360.0, ans=0.2 2023-10-02 04:34:47,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=752360.0, ans=0.0 2023-10-02 04:34:50,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:34:50,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:34:51,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 04:34:55,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:34:58,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:34:59,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:35:00,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:35:01,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:35:02,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=752426.6666666666, ans=0.1 2023-10-02 04:35:03,258 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.912e+02 2.109e+02 2.278e+02 3.691e+02, threshold=4.217e+02, percent-clipped=0.0 2023-10-02 04:35:03,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:35:03,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 04:35:03,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 04:35:05,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=752493.3333333334, ans=0.125 2023-10-02 04:35:08,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:35:09,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:35:09,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=752493.3333333334, ans=0.125 2023-10-02 04:35:11,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 04:35:13,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 04:35:15,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:35:19,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:35:19,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 04:35:19,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:35:20,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 04:35:21,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:35:27,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:35:27,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:35:29,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 04:35:30,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 04:35:31,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 04:35:37,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:35:40,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 04:35:41,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:35:47,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 04:35:49,079 INFO [train.py:1046] (3/4) Epoch 22, batch 1350, loss[loss=0.1732, simple_loss=0.2631, pruned_loss=0.04162, over 24296.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2504, pruned_loss=0.04911, over 4693296.34 frames. ], batch size: 74, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:35:51,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=752693.3333333334, ans=0.125 2023-10-02 04:35:51,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=752693.3333333334, ans=0.1 2023-10-02 04:35:52,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:35:52,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=752693.3333333334, ans=0.125 2023-10-02 04:35:55,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:35:56,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:35:57,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:35:58,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:35:59,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:36:03,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:36:04,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 04:36:05,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:36:06,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:36:06,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=752760.0, ans=0.0 2023-10-02 04:36:09,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 04:36:11,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:36:12,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:36:12,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 04:36:13,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 04:36:15,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 04:36:17,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:36:17,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 04:36:27,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:36:36,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:36:36,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:36:36,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 04:36:39,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:36:41,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 04:36:41,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:36:41,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:36:45,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:36:47,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 04:36:48,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:36:54,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 04:36:55,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 04:36:55,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=752960.0, ans=0.0 2023-10-02 04:37:01,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 04:37:01,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:37:02,559 INFO [train.py:1046] (3/4) Epoch 22, batch 1400, loss[loss=0.1669, simple_loss=0.2546, pruned_loss=0.03962, over 24443.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.249, pruned_loss=0.04881, over 4691642.31 frames. ], batch size: 69, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:37:04,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:37:04,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:37:10,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 04:37:11,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 04:37:20,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:37:21,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:37:24,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:37:24,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 04:37:29,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:37:29,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 04:37:30,877 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.430e+02 1.782e+02 2.050e+02 2.266e+02 3.252e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-02 04:37:39,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:37:39,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:37:44,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 04:37:45,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:37:45,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:37:47,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:37:49,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:37:49,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:37:49,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:37:49,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:37:52,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 04:37:52,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:37:56,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:37:59,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:38:03,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=753293.3333333334, ans=0.0 2023-10-02 04:38:07,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 04:38:07,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 04:38:09,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:38:11,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 04:38:13,480 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.56 vs. limit=12.0 2023-10-02 04:38:13,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:38:15,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:38:17,293 INFO [train.py:1046] (3/4) Epoch 22, batch 1450, loss[loss=0.1839, simple_loss=0.2483, pruned_loss=0.05978, over 22837.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2484, pruned_loss=0.04849, over 4695535.79 frames. ], batch size: 322, lr: 4.70e-03, grad_scale: 16.0 2023-10-02 04:38:18,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:38:21,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:38:21,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:21,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 04:38:27,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:38:27,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:38:28,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:38:29,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 04:38:29,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:38:31,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 04:38:32,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:32,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:32,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 04:38:33,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:38:35,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:38:35,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 04:38:35,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:36,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:38:39,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:42,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:45,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:38:45,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:38:48,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:38:48,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:50,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:38:51,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:38:51,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:38:51,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:38:52,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.62 vs. limit=15.0 2023-10-02 04:38:54,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=753493.3333333334, ans=0.2 2023-10-02 04:38:57,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 04:38:58,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:39:01,422 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 04:39:02,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:39:04,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:39:05,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:39:08,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 04:39:11,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:39:12,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 04:39:14,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 04:39:15,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:39:18,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:39:18,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:39:20,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 04:39:23,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 04:39:25,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 04:39:25,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:39:26,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:39:32,471 INFO [train.py:1046] (3/4) Epoch 22, batch 1500, loss[loss=0.1597, simple_loss=0.233, pruned_loss=0.04321, over 23302.00 frames. ], tot_loss[loss=0.173, simple_loss=0.249, pruned_loss=0.04855, over 4707447.40 frames. ], batch size: 119, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:39:36,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 04:39:36,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:39:36,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:39:38,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:39:38,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:39:39,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:39:39,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=753693.3333333334, ans=0.125 2023-10-02 04:39:40,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 04:39:42,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:39:42,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:39:42,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:39:43,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:39:44,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:39:45,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=753760.0, ans=0.125 2023-10-02 04:39:46,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:39:51,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:39:51,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 04:39:52,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:39:52,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:39:52,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:39:56,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 04:39:56,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=753760.0, ans=0.125 2023-10-02 04:39:58,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=753760.0, ans=0.0 2023-10-02 04:40:00,496 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.944e+02 2.195e+02 2.609e+02 5.119e+02, threshold=4.390e+02, percent-clipped=1.0 2023-10-02 04:40:00,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 04:40:01,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:40:02,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 04:40:02,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=753826.6666666666, ans=0.0 2023-10-02 04:40:04,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 04:40:07,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:40:07,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=753826.6666666666, ans=0.02 2023-10-02 04:40:08,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:40:08,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:40:10,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 04:40:10,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:40:11,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:40:11,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 04:40:11,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:40:15,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=12.0 2023-10-02 04:40:16,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:40:16,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 04:40:16,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=753893.3333333334, ans=0.125 2023-10-02 04:40:22,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 04:40:24,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:40:27,404 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 04:40:28,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:28,806 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 04:40:31,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:40:31,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:40:32,917 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 04:40:34,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:40:37,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 04:40:37,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:41,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:40:41,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:42,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:40:42,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:40:42,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:40:45,335 INFO [train.py:1046] (3/4) Epoch 22, batch 1550, loss[loss=0.1684, simple_loss=0.2559, pruned_loss=0.04044, over 24671.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2502, pruned_loss=0.0488, over 4714797.50 frames. ], batch size: 68, lr: 4.69e-03, grad_scale: 8.0 2023-10-02 04:40:45,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 04:40:47,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 04:40:47,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:40:48,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 04:40:48,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 04:40:50,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:40:51,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:40:51,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:40:52,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:40:52,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:40:54,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:40:55,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=754026.6666666666, ans=0.2 2023-10-02 04:40:56,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=754026.6666666666, ans=0.125 2023-10-02 04:40:58,387 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 04:40:58,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:40:58,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:40:59,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 04:41:02,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:41:02,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 04:41:02,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:41:03,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 04:41:05,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 04:41:05,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 04:41:05,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:06,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:41:08,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.09 vs. limit=6.0 2023-10-02 04:41:11,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:41:12,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 04:41:12,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 04:41:20,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:41:24,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:41:26,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 04:41:26,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:41:27,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 04:41:34,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 04:41:34,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:37,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:41:38,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:41:39,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:41:39,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 04:41:41,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:41:41,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=754226.6666666666, ans=0.0 2023-10-02 04:41:42,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:41:42,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:44,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 04:41:44,019 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 04:41:46,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:41:51,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 04:41:57,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:41:58,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:41:59,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 04:42:00,689 INFO [train.py:1046] (3/4) Epoch 22, batch 1600, loss[loss=0.1509, simple_loss=0.2287, pruned_loss=0.03658, over 24573.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2494, pruned_loss=0.04848, over 4720490.17 frames. ], batch size: 60, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:42:01,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:42:02,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:42:02,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:42:02,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:42:04,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:42:08,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:08,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 04:42:09,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 04:42:12,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 04:42:13,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:42:15,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 04:42:15,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:42:18,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:42:22,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:42:24,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 04:42:25,815 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.40 vs. limit=22.5 2023-10-02 04:42:28,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:42:29,799 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.849e+02 2.015e+02 2.329e+02 4.994e+02, threshold=4.030e+02, percent-clipped=1.0 2023-10-02 04:42:29,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 04:42:29,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:31,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 04:42:31,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=754493.3333333334, ans=0.5 2023-10-02 04:42:38,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 04:42:40,127 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.61 vs. limit=15.0 2023-10-02 04:42:43,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:42:45,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 04:42:46,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:42:46,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:42:46,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:42:47,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 04:42:50,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=754560.0, ans=0.1 2023-10-02 04:42:52,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=754560.0, ans=0.2 2023-10-02 04:42:53,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 04:42:54,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:42:54,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:55,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:42:56,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:42:59,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:42:59,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:43:00,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:43:03,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=754626.6666666666, ans=0.125 2023-10-02 04:43:08,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:43:08,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:43:11,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 04:43:11,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:43:14,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 04:43:15,458 INFO [train.py:1046] (3/4) Epoch 22, batch 1650, loss[loss=0.168, simple_loss=0.2529, pruned_loss=0.04154, over 24439.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2499, pruned_loss=0.04864, over 4720379.74 frames. ], batch size: 69, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:43:18,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:43:18,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:43:18,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=754693.3333333334, ans=0.0 2023-10-02 04:43:19,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:43:19,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 04:43:19,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 04:43:19,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 04:43:19,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 04:43:22,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=754693.3333333334, ans=0.0 2023-10-02 04:43:23,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:43:24,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:43:25,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:43:25,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:43:25,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=754693.3333333334, ans=0.1 2023-10-02 04:43:27,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:43:31,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 04:43:33,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:43:33,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:43:33,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:43:33,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:43:34,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 04:43:34,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 04:43:37,101 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.03 vs. limit=22.5 2023-10-02 04:43:41,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:43:43,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:43:50,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 04:43:50,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:43:52,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 04:43:55,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:43:57,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:43:57,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:43:59,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:43:59,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:43:59,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:44:02,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:44:03,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:44:04,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:44:05,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:44:07,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:44:09,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:44:11,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:44:13,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 04:44:14,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:44:14,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 04:44:16,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 04:44:17,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 04:44:17,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:44:17,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:44:17,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:44:18,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:44:18,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 04:44:21,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:44:23,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:44:23,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:44:23,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=754960.0, ans=0.125 2023-10-02 04:44:24,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=754960.0, ans=0.125 2023-10-02 04:44:25,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 04:44:26,717 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.18 vs. limit=6.0 2023-10-02 04:44:28,643 INFO [train.py:1046] (3/4) Epoch 22, batch 1700, loss[loss=0.1649, simple_loss=0.2543, pruned_loss=0.03776, over 24652.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2498, pruned_loss=0.04855, over 4725015.76 frames. ], batch size: 68, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:44:30,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:44:30,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:44:31,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 04:44:33,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:44:33,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:44:33,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:44:34,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:44:36,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:44:37,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 04:44:40,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:44:44,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:44:44,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=755093.3333333334, ans=0.125 2023-10-02 04:44:47,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:44:52,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:44:52,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:44:52,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:44:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:44:57,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 04:44:58,520 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.468e+02 1.824e+02 2.037e+02 2.362e+02 3.685e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-02 04:44:58,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:44:58,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:00,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:45:02,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:45:04,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 04:45:05,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 04:45:06,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:08,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 04:45:09,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:45:11,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=755160.0, ans=0.2 2023-10-02 04:45:17,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:45:19,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:45:20,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:45:20,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:45:20,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 04:45:20,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:45:23,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:23,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 04:45:23,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:45:23,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:45:24,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:24,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:45:27,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:45:27,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:45:29,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:45:29,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=755293.3333333334, ans=0.1 2023-10-02 04:45:30,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:45:31,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:45:34,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:45:35,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 04:45:39,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:45:40,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:45:43,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 04:45:44,591 INFO [train.py:1046] (3/4) Epoch 22, batch 1750, loss[loss=0.1556, simple_loss=0.2322, pruned_loss=0.03952, over 23263.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2482, pruned_loss=0.04821, over 4711939.91 frames. ], batch size: 105, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:45:48,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:45:51,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:45:51,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 04:45:51,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=755360.0, ans=0.1 2023-10-02 04:45:53,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 04:45:53,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:45:55,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:45:55,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:00,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 04:46:02,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:46:05,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 04:46:05,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:46:07,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:46:10,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 04:46:10,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 04:46:10,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=755426.6666666666, ans=0.125 2023-10-02 04:46:13,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:46:13,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 04:46:20,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:46:23,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:46:23,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:46:26,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:26,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:46:27,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:46:29,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:32,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:46:32,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:46:34,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 04:46:36,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:46:39,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 04:46:40,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:46:42,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:46:42,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:46:45,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=755626.6666666666, ans=0.5 2023-10-02 04:46:47,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:46:48,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 04:46:48,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:46:49,378 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.28 vs. limit=15.0 2023-10-02 04:46:50,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:46:54,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:46:54,965 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.39 vs. limit=15.0 2023-10-02 04:46:55,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:46:57,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:46:57,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 04:46:58,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:46:58,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:46:58,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:46:58,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 04:47:00,001 INFO [train.py:1046] (3/4) Epoch 22, batch 1800, loss[loss=0.1647, simple_loss=0.2453, pruned_loss=0.04205, over 24531.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.248, pruned_loss=0.04826, over 4717060.40 frames. ], batch size: 63, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:47:00,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:47:00,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:47:00,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=755693.3333333334, ans=0.0 2023-10-02 04:47:03,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:47:03,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:47:05,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:47:07,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:47:07,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=755693.3333333334, ans=0.1 2023-10-02 04:47:10,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=755693.3333333334, ans=0.125 2023-10-02 04:47:11,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 04:47:13,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:47:15,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:47:17,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:47:19,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:47:19,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:47:20,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:47:20,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 04:47:22,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:47:24,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:47:27,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 04:47:30,294 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.851e+02 2.122e+02 2.393e+02 3.759e+02, threshold=4.245e+02, percent-clipped=0.0 2023-10-02 04:47:30,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 04:47:30,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 04:47:31,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:47:33,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:47:33,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:47:35,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:47:37,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=755826.6666666666, ans=0.125 2023-10-02 04:47:40,196 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 04:47:41,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:47:43,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:47:46,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 04:47:47,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 04:47:47,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:47:48,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:47:50,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:47:52,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=755893.3333333334, ans=0.1 2023-10-02 04:47:54,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 04:47:57,812 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.42 vs. limit=12.0 2023-10-02 04:47:59,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:48:00,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 04:48:01,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:48:01,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:48:03,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:48:03,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 04:48:04,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:48:04,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:48:06,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 04:48:06,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:48:10,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:48:11,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:48:11,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:48:12,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:48:12,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:48:13,738 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.72 vs. limit=15.0 2023-10-02 04:48:14,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:48:14,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:48:16,100 INFO [train.py:1046] (3/4) Epoch 22, batch 1850, loss[loss=0.1653, simple_loss=0.2553, pruned_loss=0.0377, over 24681.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2484, pruned_loss=0.04816, over 4710242.97 frames. ], batch size: 73, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:48:18,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:48:18,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:48:25,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:48:25,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 04:48:28,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 04:48:31,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 04:48:35,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:48:35,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 04:48:37,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 04:48:45,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:48:47,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 04:48:50,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:48:50,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:48:54,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 04:48:55,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:48:55,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 04:48:57,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:48:58,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:49:02,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:49:06,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:49:06,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:06,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 04:49:06,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:49:07,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:49:08,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:49:13,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 04:49:14,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:49:17,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=756293.3333333334, ans=0.125 2023-10-02 04:49:18,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:49:18,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=756293.3333333334, ans=10.0 2023-10-02 04:49:19,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:49:19,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 04:49:19,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 04:49:23,129 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 04:49:23,201 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 04:49:24,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:49:24,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:49:24,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:49:24,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:25,973 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 04:49:25,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:49:26,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:27,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:49:28,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:49:30,111 INFO [train.py:1046] (3/4) Epoch 22, batch 1900, loss[loss=0.1809, simple_loss=0.2739, pruned_loss=0.04391, over 24329.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2499, pruned_loss=0.04825, over 4714683.69 frames. ], batch size: 74, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:49:30,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:49:30,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 04:49:31,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:49:31,656 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 04:49:31,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 04:49:33,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:49:38,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:49:40,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 04:49:41,825 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 04:49:43,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 04:49:45,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 04:49:45,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:49:47,145 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 04:49:47,169 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 04:49:47,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=756426.6666666666, ans=0.0 2023-10-02 04:49:50,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 04:49:52,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:49:54,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=756426.6666666666, ans=10.0 2023-10-02 04:49:57,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 04:49:57,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 04:50:00,301 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.830e+02 1.988e+02 2.362e+02 3.579e+02, threshold=3.977e+02, percent-clipped=0.0 2023-10-02 04:50:07,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 04:50:10,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 04:50:10,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:50:10,733 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 04:50:10,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 04:50:12,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 04:50:12,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 04:50:12,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:50:15,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 04:50:18,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:50:20,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:50:20,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 04:50:22,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.30 vs. limit=15.0 2023-10-02 04:50:23,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:50:26,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 04:50:27,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:50:35,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 04:50:35,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:50:35,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:50:35,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:50:35,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=756626.6666666666, ans=0.1 2023-10-02 04:50:36,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 04:50:38,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 04:50:39,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:50:40,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:50:40,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:50:43,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:50:43,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:50:44,917 INFO [train.py:1046] (3/4) Epoch 22, batch 1950, loss[loss=0.1939, simple_loss=0.263, pruned_loss=0.06237, over 23437.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2504, pruned_loss=0.04862, over 4725558.58 frames. ], batch size: 285, lr: 4.69e-03, grad_scale: 16.0 2023-10-02 04:50:45,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 04:50:46,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:50:49,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:50:52,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:50:52,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:50:52,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:50:52,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=756693.3333333334, ans=0.09899494936611666 2023-10-02 04:50:57,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 04:50:57,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:50:57,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:50:58,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:01,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:51:01,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:01,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:04,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:51:07,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:51:07,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:51:07,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 04:51:08,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:11,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:14,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:51:14,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:14,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 04:51:14,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 04:51:15,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 04:51:15,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:51:17,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:21,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:23,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:51:23,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=756826.6666666666, ans=0.1 2023-10-02 04:51:27,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 04:51:32,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:51:32,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:51:32,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 04:51:34,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:51:34,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=756893.3333333334, ans=0.5 2023-10-02 04:51:38,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:51:38,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:51:39,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:51:46,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=756960.0, ans=0.125 2023-10-02 04:51:47,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:47,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:49,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:51:51,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:54,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:51:54,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:51:55,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 04:51:55,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 04:51:57,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:51:59,250 INFO [train.py:1046] (3/4) Epoch 22, batch 2000, loss[loss=0.1894, simple_loss=0.2626, pruned_loss=0.05808, over 23395.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.251, pruned_loss=0.04898, over 4728034.57 frames. ], batch size: 93, lr: 4.68e-03, grad_scale: 32.0 2023-10-02 04:51:59,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 04:52:00,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:52:04,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:52:05,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:52:05,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:52:08,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:52:09,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:11,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 04:52:11,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=757026.6666666666, ans=0.125 2023-10-02 04:52:13,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 04:52:14,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:52:16,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 04:52:18,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 04:52:18,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:52:21,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:52:23,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 04:52:23,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:25,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:26,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:26,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 04:52:27,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 04:52:29,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 04:52:29,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:52:30,916 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.905e+02 2.121e+02 2.548e+02 4.469e+02, threshold=4.243e+02, percent-clipped=4.0 2023-10-02 04:52:32,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:52:34,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 04:52:34,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:34,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:52:35,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:52:35,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=757160.0, ans=0.125 2023-10-02 04:52:37,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 04:52:39,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 04:52:39,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:52:39,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:52:44,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:45,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:52:45,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:52:45,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:52:47,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:52:47,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=757226.6666666666, ans=0.125 2023-10-02 04:52:48,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:50,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:52:50,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:52:50,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:52:52,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=757226.6666666666, ans=0.05 2023-10-02 04:52:53,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:52:53,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 04:52:59,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:52:59,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:04,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:04,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:53:04,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=757293.3333333334, ans=22.5 2023-10-02 04:53:06,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:08,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:53:08,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:10,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:53:10,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:53:14,693 INFO [train.py:1046] (3/4) Epoch 22, batch 2050, loss[loss=0.1713, simple_loss=0.2323, pruned_loss=0.05521, over 23489.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2508, pruned_loss=0.04924, over 4722422.04 frames. ], batch size: 285, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 04:53:14,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:16,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:18,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:53:18,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:24,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:53:26,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:53:26,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:53:27,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:53:29,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 04:53:29,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:53:29,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:53:31,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:53:39,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=757426.6666666666, ans=0.0 2023-10-02 04:53:40,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:53:40,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:42,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 04:53:43,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:53:45,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 04:53:45,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:53:48,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:53:51,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:53:52,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:53:52,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:53:54,903 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=12.0 2023-10-02 04:53:55,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:53:56,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:53:56,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:53:57,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=757493.3333333334, ans=0.1 2023-10-02 04:54:00,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:54:01,185 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.31 vs. limit=10.0 2023-10-02 04:54:01,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 04:54:03,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 04:54:04,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:54:08,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:54:14,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:54:16,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 04:54:21,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:54:22,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:54:22,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=757626.6666666666, ans=0.07 2023-10-02 04:54:25,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:54:26,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 04:54:29,492 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 04:54:29,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:54:29,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:54:30,801 INFO [train.py:1046] (3/4) Epoch 22, batch 2100, loss[loss=0.1731, simple_loss=0.2543, pruned_loss=0.04592, over 23357.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2494, pruned_loss=0.04847, over 4721080.69 frames. ], batch size: 94, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 04:54:30,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:54:30,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:54:30,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 04:54:32,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 04:54:33,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=757693.3333333334, ans=0.0 2023-10-02 04:54:35,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 04:54:35,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=757693.3333333334, ans=0.125 2023-10-02 04:54:37,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:54:37,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:54:38,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=757693.3333333334, ans=0.0 2023-10-02 04:54:41,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:54:41,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:54:41,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 04:54:43,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 04:54:43,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=757693.3333333334, ans=0.125 2023-10-02 04:54:43,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=757693.3333333334, ans=0.0 2023-10-02 04:54:44,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 04:54:44,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 04:54:47,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:54:47,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:54:47,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 04:54:47,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 04:54:53,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 04:54:53,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 04:54:55,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:54:55,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:54:57,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=757760.0, ans=0.0 2023-10-02 04:55:01,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:55:01,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 04:55:02,524 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.868e+02 2.085e+02 2.437e+02 3.685e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-02 04:55:02,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:02,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 04:55:03,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 04:55:04,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:04,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 04:55:04,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=757826.6666666666, ans=0.125 2023-10-02 04:55:05,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 04:55:05,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 04:55:06,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:55:09,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:55:11,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:55:12,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 04:55:14,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:17,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:17,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 04:55:17,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:17,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:17,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:18,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 04:55:20,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 04:55:20,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 04:55:24,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 04:55:25,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=757893.3333333334, ans=0.0 2023-10-02 04:55:28,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:55:29,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 04:55:33,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:35,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 04:55:36,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:55:36,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:55:36,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 04:55:37,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:55:38,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:55:38,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:55:41,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 04:55:41,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:42,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 04:55:44,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 04:55:44,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:55:44,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=758026.6666666666, ans=0.07 2023-10-02 04:55:45,222 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.22 vs. limit=15.0 2023-10-02 04:55:45,750 INFO [train.py:1046] (3/4) Epoch 22, batch 2150, loss[loss=0.1622, simple_loss=0.2413, pruned_loss=0.04152, over 24269.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2489, pruned_loss=0.04837, over 4713067.84 frames. ], batch size: 56, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 04:55:45,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:55:45,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:55:45,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:55:47,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:55:52,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 04:55:52,887 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.76 vs. limit=10.0 2023-10-02 04:55:53,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:55:54,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:55:55,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=758026.6666666666, ans=0.125 2023-10-02 04:55:56,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:55:56,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:55:58,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:56:01,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:56:01,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:56:01,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 04:56:04,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:05,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 04:56:08,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:56:10,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=758093.3333333334, ans=0.0 2023-10-02 04:56:11,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:56:13,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:13,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:56:13,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:13,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:56:14,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:56:14,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 04:56:16,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:56:16,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 04:56:18,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 04:56:19,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:56:19,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:56:20,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 04:56:20,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:56:23,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:56:24,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:56:26,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:56:26,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 04:56:26,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 04:56:28,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:56:29,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:31,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:56:32,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 04:56:32,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:34,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:34,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 04:56:36,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 04:56:36,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 04:56:38,338 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 04:56:38,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:38,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:56:39,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 04:56:39,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:56:39,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 04:56:41,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 04:56:41,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 04:56:41,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 04:56:43,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:43,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:56:43,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:56:44,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:45,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 04:56:47,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:56:47,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:56:47,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=758293.3333333334, ans=0.125 2023-10-02 04:56:53,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:56:53,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 04:56:53,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=758293.3333333334, ans=0.125 2023-10-02 04:56:59,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:57:00,787 INFO [train.py:1046] (3/4) Epoch 22, batch 2200, loss[loss=0.1797, simple_loss=0.2675, pruned_loss=0.04599, over 23966.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2498, pruned_loss=0.0488, over 4702490.39 frames. ], batch size: 86, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 04:57:02,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:57:04,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 04:57:04,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:05,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 04:57:08,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:57:09,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:57:09,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 04:57:14,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 04:57:17,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 04:57:17,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=758426.6666666666, ans=0.1 2023-10-02 04:57:22,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=758426.6666666666, ans=0.125 2023-10-02 04:57:23,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 04:57:24,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:57:24,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:57:26,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 04:57:27,582 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.54 vs. limit=15.0 2023-10-02 04:57:29,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 04:57:29,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 04:57:32,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 04:57:34,192 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.806e+02 1.968e+02 2.209e+02 3.586e+02, threshold=3.937e+02, percent-clipped=0.0 2023-10-02 04:57:35,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:57:36,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 04:57:39,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 04:57:40,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=758493.3333333334, ans=0.125 2023-10-02 04:57:41,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:57:42,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:57:44,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:46,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 04:57:47,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:57:48,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 04:57:49,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=758560.0, ans=15.0 2023-10-02 04:57:52,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:52,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 04:57:52,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:57:54,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 04:57:54,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:57:54,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:57:54,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:57:56,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 04:57:56,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:57:59,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 04:57:59,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=758626.6666666666, ans=0.125 2023-10-02 04:58:02,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 04:58:03,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:58:05,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:58:07,305 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 04:58:08,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 04:58:10,311 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 04:58:10,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 04:58:11,644 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 04:58:13,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:58:15,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 04:58:15,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:58:16,406 INFO [train.py:1046] (3/4) Epoch 22, batch 2250, loss[loss=0.1892, simple_loss=0.2765, pruned_loss=0.05092, over 24370.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2502, pruned_loss=0.04869, over 4695180.38 frames. ], batch size: 77, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 04:58:16,519 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 04:58:19,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:58:21,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:58:26,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 04:58:28,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 04:58:31,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:58:31,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:58:32,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 04:58:32,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=758760.0, ans=0.125 2023-10-02 04:58:34,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 04:58:34,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:58:34,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:58:35,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 04:58:37,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 04:58:37,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:58:39,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 04:58:41,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=758760.0, ans=0.1 2023-10-02 04:58:42,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:58:44,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 04:58:46,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 04:58:47,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 04:58:49,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:58:53,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 04:58:56,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:58:56,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=758826.6666666666, ans=0.09899494936611666 2023-10-02 04:58:57,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 04:58:59,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:58:59,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 04:59:01,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=758893.3333333334, ans=0.125 2023-10-02 04:59:02,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 04:59:03,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 04:59:08,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 04:59:10,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 04:59:14,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 04:59:14,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 04:59:15,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 04:59:19,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=758960.0, ans=0.125 2023-10-02 04:59:21,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 04:59:23,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 04:59:23,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 04:59:23,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:59:24,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 04:59:28,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 04:59:30,854 INFO [train.py:1046] (3/4) Epoch 22, batch 2300, loss[loss=0.1649, simple_loss=0.2542, pruned_loss=0.03782, over 24563.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2504, pruned_loss=0.04896, over 4702219.26 frames. ], batch size: 71, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 04:59:30,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 04:59:30,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:59:34,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=759026.6666666666, ans=0.125 2023-10-02 04:59:35,481 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.69 vs. limit=6.0 2023-10-02 04:59:37,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 04:59:37,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 04:59:39,906 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 04:59:41,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:59:45,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.69 vs. limit=15.0 2023-10-02 04:59:48,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 04:59:49,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 04:59:49,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 04:59:50,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 04:59:50,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 04:59:50,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 04:59:53,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 04:59:53,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 04:59:56,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 04:59:58,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:00:02,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:00:03,916 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.999e+02 2.256e+02 2.584e+02 4.812e+02, threshold=4.513e+02, percent-clipped=1.0 2023-10-02 05:00:08,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:00:08,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:00:10,807 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.03 vs. limit=12.0 2023-10-02 05:00:11,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:00:15,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:00:19,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:00:19,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:00:21,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:00:21,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 05:00:25,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:00:25,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:00:26,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:00:26,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:00:28,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:00:28,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 05:00:28,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:00:29,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 05:00:29,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:00:29,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:00:31,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 05:00:35,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:00:36,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=759293.3333333334, ans=0.125 2023-10-02 05:00:36,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=759293.3333333334, ans=0.125 2023-10-02 05:00:38,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=759293.3333333334, ans=0.5 2023-10-02 05:00:39,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:00:44,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:00:44,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:00:44,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:00:44,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=759360.0, ans=0.5 2023-10-02 05:00:45,947 INFO [train.py:1046] (3/4) Epoch 22, batch 2350, loss[loss=0.1837, simple_loss=0.2535, pruned_loss=0.0569, over 23771.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2513, pruned_loss=0.04915, over 4713291.23 frames. ], batch size: 212, lr: 4.68e-03, grad_scale: 8.0 2023-10-02 05:00:47,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:00:47,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:00:47,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:00:48,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 05:00:52,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=759360.0, ans=0.125 2023-10-02 05:00:53,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.83 vs. limit=15.0 2023-10-02 05:00:54,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:00:54,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 05:01:00,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 05:01:03,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:01:06,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:01:06,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:01:06,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:01:06,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:01:08,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 05:01:10,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=759426.6666666666, ans=0.125 2023-10-02 05:01:11,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:01:15,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 05:01:17,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:01:20,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:01:20,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:01:23,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:01:24,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 05:01:24,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:01:26,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=759493.3333333334, ans=0.125 2023-10-02 05:01:28,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:01:28,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:01:28,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:01:31,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:01:32,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 05:01:33,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:01:36,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:01:38,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:01:39,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 05:01:39,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:01:42,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 05:01:43,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:01:47,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 05:01:52,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 05:01:53,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:01:53,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:01:53,705 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 05:01:55,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 05:01:55,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=759626.6666666666, ans=0.1 2023-10-02 05:01:56,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 05:01:59,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:02:01,015 INFO [train.py:1046] (3/4) Epoch 22, batch 2400, loss[loss=0.1781, simple_loss=0.2594, pruned_loss=0.04845, over 23314.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2511, pruned_loss=0.04897, over 4711076.59 frames. ], batch size: 105, lr: 4.68e-03, grad_scale: 16.0 2023-10-02 05:02:04,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:02:06,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:02:07,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:02:08,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 05:02:08,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 05:02:16,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:02:16,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:02:17,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 05:02:19,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:02:20,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:02:20,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 05:02:28,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:02:29,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 05:02:29,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=759826.6666666666, ans=0.0 2023-10-02 05:02:32,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:02:34,040 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.808e+02 2.044e+02 2.322e+02 3.355e+02, threshold=4.088e+02, percent-clipped=0.0 2023-10-02 05:02:36,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 05:02:38,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=759826.6666666666, ans=0.2 2023-10-02 05:02:39,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:02:41,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:02:44,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:02:46,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 05:02:47,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:02:49,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=759893.3333333334, ans=0.2 2023-10-02 05:02:50,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=759893.3333333334, ans=0.2 2023-10-02 05:02:52,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:02:55,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:02:55,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=759893.3333333334, ans=0.125 2023-10-02 05:02:58,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:00,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:03:00,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:03:00,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:03:00,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:03:01,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:03:01,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:03:05,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:03:06,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:03:06,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 05:03:08,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 05:03:09,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:03:09,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:03:11,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 05:03:11,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 05:03:12,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 05:03:12,983 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 05:03:13,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 05:03:14,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:03:15,823 INFO [train.py:1046] (3/4) Epoch 22, batch 2450, loss[loss=0.1599, simple_loss=0.2485, pruned_loss=0.03568, over 24490.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2507, pruned_loss=0.04881, over 4717107.83 frames. ], batch size: 66, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:03:15,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:03:15,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:03:17,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.66 vs. limit=10.0 2023-10-02 05:03:17,835 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 05:03:17,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:03:17,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:03:18,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=760026.6666666666, ans=0.125 2023-10-02 05:03:22,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:03:22,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:03:25,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:25,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:03:25,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=760026.6666666666, ans=0.125 2023-10-02 05:03:27,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 05:03:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:03:32,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:34,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:03:34,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:03:35,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:03:35,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 05:03:39,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:03:42,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:03:43,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:03:46,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:03:46,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:03:48,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:03:48,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:03:50,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 05:03:52,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:03:55,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=760160.0, ans=0.05 2023-10-02 05:04:00,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:04:01,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:04:02,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:04:03,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:04:03,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:04:04,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:04:04,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 05:04:07,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:04:08,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:04:11,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:04:12,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:04:17,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:04:17,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 05:04:17,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:04:18,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:04:18,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 05:04:20,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:04:20,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:04:24,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:04:28,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:04:28,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:04:31,464 INFO [train.py:1046] (3/4) Epoch 22, batch 2500, loss[loss=0.1799, simple_loss=0.261, pruned_loss=0.04935, over 23351.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2492, pruned_loss=0.04839, over 4700583.34 frames. ], batch size: 93, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:04:31,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 05:04:33,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:04:36,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=760360.0, ans=0.125 2023-10-02 05:04:38,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:04:43,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=760360.0, ans=0.2 2023-10-02 05:04:43,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=760360.0, ans=0.125 2023-10-02 05:04:48,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:04:48,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:04:49,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:04:49,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 05:04:56,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:04:58,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:04:58,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 05:04:58,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:04:59,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 05:05:00,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:01,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:05:01,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 05:05:01,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:03,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 05:05:03,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:05:04,481 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.863e+02 2.107e+02 2.380e+02 3.578e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-02 05:05:07,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:05:07,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:05:10,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:05:10,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 05:05:10,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:05:10,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=760493.3333333334, ans=0.125 2023-10-02 05:05:11,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:16,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:05:20,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:05:21,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:05:26,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:05:27,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.82 vs. limit=15.0 2023-10-02 05:05:28,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 05:05:30,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:05:30,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 05:05:31,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:05:31,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:05:33,339 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 05:05:33,339 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 05:05:33,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 05:05:35,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:05:37,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 05:05:38,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 05:05:38,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:05:39,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=760626.6666666666, ans=0.125 2023-10-02 05:05:40,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 05:05:42,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 05:05:46,209 INFO [train.py:1046] (3/4) Epoch 22, batch 2550, loss[loss=0.1588, simple_loss=0.2366, pruned_loss=0.04051, over 20577.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.249, pruned_loss=0.04796, over 4708701.31 frames. ], batch size: 45, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:05:46,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:05:48,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:05:48,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:05:50,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:05:50,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 05:05:52,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:05:55,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 05:05:56,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:05:58,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:01,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:06:01,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 05:06:03,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:06:03,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:06:03,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:06:06,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:06:06,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 05:06:07,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 05:06:07,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:07,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 05:06:18,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:06:23,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:06:23,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:23,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:06:23,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:06:30,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:06:30,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=760893.3333333334, ans=0.0 2023-10-02 05:06:32,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:06:33,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:06:33,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:06:34,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 05:06:34,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:06:37,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:06:37,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:37,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.22 vs. limit=6.0 2023-10-02 05:06:42,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:06:42,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 05:06:42,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:06:44,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:06:45,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 05:06:47,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:06:49,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:06:54,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:06:57,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:07:01,066 INFO [train.py:1046] (3/4) Epoch 22, batch 2600, loss[loss=0.1674, simple_loss=0.2603, pruned_loss=0.0373, over 24548.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.25, pruned_loss=0.04819, over 4714555.30 frames. ], batch size: 71, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:07:01,131 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 05:07:03,245 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 05:07:03,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:07:03,288 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 05:07:03,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 05:07:04,583 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 05:07:04,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=761026.6666666666, ans=0.0 2023-10-02 05:07:06,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:07:06,060 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 05:07:07,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 05:07:08,923 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 05:07:10,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=761026.6666666666, ans=0.125 2023-10-02 05:07:11,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:07:13,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.77 vs. limit=22.5 2023-10-02 05:07:14,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 05:07:15,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 05:07:15,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=761093.3333333334, ans=0.2 2023-10-02 05:07:17,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 05:07:17,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 05:07:20,931 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 05:07:20,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 05:07:22,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.46 vs. limit=15.0 2023-10-02 05:07:27,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:07:27,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:07:29,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:07:29,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 05:07:30,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:07:33,967 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.850e+02 2.100e+02 2.387e+02 3.462e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-02 05:07:38,649 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 05:07:43,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.68 vs. limit=8.0 2023-10-02 05:07:44,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:07:45,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:07:45,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 05:07:47,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:07:47,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:07:47,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 05:07:47,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=761226.6666666666, ans=0.125 2023-10-02 05:07:48,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:07:50,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:07:51,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:07:56,352 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 05:07:56,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:07:57,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:08:03,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:08:03,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:08:03,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 05:08:04,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:08:06,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:08:08,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:08:12,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=761293.3333333334, ans=0.0 2023-10-02 05:08:13,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 05:08:15,231 INFO [train.py:1046] (3/4) Epoch 22, batch 2650, loss[loss=0.1888, simple_loss=0.2634, pruned_loss=0.05707, over 23323.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2515, pruned_loss=0.04877, over 4716410.13 frames. ], batch size: 119, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:08:15,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:08:15,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=761360.0, ans=0.0 2023-10-02 05:08:17,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:08:17,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=761360.0, ans=0.125 2023-10-02 05:08:19,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 05:08:19,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:08:21,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:08:23,085 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 05:08:23,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:08:24,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:08:26,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:08:27,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:08:29,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=761426.6666666666, ans=0.2 2023-10-02 05:08:30,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:08:30,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 05:08:30,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:08:32,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:08:35,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 05:08:36,325 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 05:08:38,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:08:42,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 05:08:43,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:08:44,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 05:08:48,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:08:48,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:08:48,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:08:49,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:08:52,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 05:08:52,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 05:08:56,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:09:01,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 05:09:01,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:09:01,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:02,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:09:03,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:09:03,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:09:04,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:09:04,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=761560.0, ans=0.125 2023-10-02 05:09:05,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:09:05,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:09:07,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:09:08,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:09:10,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:10,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:09:12,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:13,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:09:13,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:09:16,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:17,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:09:18,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:19,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 05:09:20,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:09:23,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:24,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:26,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:28,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:09:28,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:29,692 INFO [train.py:1046] (3/4) Epoch 22, batch 2700, loss[loss=0.1634, simple_loss=0.2445, pruned_loss=0.04115, over 24313.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.252, pruned_loss=0.04907, over 4713812.52 frames. ], batch size: 61, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:09:31,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:09:31,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 05:09:33,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:09:35,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 05:09:38,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:09:38,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:38,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:09:38,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:09:39,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:09:39,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:09:39,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 05:09:39,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 05:09:39,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:09:41,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:09:43,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:09:44,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:09:47,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:09:47,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 05:09:49,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:09:53,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:09:53,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:09:54,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=761760.0, ans=0.0 2023-10-02 05:09:59,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:09:59,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:09:59,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:10:01,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:10:02,476 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.897e+02 2.080e+02 2.344e+02 3.157e+02, threshold=4.159e+02, percent-clipped=0.0 2023-10-02 05:10:03,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:10:06,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:10:06,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:10:06,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:10:12,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:10:12,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:10:19,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:10:20,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:10:20,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=761893.3333333334, ans=0.0 2023-10-02 05:10:22,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=761893.3333333334, ans=0.0 2023-10-02 05:10:23,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:10:23,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:10:25,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:10:26,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:10:28,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:10:28,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:29,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:10:29,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:10:34,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:10:35,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:10:35,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:10:38,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 05:10:38,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:10:39,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:10:39,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 05:10:41,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 05:10:42,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:10:44,756 INFO [train.py:1046] (3/4) Epoch 22, batch 2750, loss[loss=0.1884, simple_loss=0.2524, pruned_loss=0.06226, over 23860.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2521, pruned_loss=0.04957, over 4704492.99 frames. ], batch size: 195, lr: 4.67e-03, grad_scale: 16.0 2023-10-02 05:10:46,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:10:46,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:10:46,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=762026.6666666666, ans=0.125 2023-10-02 05:10:47,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:48,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:10:48,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:53,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:10:53,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:10:53,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:10:54,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:10:54,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 05:10:54,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:10:54,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:11:00,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 05:11:02,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:11:02,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:11:03,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:11:03,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:11:05,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:11:06,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:11:07,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:11:07,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:11:08,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=762093.3333333334, ans=0.07 2023-10-02 05:11:13,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:11:13,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:11:13,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:11:15,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:11:17,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:11:18,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=762160.0, ans=0.0 2023-10-02 05:11:22,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:11:25,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:11:25,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:11:29,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.83 vs. limit=22.5 2023-10-02 05:11:30,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:11:30,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:11:30,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:11:36,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:11:36,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:11:36,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 05:11:42,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:11:43,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 05:11:49,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 05:11:51,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:11:51,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 05:11:52,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:11:52,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:11:54,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 05:11:54,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:11:57,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 05:11:58,739 INFO [train.py:1046] (3/4) Epoch 22, batch 2800, loss[loss=0.1652, simple_loss=0.2314, pruned_loss=0.04951, over 23638.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2499, pruned_loss=0.04899, over 4705076.79 frames. ], batch size: 256, lr: 4.67e-03, grad_scale: 32.0 2023-10-02 05:11:58,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:11:58,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:12:00,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 05:12:00,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:00,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:12:02,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:03,543 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 05:12:03,544 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 05:12:06,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:12:09,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:12:09,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:12:09,998 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.88 vs. limit=12.0 2023-10-02 05:12:12,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:12:14,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 05:12:15,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 05:12:16,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 05:12:19,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:12:19,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:12:19,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:12:23,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:12:24,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:12:24,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 05:12:25,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:12:31,712 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.904e+02 2.151e+02 2.380e+02 3.525e+02, threshold=4.302e+02, percent-clipped=0.0 2023-10-02 05:12:34,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:12:36,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:12:37,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:38,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=762493.3333333334, ans=0.0 2023-10-02 05:12:39,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:12:39,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:12:43,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:12:43,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 05:12:43,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=762560.0, ans=0.125 2023-10-02 05:12:44,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:12:46,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:12:46,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:12:50,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:12:52,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:55,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:12:57,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:12:57,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:12:57,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:12:58,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:12:58,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:13:00,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:13:00,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 05:13:01,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:02,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=762626.6666666666, ans=10.0 2023-10-02 05:13:03,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:13:03,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:04,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 05:13:05,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:13:05,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:13:08,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:13:08,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 05:13:14,073 INFO [train.py:1046] (3/4) Epoch 22, batch 2850, loss[loss=0.1674, simple_loss=0.2341, pruned_loss=0.05037, over 23551.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2494, pruned_loss=0.0488, over 4716203.88 frames. ], batch size: 285, lr: 4.67e-03, grad_scale: 32.0 2023-10-02 05:13:14,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:13:14,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:13:15,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:13:18,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:13:18,428 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:13:21,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:13:22,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:13:22,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:13:25,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:13:26,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:13:27,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:13:28,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 05:13:30,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.21 vs. limit=22.5 2023-10-02 05:13:30,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=762760.0, ans=0.125 2023-10-02 05:13:33,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 05:13:33,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:13:34,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 05:13:36,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:37,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 05:13:39,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 05:13:41,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=762760.0, ans=0.0 2023-10-02 05:13:42,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:13:52,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:13:53,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:13:54,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:13:55,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:13:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:13:55,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:13:55,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=762826.6666666666, ans=0.2 2023-10-02 05:13:57,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:13:58,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 05:14:00,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:14:00,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:14:00,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:14:01,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:04,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:14:05,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:14:07,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:09,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:14:11,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:14:11,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:13,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:15,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:14:18,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:14:20,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 05:14:21,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 05:14:23,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:14:23,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:14:23,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 05:14:23,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:14:25,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:14:25,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:14:26,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:14:26,932 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 05:14:26,990 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 05:14:26,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:14:27,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:27,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=763026.6666666666, ans=0.2 2023-10-02 05:14:28,340 INFO [train.py:1046] (3/4) Epoch 22, batch 2900, loss[loss=0.1707, simple_loss=0.2395, pruned_loss=0.05091, over 23789.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2496, pruned_loss=0.04839, over 4726380.90 frames. ], batch size: 164, lr: 4.67e-03, grad_scale: 32.0 2023-10-02 05:14:31,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 05:14:32,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:14:32,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:14:34,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 05:14:38,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:38,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 05:14:39,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 05:14:41,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:14:41,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:14:43,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:14:43,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:14:43,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=763093.3333333334, ans=0.125 2023-10-02 05:14:47,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:14:47,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:14:50,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:14:51,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 05:14:51,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:14:53,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:14:57,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 05:14:57,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 05:14:59,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:14:59,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 05:14:59,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:15:02,607 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.846e+02 2.042e+02 2.296e+02 2.937e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-02 05:15:03,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:15:03,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 05:15:06,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:15:07,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:15:10,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:15:12,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.90 vs. limit=15.0 2023-10-02 05:15:13,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:15:14,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 05:15:14,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 05:15:14,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:15:18,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:15:20,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 05:15:21,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:15:26,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:15:35,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:15:35,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:15:37,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 05:15:41,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:15:41,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 05:15:41,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:15:41,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:15:42,581 INFO [train.py:1046] (3/4) Epoch 22, batch 2950, loss[loss=0.1889, simple_loss=0.2545, pruned_loss=0.06167, over 23761.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2502, pruned_loss=0.04887, over 4713863.05 frames. ], batch size: 164, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:15:46,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:15:48,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 05:15:50,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:15:50,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:15:51,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:15:54,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:15:54,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 05:15:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 05:15:55,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:15:55,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:16:02,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:16:02,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=763426.6666666666, ans=0.1 2023-10-02 05:16:05,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:16:07,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:16:07,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=763426.6666666666, ans=0.125 2023-10-02 05:16:08,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:16:11,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:16:11,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:16:14,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:16:14,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:16:14,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:16:17,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 05:16:21,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 05:16:22,010 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 05:16:22,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:16:23,519 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 05:16:26,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 05:16:26,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:16:26,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:16:26,266 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 05:16:26,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:16:29,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 05:16:29,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:16:29,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:16:32,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:16:33,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=763560.0, ans=0.2 2023-10-02 05:16:34,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:16:34,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:16:34,390 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 05:16:36,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:16:36,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 05:16:43,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:16:44,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:16:44,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 05:16:44,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:16:45,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 05:16:50,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:16:52,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:16:53,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:16:53,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:16:53,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 05:16:54,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:16:55,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=763626.6666666666, ans=0.0 2023-10-02 05:16:55,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=763626.6666666666, ans=0.125 2023-10-02 05:16:56,955 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.08 vs. limit=15.0 2023-10-02 05:16:57,527 INFO [train.py:1046] (3/4) Epoch 22, batch 3000, loss[loss=0.1735, simple_loss=0.2428, pruned_loss=0.05205, over 23499.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2508, pruned_loss=0.04904, over 4715335.92 frames. ], batch size: 120, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:16:57,528 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 05:17:10,597 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.2665, 2.2323, 3.5604, 2.8007], device='cuda:3') 2023-10-02 05:17:15,406 INFO [train.py:1078] (3/4) Epoch 22, validation: loss=0.3452, simple_loss=0.2763, pruned_loss=0.2071, over 1125622.00 frames. 2023-10-02 05:17:15,406 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 05:17:15,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:17:15,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:17:15,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:17:15,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:17:15,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=763693.3333333334, ans=0.025 2023-10-02 05:17:16,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:17:16,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:17:16,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 05:17:18,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:17:21,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:17:21,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:17:24,108 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 05:17:25,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 05:17:27,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:17:28,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:17:28,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 05:17:30,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:17:36,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:17:45,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:17:49,822 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.840e+02 2.064e+02 2.389e+02 3.388e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-02 05:17:52,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 05:17:52,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:17:55,240 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.11 vs. limit=15.0 2023-10-02 05:17:55,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:17:56,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:17:56,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:17:59,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:17:59,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 05:18:00,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 05:18:00,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=763893.3333333334, ans=0.1 2023-10-02 05:18:01,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:18:03,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:18:04,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:18:06,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:18:06,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:06,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:18:06,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=763893.3333333334, ans=0.125 2023-10-02 05:18:08,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:18:10,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:18:10,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:18:13,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:18:15,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 05:18:17,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:18:17,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:18:17,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:18:21,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:22,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:22,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 05:18:22,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 05:18:24,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:18:24,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 05:18:25,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:18:27,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 05:18:29,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:18:29,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:18:29,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 05:18:29,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 05:18:29,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 05:18:30,979 INFO [train.py:1046] (3/4) Epoch 22, batch 3050, loss[loss=0.1684, simple_loss=0.2558, pruned_loss=0.04048, over 24428.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2513, pruned_loss=0.04931, over 4714165.82 frames. ], batch size: 66, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:18:31,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:18:32,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:18:32,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:18:32,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:18:33,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:18:35,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 05:18:39,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:18:41,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:18:41,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:18:43,432 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.28 vs. limit=12.0 2023-10-02 05:18:44,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:18:47,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 05:18:53,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 05:18:53,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 05:18:54,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:18:57,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:19:00,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:19:00,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:19:00,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:19:00,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=764160.0, ans=0.125 2023-10-02 05:19:03,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:19:04,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=764160.0, ans=0.0 2023-10-02 05:19:05,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:19:05,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:19:05,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:19:05,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:19:08,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:19:09,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:19:13,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:19:13,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 05:19:14,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:19:14,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:19:17,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=764226.6666666666, ans=0.125 2023-10-02 05:19:18,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:19:18,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:19:18,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=764226.6666666666, ans=0.0 2023-10-02 05:19:19,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:19:19,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:25,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:19:26,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:30,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:19:30,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:19:30,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:19:32,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:19:33,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:19:33,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:19:35,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 05:19:38,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:19:38,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:19:38,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 05:19:39,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:45,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:19:46,784 INFO [train.py:1046] (3/4) Epoch 22, batch 3100, loss[loss=0.1956, simple_loss=0.2489, pruned_loss=0.07114, over 19643.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2515, pruned_loss=0.04963, over 4694706.14 frames. ], batch size: 388, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:19:48,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:19:49,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:19:52,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 05:19:52,972 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.44 vs. limit=6.0 2023-10-02 05:19:55,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 05:19:55,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 05:19:58,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:19:58,782 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.60 vs. limit=22.5 2023-10-02 05:20:00,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:20:00,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:03,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 05:20:03,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=764426.6666666666, ans=0.125 2023-10-02 05:20:07,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:11,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 05:20:14,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=764493.3333333334, ans=10.0 2023-10-02 05:20:18,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:20:19,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:19,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:20:20,665 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.816e+02 1.968e+02 2.176e+02 3.408e+02, threshold=3.937e+02, percent-clipped=0.0 2023-10-02 05:20:20,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:20:20,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 05:20:20,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:20:20,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 05:20:20,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:20:22,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:23,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 05:20:25,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:20:29,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:20:29,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 05:20:29,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=764560.0, ans=0.125 2023-10-02 05:20:30,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 05:20:30,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:30,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:20:34,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:20:34,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:35,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:20:35,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:20:35,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:20:38,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:20:38,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:20:38,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:38,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 05:20:42,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:20:43,769 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.79 vs. limit=10.0 2023-10-02 05:20:44,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 05:20:46,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:20:48,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 05:20:48,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=764626.6666666666, ans=0.0 2023-10-02 05:20:49,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:20:49,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:20:49,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 05:20:49,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=764626.6666666666, ans=0.1 2023-10-02 05:20:59,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 05:21:00,692 INFO [train.py:1046] (3/4) Epoch 22, batch 3150, loss[loss=0.1599, simple_loss=0.2434, pruned_loss=0.03825, over 24453.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2499, pruned_loss=0.0486, over 4707562.83 frames. ], batch size: 63, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:21:01,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=764693.3333333334, ans=0.0 2023-10-02 05:21:02,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:02,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:21:04,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:21:04,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:21:04,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 05:21:05,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:05,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 05:21:06,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 05:21:09,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:21:11,362 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 05:21:14,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 05:21:14,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:21:15,549 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 05:21:17,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 05:21:19,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 05:21:20,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 05:21:20,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 05:21:20,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:21:20,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:21:22,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:21:23,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 05:21:25,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:25,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:21:25,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=764760.0, ans=0.0 2023-10-02 05:21:26,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:21:27,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 05:21:31,044 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.84 vs. limit=15.0 2023-10-02 05:21:31,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 05:21:31,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:21:31,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=764826.6666666666, ans=0.125 2023-10-02 05:21:35,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:21:35,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=764826.6666666666, ans=0.125 2023-10-02 05:21:35,638 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.07 vs. limit=12.0 2023-10-02 05:21:36,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:21:36,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 05:21:39,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 05:21:39,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=764826.6666666666, ans=0.125 2023-10-02 05:21:41,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:21:41,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:21:41,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:21:42,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:21:42,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:21:45,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:21:45,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:21:45,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 05:21:46,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:21:46,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:21:47,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:21:48,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:21:48,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 05:21:50,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:21:52,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 05:21:52,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:21:52,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 05:21:54,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 05:21:54,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=764893.3333333334, ans=0.035 2023-10-02 05:21:55,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:21:55,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:21:56,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 05:21:58,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 05:21:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:21:59,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:22:01,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:02,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:22:07,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:22:08,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:10,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 05:22:12,206 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:22:14,645 INFO [train.py:1046] (3/4) Epoch 22, batch 3200, loss[loss=0.1576, simple_loss=0.2352, pruned_loss=0.03998, over 24280.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2484, pruned_loss=0.04798, over 4711214.95 frames. ], batch size: 61, lr: 4.66e-03, grad_scale: 32.0 2023-10-02 05:22:16,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:22:16,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 05:22:21,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:21,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:22:22,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 05:22:22,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=765026.6666666666, ans=0.125 2023-10-02 05:22:23,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.29 vs. limit=15.0 2023-10-02 05:22:25,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:22:28,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:22:31,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:22:40,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:22:45,424 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.69 vs. limit=6.0 2023-10-02 05:22:49,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 05:22:50,473 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 1.957e+02 2.150e+02 2.421e+02 3.635e+02, threshold=4.301e+02, percent-clipped=0.0 2023-10-02 05:22:51,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:22:53,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 05:22:54,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:22:58,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:22:58,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:22:59,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:23:03,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 05:23:05,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 05:23:05,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=12.0 2023-10-02 05:23:06,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 05:23:09,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 05:23:13,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:23:14,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=765293.3333333334, ans=0.1 2023-10-02 05:23:18,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:23:18,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:23:19,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:23:20,046 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 05:23:20,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:23:25,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:23:26,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 05:23:26,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 05:23:28,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 05:23:29,515 INFO [train.py:1046] (3/4) Epoch 22, batch 3250, loss[loss=0.1816, simple_loss=0.2722, pruned_loss=0.04548, over 24272.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2486, pruned_loss=0.04776, over 4709114.20 frames. ], batch size: 74, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:23:29,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 05:23:29,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=765360.0, ans=0.2 2023-10-02 05:23:30,412 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.72 vs. limit=15.0 2023-10-02 05:23:31,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:23:33,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:23:33,878 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 05:23:33,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:23:33,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:35,363 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 05:23:39,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:23:39,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=765360.0, ans=0.0 2023-10-02 05:23:41,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:23:48,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:23:48,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 05:23:49,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:23:51,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:23:51,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:23:53,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:23:53,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:23:56,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:56,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:23:56,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:23:56,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:56,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:23:56,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:24:00,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:03,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:24:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:24:05,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:24:06,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:24:06,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:24:06,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:24:06,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=765493.3333333334, ans=0.1 2023-10-02 05:24:11,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 05:24:11,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:24:11,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:24:12,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:24:14,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:24:16,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=765560.0, ans=0.125 2023-10-02 05:24:19,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:24:21,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=765560.0, ans=0.0 2023-10-02 05:24:27,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:24:27,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:27,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 05:24:27,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:24:27,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 05:24:28,324 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.72 vs. limit=15.0 2023-10-02 05:24:29,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:29,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=765626.6666666666, ans=0.1 2023-10-02 05:24:31,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 05:24:31,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 05:24:31,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:24:33,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:24:33,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:24:34,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 05:24:35,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:24:39,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:24:39,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:24:40,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 05:24:40,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:24:43,112 INFO [train.py:1046] (3/4) Epoch 22, batch 3300, loss[loss=0.184, simple_loss=0.2625, pruned_loss=0.05271, over 23412.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2495, pruned_loss=0.04793, over 4716733.96 frames. ], batch size: 93, lr: 4.66e-03, grad_scale: 16.0 2023-10-02 05:24:43,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:24:43,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 05:24:45,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:24:47,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 05:24:48,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 05:24:50,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 05:24:51,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:24:54,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:24:55,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:24:56,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:24:58,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:24:58,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:24:58,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=765760.0, ans=0.2 2023-10-02 05:25:01,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:02,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:25:06,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 05:25:06,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:25:06,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:09,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:10,653 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 05:25:12,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:25:13,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:25:13,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:25:13,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:25:13,463 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 05:25:18,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:25:18,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:25:19,560 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.834e+02 2.092e+02 2.306e+02 3.229e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-02 05:25:19,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:19,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 05:25:19,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=765826.6666666666, ans=0.125 2023-10-02 05:25:21,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 05:25:21,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:23,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:25:24,472 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 05:25:26,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 05:25:26,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:25:27,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 05:25:31,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:25:32,053 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.65 vs. limit=10.0 2023-10-02 05:25:34,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 05:25:34,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:25:36,138 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.97 vs. limit=15.0 2023-10-02 05:25:38,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:25:38,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:38,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:25:39,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:25:42,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:25:42,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:43,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:25:46,455 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 05:25:46,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 05:25:47,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:25:48,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:25:48,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:25:49,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:25:49,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:25:51,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:25:51,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=765960.0, ans=0.2 2023-10-02 05:25:53,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:25:53,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:25:54,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:25:54,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:25:57,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 05:25:57,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:25:59,258 INFO [train.py:1046] (3/4) Epoch 22, batch 3350, loss[loss=0.1608, simple_loss=0.241, pruned_loss=0.04024, over 24635.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2503, pruned_loss=0.04788, over 4726872.77 frames. ], batch size: 65, lr: 4.66e-03, grad_scale: 8.0 2023-10-02 05:25:59,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:25:59,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=766026.6666666666, ans=0.2 2023-10-02 05:26:02,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:26:02,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:26:03,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:26:06,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:26:06,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:08,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:26:09,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:11,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:26:12,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:15,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:26:16,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:26:16,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:26:17,297 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.61 vs. limit=22.5 2023-10-02 05:26:18,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 05:26:20,231 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 05:26:20,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:26:24,157 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.70 vs. limit=22.5 2023-10-02 05:26:24,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 05:26:24,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 05:26:25,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:26:26,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:26:27,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:26:27,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 05:26:27,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:27,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:26:29,464 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:26:30,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:30,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:30,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:30,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=766160.0, ans=0.125 2023-10-02 05:26:32,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:26:35,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=766160.0, ans=0.125 2023-10-02 05:26:37,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:26:40,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:40,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:26:44,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:26:45,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:26:46,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:26:47,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:26:48,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:26:48,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=766226.6666666666, ans=0.1 2023-10-02 05:26:49,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 05:26:49,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:26:49,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 05:26:50,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=766226.6666666666, ans=0.04949747468305833 2023-10-02 05:26:51,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:26:51,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 05:26:52,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:26:53,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=766226.6666666666, ans=0.125 2023-10-02 05:26:55,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:27:00,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:27:02,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 05:27:02,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:27:02,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:27:04,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:27:07,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:27:10,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 05:27:10,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:27:10,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:27:11,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.48 vs. limit=22.5 2023-10-02 05:27:11,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:27:13,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 05:27:14,633 INFO [train.py:1046] (3/4) Epoch 22, batch 3400, loss[loss=0.1548, simple_loss=0.2426, pruned_loss=0.03354, over 24465.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2512, pruned_loss=0.04855, over 4725329.38 frames. ], batch size: 66, lr: 4.66e-03, grad_scale: 8.0 2023-10-02 05:27:14,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:27:14,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 05:27:16,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:27:18,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:27:18,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:27:19,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:27:19,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 05:27:25,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 05:27:25,429 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 05:27:25,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:27:28,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:27:28,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:27:29,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:27:31,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:27:37,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:27:39,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 05:27:43,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:27:46,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:27:46,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:27:47,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 05:27:49,894 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.70 vs. limit=22.5 2023-10-02 05:27:54,011 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.880e+02 2.089e+02 2.300e+02 3.158e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 05:27:54,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:27:55,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 05:28:01,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:28:02,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:28:02,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 05:28:02,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=766560.0, ans=0.1 2023-10-02 05:28:03,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:28:03,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:28:05,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:28:05,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:28:07,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:28:12,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:28:12,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:28:13,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=766626.6666666666, ans=0.0 2023-10-02 05:28:16,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:28:17,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 05:28:17,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=766626.6666666666, ans=0.125 2023-10-02 05:28:22,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:28:27,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 05:28:29,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=766693.3333333334, ans=0.125 2023-10-02 05:28:30,082 INFO [train.py:1046] (3/4) Epoch 22, batch 3450, loss[loss=0.1707, simple_loss=0.2324, pruned_loss=0.05449, over 23750.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2521, pruned_loss=0.04916, over 4721250.58 frames. ], batch size: 232, lr: 4.65e-03, grad_scale: 4.0 2023-10-02 05:28:30,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 05:28:30,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:28:33,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:28:33,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 05:28:33,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:28:36,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:28:43,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:28:43,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=766760.0, ans=0.0 2023-10-02 05:28:45,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:28:46,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:28:46,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:28:48,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:28:52,941 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.61 vs. limit=15.0 2023-10-02 05:28:54,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 05:28:59,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 05:28:59,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:28:59,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:29:02,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:29:07,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 05:29:08,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:29:11,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:29:11,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:29:13,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:29:15,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:29:16,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 05:29:16,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:29:18,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:29:20,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:29:23,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 05:29:29,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:29:34,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:29:35,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:29:36,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=766960.0, ans=0.1 2023-10-02 05:29:39,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:29:42,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:29:43,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:29:43,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:29:44,741 INFO [train.py:1046] (3/4) Epoch 22, batch 3500, loss[loss=0.1643, simple_loss=0.2265, pruned_loss=0.05108, over 22812.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2503, pruned_loss=0.04848, over 4722523.41 frames. ], batch size: 322, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:29:45,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:29:49,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:29:52,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:29:53,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 05:29:55,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:29:57,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 05:29:59,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:29:59,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 05:30:02,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=767093.3333333334, ans=0.125 2023-10-02 05:30:04,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:30:04,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:30:06,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:30:06,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:30:07,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:30:07,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:09,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:30:09,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 05:30:12,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:13,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 05:30:15,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:30:18,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:19,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 05:30:19,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:30:20,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:30:23,758 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.866e+02 2.009e+02 2.229e+02 3.626e+02, threshold=4.019e+02, percent-clipped=0.0 2023-10-02 05:30:23,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:30:25,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:26,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:30:26,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:30:27,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 05:30:28,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 05:30:30,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 05:30:31,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:30:32,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:32,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:30:32,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:30:36,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 05:30:37,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:30:39,764 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.85 vs. limit=15.0 2023-10-02 05:30:41,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:30:43,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 05:30:43,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 05:30:43,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:30:46,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:30:47,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:30:49,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:50,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 05:30:50,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:30:52,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:30:53,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 05:30:54,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 05:30:56,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:30:57,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:30:57,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:30:57,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:30:59,167 INFO [train.py:1046] (3/4) Epoch 22, batch 3550, loss[loss=0.1724, simple_loss=0.2383, pruned_loss=0.0533, over 23476.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2481, pruned_loss=0.04811, over 4709768.78 frames. ], batch size: 285, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:30:59,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=767360.0, ans=0.125 2023-10-02 05:31:01,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:31:10,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:31:10,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 05:31:14,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:31:14,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:31:17,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:31:17,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:31:18,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:31:21,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:31:22,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:31:22,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:31:24,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 05:31:24,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:31:30,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:31:31,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:31:31,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:31:31,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:31:33,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:31:33,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 05:31:33,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:31:34,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:31:35,704 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.56 vs. limit=12.0 2023-10-02 05:31:36,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 05:31:41,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=767493.3333333334, ans=0.0 2023-10-02 05:31:42,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:31:42,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:31:44,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:31:46,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 05:31:47,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:31:48,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 05:31:48,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:31:49,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:31:49,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:31:54,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 05:31:55,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:31:55,889 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.77 vs. limit=15.0 2023-10-02 05:31:57,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=767626.6666666666, ans=0.125 2023-10-02 05:31:57,690 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.28 vs. limit=22.5 2023-10-02 05:31:59,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:32:00,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 05:32:02,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:32:06,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:32:07,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 05:32:13,852 INFO [train.py:1046] (3/4) Epoch 22, batch 3600, loss[loss=0.1838, simple_loss=0.2574, pruned_loss=0.05516, over 23738.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2481, pruned_loss=0.04822, over 4704717.97 frames. ], batch size: 232, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:32:15,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 05:32:15,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:32:15,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:32:16,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:32:18,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:32:19,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:32:22,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:32:23,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:25,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:32:25,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:32:25,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:26,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 05:32:28,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:32:29,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:31,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:32:34,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:32:35,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:32:35,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:32:35,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 05:32:36,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=767760.0, ans=0.125 2023-10-02 05:32:37,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:32:40,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:32:41,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:32:43,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:32:46,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:32:47,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:32:48,871 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.26 vs. limit=15.0 2023-10-02 05:32:49,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 05:32:52,059 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.859e+02 2.035e+02 2.316e+02 3.119e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-02 05:32:54,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:32:56,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:32:56,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 05:33:00,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:33:07,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:33:08,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:33:12,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:33:12,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:33:12,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 05:33:13,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 05:33:15,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 05:33:16,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=767960.0, ans=0.125 2023-10-02 05:33:18,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:33:18,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:33:18,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 05:33:19,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:33:19,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:33:19,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:33:20,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 05:33:23,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 05:33:25,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:33:25,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 05:33:28,013 INFO [train.py:1046] (3/4) Epoch 22, batch 3650, loss[loss=0.1543, simple_loss=0.2316, pruned_loss=0.03851, over 24441.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2492, pruned_loss=0.0483, over 4717148.29 frames. ], batch size: 58, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:33:30,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 05:33:31,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:33:36,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 05:33:38,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 05:33:38,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=768026.6666666666, ans=0.0 2023-10-02 05:33:43,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:33:43,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:33:43,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=768093.3333333334, ans=0.125 2023-10-02 05:33:44,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:33:48,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 05:33:48,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:33:49,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 05:33:50,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:33:51,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:33:51,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 05:33:51,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=768093.3333333334, ans=0.1 2023-10-02 05:33:52,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:33:52,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:33:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:33:55,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:33:57,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 05:33:59,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 05:34:00,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:34:01,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 05:34:03,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:34:03,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:34:09,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:34:10,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:34:10,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:34:12,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:34:14,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:34:16,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:34:19,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:34:20,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:34:20,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:34:23,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:34:24,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:34:24,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:34:30,007 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 05:34:32,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:34:32,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:34:35,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:34:35,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:34:37,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:34:38,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:34:40,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 05:34:40,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:34:42,055 INFO [train.py:1046] (3/4) Epoch 22, batch 3700, loss[loss=0.1673, simple_loss=0.2492, pruned_loss=0.04274, over 24676.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2503, pruned_loss=0.04815, over 4723345.02 frames. ], batch size: 73, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:34:42,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:34:42,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=768360.0, ans=0.2 2023-10-02 05:34:42,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=768360.0, ans=0.0 2023-10-02 05:34:43,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=768360.0, ans=0.07 2023-10-02 05:34:44,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:34:46,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:34:49,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:34:49,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 05:34:49,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:34:50,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:34:51,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:34:56,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:34:57,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:34:57,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:34:59,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:35:00,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:35:00,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:35:03,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:35:04,652 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 05:35:05,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=768426.6666666666, ans=0.0 2023-10-02 05:35:11,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:35:12,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=768493.3333333334, ans=0.125 2023-10-02 05:35:13,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:35:14,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:35:14,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 05:35:14,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:35:18,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:18,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 05:35:19,397 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.07 vs. limit=15.0 2023-10-02 05:35:20,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:21,396 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.818e+02 2.091e+02 2.478e+02 3.925e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 05:35:21,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:35:24,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:24,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:35:27,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 05:35:27,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=768560.0, ans=0.2 2023-10-02 05:35:31,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:35:31,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 05:35:33,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:35:33,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 05:35:34,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=768560.0, ans=0.2 2023-10-02 05:35:37,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:35:37,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:35:39,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:35:39,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 05:35:43,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:35:43,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:35:43,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:35:44,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:35:48,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:35:49,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 05:35:49,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 05:35:49,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=768626.6666666666, ans=0.125 2023-10-02 05:35:50,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:35:50,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:35:50,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:35:52,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:35:54,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=768626.6666666666, ans=0.07 2023-10-02 05:35:55,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:35:56,816 INFO [train.py:1046] (3/4) Epoch 22, batch 3750, loss[loss=0.2483, simple_loss=0.3119, pruned_loss=0.09238, over 19472.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2514, pruned_loss=0.04883, over 4714408.92 frames. ], batch size: 388, lr: 4.65e-03, grad_scale: 16.0 2023-10-02 05:35:56,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:35:58,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:36:01,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 05:36:02,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 05:36:03,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 05:36:05,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 05:36:05,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:36:06,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:36:06,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:36:09,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:36:13,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:36:16,192 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.61 vs. limit=12.0 2023-10-02 05:36:16,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:36:18,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:36:20,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:36:25,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:36:25,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 05:36:25,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=768826.6666666666, ans=0.125 2023-10-02 05:36:26,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:36:28,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:36:28,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:36:30,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 05:36:31,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=768826.6666666666, ans=0.125 2023-10-02 05:36:33,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 05:36:34,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:36:36,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:36:39,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:36:43,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:36:45,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 05:36:45,402 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:36:48,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 05:36:50,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:36:54,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:36:54,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:36:58,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:37:01,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=768960.0, ans=0.1 2023-10-02 05:37:02,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 05:37:03,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:37:03,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=768960.0, ans=0.0 2023-10-02 05:37:05,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:37:06,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:37:09,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:37:10,679 INFO [train.py:1046] (3/4) Epoch 22, batch 3800, loss[loss=0.1546, simple_loss=0.2361, pruned_loss=0.03659, over 24572.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2516, pruned_loss=0.04919, over 4718961.68 frames. ], batch size: 60, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:37:16,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:37:19,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:37:21,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 05:37:22,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 05:37:24,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:37:25,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:37:26,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 05:37:29,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 05:37:29,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:37:29,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:37:30,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:37:30,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:37:31,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:37:33,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 05:37:36,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 05:37:36,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:37:37,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=769093.3333333334, ans=0.125 2023-10-02 05:37:38,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:37:41,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:37:41,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:37:43,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:37:43,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:37:46,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:37:46,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:37:51,711 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.862e+02 2.088e+02 2.377e+02 3.435e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-02 05:37:53,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:37:53,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 05:37:53,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:38:01,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:38:01,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=769226.6666666666, ans=0.125 2023-10-02 05:38:05,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:38:06,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 05:38:08,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 05:38:08,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:38:10,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:38:12,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:12,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 05:38:15,136 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.67 vs. limit=15.0 2023-10-02 05:38:17,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 05:38:17,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 05:38:17,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:17,599 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.04 vs. limit=22.5 2023-10-02 05:38:20,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:38:27,287 INFO [train.py:1046] (3/4) Epoch 22, batch 3850, loss[loss=0.1799, simple_loss=0.2698, pruned_loss=0.04495, over 24554.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2511, pruned_loss=0.04879, over 4717808.24 frames. ], batch size: 71, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:38:27,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:38:27,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:38:31,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:38:31,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 05:38:33,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:38:33,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:35,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:38:38,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:38:40,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 05:38:41,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 05:38:47,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:38:50,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:38:52,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:38:53,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:38:54,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:38:55,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:38:55,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:38:55,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:38:57,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:38:58,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:38:58,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:00,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:39:00,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 05:39:00,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 05:39:00,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=769493.3333333334, ans=0.125 2023-10-02 05:39:01,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:39:01,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:04,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:04,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:05,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 05:39:07,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 05:39:09,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:11,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 05:39:13,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 05:39:19,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:20,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:39:23,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:25,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 05:39:27,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 05:39:29,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:31,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:31,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=769626.6666666666, ans=0.0 2023-10-02 05:39:32,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:39:32,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 05:39:33,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:33,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:33,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:39:33,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 05:39:35,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:39:36,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 05:39:36,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:36,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:39,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:39:40,608 INFO [train.py:1046] (3/4) Epoch 22, batch 3900, loss[loss=0.1929, simple_loss=0.2554, pruned_loss=0.06519, over 23775.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2493, pruned_loss=0.04818, over 4718882.98 frames. ], batch size: 164, lr: 4.65e-03, grad_scale: 8.0 2023-10-02 05:39:40,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:41,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:39:42,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:39:42,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:39:42,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:39:43,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 05:39:44,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:44,871 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.57 vs. limit=6.0 2023-10-02 05:39:48,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:39:49,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:39:49,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:39:51,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:39:52,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:39:53,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:39:55,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:39:57,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 05:39:57,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:39:59,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 05:39:59,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:40:00,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 05:40:02,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 05:40:04,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:40:06,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:40:06,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:40:07,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:40:13,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:40:14,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:40:16,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=769826.6666666666, ans=0.125 2023-10-02 05:40:19,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:40:19,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:40:20,553 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.829e+02 1.934e+02 2.273e+02 3.864e+02, threshold=3.868e+02, percent-clipped=0.0 2023-10-02 05:40:20,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:40:22,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=769826.6666666666, ans=0.0 2023-10-02 05:40:22,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=769826.6666666666, ans=0.0 2023-10-02 05:40:23,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=769893.3333333334, ans=0.1 2023-10-02 05:40:25,488 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.10 vs. limit=15.0 2023-10-02 05:40:26,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:40:26,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:40:33,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 05:40:34,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:40:43,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:40:45,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:40:45,574 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.99 vs. limit=12.0 2023-10-02 05:40:46,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 05:40:46,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 05:40:48,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 05:40:48,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 05:40:50,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:40:51,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 05:40:53,614 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.92 vs. limit=6.0 2023-10-02 05:40:54,155 INFO [train.py:1046] (3/4) Epoch 22, batch 3950, loss[loss=0.1696, simple_loss=0.2534, pruned_loss=0.04291, over 24631.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2491, pruned_loss=0.04797, over 4712447.88 frames. ], batch size: 68, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:40:56,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:40:58,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 05:40:58,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:41:01,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:41:04,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:41:10,440 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 05:41:10,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:41:10,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 05:41:11,784 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 05:41:11,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:41:14,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.49 vs. limit=15.0 2023-10-02 05:41:14,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:41:14,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:41:14,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:41:17,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 05:41:19,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:41:21,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:41:21,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:41:21,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:41:22,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 05:41:31,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:41:33,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:41:38,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 05:41:43,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 05:41:43,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 05:41:43,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:41:44,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:41:48,542 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.29 vs. limit=15.0 2023-10-02 05:41:52,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:41:52,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:41:54,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:41:54,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:41:54,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 05:41:58,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:42:00,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:42:00,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=770293.3333333334, ans=0.5 2023-10-02 05:42:04,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 05:42:09,562 INFO [train.py:1046] (3/4) Epoch 22, batch 4000, loss[loss=0.1783, simple_loss=0.2435, pruned_loss=0.05654, over 23458.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2501, pruned_loss=0.04863, over 4714699.15 frames. ], batch size: 285, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:42:10,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn2.whiten.whitening_limit, batch_count=770360.0, ans=22.5 2023-10-02 05:42:14,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:42:14,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=770360.0, ans=0.125 2023-10-02 05:42:20,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:42:21,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=770360.0, ans=0.2 2023-10-02 05:42:26,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:42:26,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:42:27,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:42:27,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 05:42:27,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:42:28,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 05:42:28,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:42:28,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 05:42:32,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:42:34,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:42:34,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:42:34,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:42:34,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:42:34,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 05:42:36,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:42:38,244 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 05:42:38,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:42:39,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:42:41,064 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 05:42:42,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:42:42,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:42:42,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=770493.3333333334, ans=0.125 2023-10-02 05:42:47,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 05:42:48,562 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.44 vs. limit=15.0 2023-10-02 05:42:48,840 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.409e+02 1.823e+02 2.125e+02 2.454e+02 3.151e+02, threshold=4.250e+02, percent-clipped=0.0 2023-10-02 05:42:48,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:42:53,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:42:54,726 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 05:42:55,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=770560.0, ans=0.0 2023-10-02 05:42:55,538 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.22 vs. limit=15.0 2023-10-02 05:42:56,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:42:58,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 05:42:58,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:42:58,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=770560.0, ans=0.0 2023-10-02 05:42:59,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:42:59,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:43:00,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:43:00,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:43:02,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:43:02,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 05:43:02,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:43:04,066 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 05:43:04,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=770560.0, ans=0.125 2023-10-02 05:43:08,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:43:11,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=770626.6666666666, ans=0.0 2023-10-02 05:43:12,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 05:43:14,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:43:14,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:43:15,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.78 vs. limit=22.5 2023-10-02 05:43:15,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:43:15,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:43:17,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=770626.6666666666, ans=0.0 2023-10-02 05:43:20,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:43:22,909 INFO [train.py:1046] (3/4) Epoch 22, batch 4050, loss[loss=0.2012, simple_loss=0.2649, pruned_loss=0.06876, over 22750.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.251, pruned_loss=0.04897, over 4714546.63 frames. ], batch size: 322, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:43:24,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 05:43:26,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 05:43:26,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:43:28,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:43:28,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:43:29,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:43:29,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:43:32,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=770693.3333333334, ans=0.0 2023-10-02 05:43:34,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:43:38,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:43:39,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 05:43:40,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.37 vs. limit=15.0 2023-10-02 05:43:41,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:43:43,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:43:44,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=770760.0, ans=0.1 2023-10-02 05:43:46,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:43:46,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:43:49,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 05:43:50,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 05:43:50,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=770760.0, ans=0.125 2023-10-02 05:43:51,794 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 05:43:54,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:43:59,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=770826.6666666666, ans=0.0 2023-10-02 05:44:02,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 05:44:03,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:44:06,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=770893.3333333334, ans=0.125 2023-10-02 05:44:08,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:44:11,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:44:11,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:44:11,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=770893.3333333334, ans=0.0 2023-10-02 05:44:12,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:44:16,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:44:16,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=770893.3333333334, ans=0.125 2023-10-02 05:44:17,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 05:44:17,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:44:19,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:44:20,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 05:44:24,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:44:30,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 05:44:32,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:44:32,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:44:35,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 05:44:35,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 05:44:35,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:44:38,151 INFO [train.py:1046] (3/4) Epoch 22, batch 4100, loss[loss=0.1476, simple_loss=0.2217, pruned_loss=0.03672, over 24312.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2511, pruned_loss=0.04867, over 4730727.74 frames. ], batch size: 56, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:44:38,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:44:39,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:44:39,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:44:40,438 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.60 vs. limit=22.5 2023-10-02 05:44:47,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 05:44:48,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 05:44:50,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 05:44:51,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 05:44:51,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:44:51,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:44:53,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:44:53,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:44:54,572 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 05:44:57,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:44:57,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:44:57,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:44:57,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=771093.3333333334, ans=0.025 2023-10-02 05:44:57,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=771093.3333333334, ans=0.09899494936611666 2023-10-02 05:44:58,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:45:01,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:45:03,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:45:03,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:45:05,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 05:45:05,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:45:05,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:45:05,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:45:05,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:45:06,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 05:45:09,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:45:09,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 05:45:11,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:45:12,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:45:12,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 05:45:14,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:45:16,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:45:16,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=771160.0, ans=0.125 2023-10-02 05:45:17,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:45:17,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=771160.0, ans=0.0 2023-10-02 05:45:18,691 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.910e+02 2.108e+02 2.331e+02 3.295e+02, threshold=4.216e+02, percent-clipped=0.0 2023-10-02 05:45:18,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 05:45:20,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:45:21,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:45:23,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 05:45:23,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=771226.6666666666, ans=0.2 2023-10-02 05:45:24,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:45:24,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:45:27,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:45:30,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=771226.6666666666, ans=0.125 2023-10-02 05:45:30,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771226.6666666666, ans=0.1 2023-10-02 05:45:34,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:45:37,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:45:39,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:45:46,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:45:46,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:45:46,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=771293.3333333334, ans=0.0 2023-10-02 05:45:51,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:45:51,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:45:52,868 INFO [train.py:1046] (3/4) Epoch 22, batch 4150, loss[loss=0.1631, simple_loss=0.2289, pruned_loss=0.04864, over 23363.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2512, pruned_loss=0.04895, over 4716025.63 frames. ], batch size: 134, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:45:54,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:45:54,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=771360.0, ans=0.125 2023-10-02 05:45:55,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:45:55,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:45:55,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:45:58,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 05:45:58,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:45:58,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=771360.0, ans=0.09899494936611666 2023-10-02 05:46:00,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 05:46:00,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 05:46:00,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 05:46:03,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:46:03,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=771360.0, ans=0.07 2023-10-02 05:46:06,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:46:06,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:46:08,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771426.6666666666, ans=0.1 2023-10-02 05:46:10,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:46:12,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:46:14,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:46:14,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 05:46:14,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:46:16,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 05:46:20,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:46:23,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:46:24,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 05:46:26,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 05:46:26,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:46:28,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 05:46:28,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:46:28,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:46:32,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:33,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:46:36,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 05:46:39,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:46:41,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:46:41,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 05:46:43,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:46:44,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 05:46:46,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:46:47,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:46:49,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:50,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 05:46:50,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:46:50,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 05:46:50,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=771560.0, ans=0.125 2023-10-02 05:46:52,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 05:46:54,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 05:46:56,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:56,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 05:46:56,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 05:46:56,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 05:46:56,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:46:56,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 05:46:57,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:46:59,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:46:59,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 05:47:00,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 05:47:06,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:47:06,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 05:47:08,513 INFO [train.py:1046] (3/4) Epoch 22, batch 4200, loss[loss=0.1665, simple_loss=0.2183, pruned_loss=0.05734, over 22594.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2493, pruned_loss=0.04854, over 4712835.09 frames. ], batch size: 322, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:47:09,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:47:12,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:47:12,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:47:14,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:47:14,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:47:17,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 05:47:20,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 05:47:20,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:23,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:47:26,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:47:28,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 05:47:28,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:47:28,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:30,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 05:47:30,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:47:32,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:33,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:47:33,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:47:34,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:47:37,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 05:47:39,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:47:43,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 05:47:43,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:47:46,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:47:47,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:47:47,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=771826.6666666666, ans=0.0 2023-10-02 05:47:47,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=771826.6666666666, ans=0.0 2023-10-02 05:47:50,521 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.871e+02 2.019e+02 2.222e+02 3.341e+02, threshold=4.039e+02, percent-clipped=0.0 2023-10-02 05:47:50,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:47:50,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 05:47:50,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:47:52,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:47:56,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=771893.3333333334, ans=0.2 2023-10-02 05:47:57,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 05:47:59,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:48:05,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:48:07,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 05:48:09,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:48:14,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 05:48:14,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:48:17,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 05:48:20,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 05:48:24,006 INFO [train.py:1046] (3/4) Epoch 22, batch 4250, loss[loss=0.16, simple_loss=0.2382, pruned_loss=0.0409, over 23494.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2499, pruned_loss=0.04831, over 4734990.49 frames. ], batch size: 134, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:48:25,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 05:48:26,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 05:48:29,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:48:33,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 05:48:33,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 05:48:33,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=772026.6666666666, ans=0.2 2023-10-02 05:48:34,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:48:37,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:48:38,824 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:48:39,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:48:43,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:48:44,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:48:46,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:48:46,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:48:47,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:48:48,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:48:50,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:48:52,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:48:54,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:48:54,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=772160.0, ans=0.125 2023-10-02 05:48:55,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 05:48:59,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 05:48:59,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:48:59,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:48:59,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:49:01,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:49:01,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:49:01,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:49:05,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 05:49:07,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 05:49:10,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:49:11,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:49:12,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 05:49:12,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:49:14,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 05:49:15,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=772226.6666666666, ans=0.125 2023-10-02 05:49:16,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:49:17,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:49:17,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=772226.6666666666, ans=0.125 2023-10-02 05:49:20,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:49:20,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:49:20,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=772226.6666666666, ans=0.0 2023-10-02 05:49:22,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 05:49:24,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 05:49:24,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:49:28,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:49:31,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:49:31,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:49:32,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:49:34,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:49:36,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:49:36,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:49:36,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 05:49:37,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:49:38,945 INFO [train.py:1046] (3/4) Epoch 22, batch 4300, loss[loss=0.1953, simple_loss=0.2681, pruned_loss=0.06123, over 23870.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2502, pruned_loss=0.04872, over 4727940.81 frames. ], batch size: 86, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:49:43,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:49:44,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:49:47,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:49:55,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=772426.6666666666, ans=0.1 2023-10-02 05:49:56,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:49:56,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 05:49:58,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:49:59,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:49:59,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 05:49:59,584 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 05:50:03,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 05:50:05,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:50:09,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 05:50:09,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:50:09,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 05:50:12,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 05:50:12,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:50:14,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:50:14,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:50:16,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:50:18,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:50:18,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:50:19,957 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.860e+02 2.141e+02 2.356e+02 3.803e+02, threshold=4.281e+02, percent-clipped=0.0 2023-10-02 05:50:20,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 05:50:20,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 05:50:21,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:50:21,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=772560.0, ans=0.0 2023-10-02 05:50:24,459 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.88 vs. limit=15.0 2023-10-02 05:50:24,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:24,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 05:50:24,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:24,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:50:24,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 05:50:24,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 05:50:26,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 05:50:28,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:50:28,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 05:50:28,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 05:50:32,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:50:33,643 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 05:50:33,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:50:36,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:50:36,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:50:39,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 05:50:39,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 05:50:39,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:40,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:50:40,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:50:41,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:50:43,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:50:45,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:50:46,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:50:47,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:50:51,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 05:50:52,857 INFO [train.py:1046] (3/4) Epoch 22, batch 4350, loss[loss=0.1832, simple_loss=0.2582, pruned_loss=0.05414, over 23310.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.251, pruned_loss=0.04871, over 4732214.31 frames. ], batch size: 105, lr: 4.64e-03, grad_scale: 8.0 2023-10-02 05:50:52,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 05:50:58,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:51:00,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:51:02,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 05:51:02,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:51:02,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=772693.3333333334, ans=0.0 2023-10-02 05:51:06,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 05:51:09,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:51:11,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:51:11,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:51:15,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:51:17,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:51:18,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:51:23,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 05:51:25,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:51:25,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:31,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:34,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 05:51:35,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:51:37,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:51:42,010 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 05:51:42,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=772893.3333333334, ans=0.0 2023-10-02 05:51:43,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:51:43,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:51:44,868 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 05:51:46,196 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 05:51:46,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:51:46,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:51:46,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:51:48,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:51:49,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:51:49,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:51:52,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 05:51:52,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:52,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:51:54,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:51:54,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 05:51:57,052 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 05:51:57,056 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 05:51:57,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 05:52:00,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:52:00,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:52:00,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:01,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:52:03,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 05:52:06,150 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 05:52:06,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:08,850 INFO [train.py:1046] (3/4) Epoch 22, batch 4400, loss[loss=0.1855, simple_loss=0.261, pruned_loss=0.05499, over 23678.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2514, pruned_loss=0.04925, over 4716427.15 frames. ], batch size: 85, lr: 4.64e-03, grad_scale: 16.0 2023-10-02 05:52:08,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:52:08,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:13,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:52:14,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 05:52:14,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 05:52:16,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 05:52:16,370 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 05:52:17,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 05:52:17,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:52:21,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 05:52:22,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=773093.3333333334, ans=0.2 2023-10-02 05:52:24,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:25,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:25,336 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 05:52:25,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=773093.3333333334, ans=0.125 2023-10-02 05:52:26,413 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.68 vs. limit=8.0 2023-10-02 05:52:26,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:26,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 05:52:26,898 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 05:52:28,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=773093.3333333334, ans=0.0 2023-10-02 05:52:30,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 05:52:31,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 05:52:31,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 05:52:31,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:33,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:52:33,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:52:36,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:52:36,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 05:52:36,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 05:52:36,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=773093.3333333334, ans=0.125 2023-10-02 05:52:37,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:39,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 05:52:39,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:52:40,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:42,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:52:42,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 05:52:42,188 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 05:52:44,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:52:49,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=773160.0, ans=0.125 2023-10-02 05:52:50,694 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.791e+02 2.025e+02 2.337e+02 3.385e+02, threshold=4.050e+02, percent-clipped=0.0 2023-10-02 05:52:52,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:52:55,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 05:52:59,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:53:01,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:53:03,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:53:05,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 05:53:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 05:53:05,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:53:05,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 05:53:06,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:53:10,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 05:53:13,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 05:53:14,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 05:53:14,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:53:14,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 05:53:16,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:53:21,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:53:24,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 05:53:24,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=773360.0, ans=0.0 2023-10-02 05:53:25,749 INFO [train.py:1046] (3/4) Epoch 22, batch 4450, loss[loss=0.1944, simple_loss=0.2665, pruned_loss=0.06117, over 23231.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2513, pruned_loss=0.04896, over 4715933.51 frames. ], batch size: 93, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:53:29,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:53:30,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:53:32,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 05:53:37,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:53:37,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:53:40,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:53:43,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:53:44,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:53:44,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:53:47,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 05:53:47,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:53:47,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:53:47,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:53:47,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 05:53:50,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 05:53:57,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:53:57,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:53:58,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:54:00,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:54:01,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:54:05,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 05:54:07,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 05:54:07,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 05:54:07,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:54:10,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:54:11,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 05:54:11,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=773560.0, ans=0.125 2023-10-02 05:54:15,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 05:54:18,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:54:18,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 05:54:18,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:54:18,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:54:20,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:54:20,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:54:21,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:54:24,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 05:54:24,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 05:54:26,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 05:54:27,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:54:29,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:54:30,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:54:30,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 05:54:34,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 05:54:36,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 05:54:38,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:54:41,145 INFO [train.py:1046] (3/4) Epoch 22, batch 4500, loss[loss=0.1766, simple_loss=0.2642, pruned_loss=0.04452, over 24341.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2517, pruned_loss=0.04995, over 4705110.51 frames. ], batch size: 74, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:54:42,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:54:42,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 05:54:42,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 05:54:45,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:54:50,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:54:51,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:54:51,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 05:54:53,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:54:54,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:54:54,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:55:04,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=773760.0, ans=0.05 2023-10-02 05:55:07,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=773760.0, ans=0.125 2023-10-02 05:55:08,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:55:10,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:55:12,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:55:12,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 05:55:14,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:55:18,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:55:22,910 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.844e+02 2.070e+02 2.423e+02 3.586e+02, threshold=4.140e+02, percent-clipped=0.0 2023-10-02 05:55:24,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 05:55:27,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:55:30,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 05:55:30,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 05:55:31,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:55:32,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:55:33,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:55:34,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:55:35,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:55:35,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 05:55:35,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 05:55:35,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:55:36,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=773893.3333333334, ans=0.0 2023-10-02 05:55:40,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:55:40,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 05:55:43,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:55:44,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=773960.0, ans=0.2 2023-10-02 05:55:47,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 05:55:47,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:55:48,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 05:55:50,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 05:55:50,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 05:55:53,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 05:55:55,886 INFO [train.py:1046] (3/4) Epoch 22, batch 4550, loss[loss=0.1488, simple_loss=0.2297, pruned_loss=0.03395, over 24586.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2506, pruned_loss=0.04941, over 4704415.70 frames. ], batch size: 60, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:55:56,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 05:55:57,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:56:00,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:56:01,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:56:05,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:56:09,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=774026.6666666666, ans=0.2 2023-10-02 05:56:11,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:56:13,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:56:15,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:56:15,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:56:15,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:17,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:56:17,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:56:20,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:56:23,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 05:56:23,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 05:56:23,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 05:56:24,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 05:56:29,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 05:56:29,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:56:32,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 05:56:34,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:56:34,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=774160.0, ans=0.2 2023-10-02 05:56:35,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:35,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:35,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 05:56:39,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 05:56:41,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:56:44,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:44,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 05:56:45,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:56:47,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 05:56:47,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 05:56:47,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 05:56:48,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 05:56:48,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 05:56:50,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:56:52,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:56:52,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:56:53,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:56:53,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:56:55,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 05:56:56,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 05:56:56,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:56:56,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 05:56:58,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 05:56:58,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 05:56:58,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 05:57:02,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 05:57:02,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:57:04,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:57:05,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:57:05,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 05:57:07,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:57:09,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 05:57:11,873 INFO [train.py:1046] (3/4) Epoch 22, batch 4600, loss[loss=0.1684, simple_loss=0.2417, pruned_loss=0.0476, over 24476.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2489, pruned_loss=0.0489, over 4713334.98 frames. ], batch size: 58, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:57:11,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:12,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:57:14,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 05:57:14,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 05:57:14,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:57:16,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 05:57:18,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 05:57:23,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:57:24,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:57:26,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=774426.6666666666, ans=0.125 2023-10-02 05:57:29,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:36,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 05:57:37,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:41,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:44,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:57:44,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:57:49,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 05:57:49,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 05:57:50,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:57:53,314 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.872e+02 2.041e+02 2.269e+02 3.286e+02, threshold=4.083e+02, percent-clipped=0.0 2023-10-02 05:57:56,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:57:56,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 05:57:56,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=774560.0, ans=0.0 2023-10-02 05:57:57,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 05:58:01,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 05:58:04,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 05:58:09,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:10,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:58:12,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=774626.6666666666, ans=0.1 2023-10-02 05:58:13,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:13,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 05:58:13,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:58:13,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 05:58:14,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:15,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:58:15,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:16,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:58:17,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:58:17,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 05:58:18,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 05:58:19,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 05:58:19,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:58:19,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.20 vs. limit=15.0 2023-10-02 05:58:20,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:58:21,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:58:22,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:58:26,300 INFO [train.py:1046] (3/4) Epoch 22, batch 4650, loss[loss=0.1765, simple_loss=0.2217, pruned_loss=0.06561, over 19146.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2484, pruned_loss=0.04831, over 4718084.95 frames. ], batch size: 388, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:58:26,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=774693.3333333334, ans=0.0 2023-10-02 05:58:29,622 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:58:31,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 05:58:31,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=774693.3333333334, ans=0.0 2023-10-02 05:58:32,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:58:34,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:58:34,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:58:34,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:58:34,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:58:37,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 05:58:40,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 05:58:44,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 05:58:47,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 05:58:47,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 05:58:49,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 05:58:49,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 05:58:49,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 05:58:49,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 05:58:49,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:58:50,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 05:58:53,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 05:58:54,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:58:54,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 05:58:57,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:59:00,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 05:59:01,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:59:01,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 05:59:03,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 05:59:05,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:59:06,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=774826.6666666666, ans=0.125 2023-10-02 05:59:08,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 05:59:11,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:59:11,972 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.66 vs. limit=15.0 2023-10-02 05:59:13,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=774893.3333333334, ans=0.0 2023-10-02 05:59:13,676 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.57 vs. limit=15.0 2023-10-02 05:59:15,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:59:19,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:59:20,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 05:59:20,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 05:59:23,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 05:59:23,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 05:59:23,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 05:59:23,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 05:59:24,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:59:31,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 05:59:31,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:59:33,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 05:59:33,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:59:33,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=774960.0, ans=15.0 2023-10-02 05:59:34,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:59:34,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 05:59:34,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=774960.0, ans=0.2 2023-10-02 05:59:34,952 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.80 vs. limit=15.0 2023-10-02 05:59:35,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 05:59:39,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 05:59:39,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 05:59:40,747 INFO [train.py:1046] (3/4) Epoch 22, batch 4700, loss[loss=0.1669, simple_loss=0.2432, pruned_loss=0.04529, over 23610.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2481, pruned_loss=0.04788, over 4719159.77 frames. ], batch size: 135, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 05:59:40,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 05:59:43,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:59:43,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 05:59:43,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 05:59:45,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 05:59:45,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 05:59:45,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=775026.6666666666, ans=0.0 2023-10-02 05:59:46,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 05:59:48,440 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 05:59:49,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=775026.6666666666, ans=0.125 2023-10-02 05:59:53,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 05:59:55,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 05:59:55,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 05:59:56,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 05:59:58,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:00:02,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 06:00:02,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 06:00:05,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:00:06,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:00:06,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:00:11,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=775160.0, ans=0.0 2023-10-02 06:00:12,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:00:17,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:00:19,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 06:00:20,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=775160.0, ans=0.0 2023-10-02 06:00:21,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:00:22,598 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.844e+02 2.076e+02 2.631e+02 3.750e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 06:00:22,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=775160.0, ans=0.0 2023-10-02 06:00:26,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 06:00:26,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:00:29,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:32,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=775226.6666666666, ans=0.125 2023-10-02 06:00:33,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 06:00:35,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:00:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:00:40,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 06:00:42,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:42,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:00:46,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:00:46,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:00:46,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 06:00:47,603 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 06:00:48,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:00:52,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:52,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:52,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 06:00:52,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:00:54,894 INFO [train.py:1046] (3/4) Epoch 22, batch 4750, loss[loss=0.1799, simple_loss=0.2467, pruned_loss=0.05659, over 23717.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2486, pruned_loss=0.048, over 4726284.53 frames. ], batch size: 232, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 06:00:56,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 06:00:59,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:01:00,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:00,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=775360.0, ans=0.125 2023-10-02 06:01:03,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:03,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:01:05,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 06:01:05,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:01:09,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 06:01:09,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:01:09,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:01:11,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:01:12,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=775426.6666666666, ans=0.125 2023-10-02 06:01:13,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=775426.6666666666, ans=0.2 2023-10-02 06:01:16,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=775426.6666666666, ans=0.125 2023-10-02 06:01:17,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 06:01:20,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:01:24,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 06:01:24,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:01:24,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=775493.3333333334, ans=0.125 2023-10-02 06:01:26,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:01:26,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:01:26,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:29,476 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 06:01:29,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 06:01:32,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 06:01:33,222 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.69 vs. limit=10.0 2023-10-02 06:01:33,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:01:37,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:01:38,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:01:38,642 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 06:01:38,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:01:40,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=775560.0, ans=0.0 2023-10-02 06:01:41,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:01:46,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:01:48,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 06:01:48,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 06:01:50,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:01:50,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:01:52,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:01:52,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 06:01:52,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 06:01:55,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 06:01:57,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:01:59,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:01:59,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 06:01:59,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:01:59,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=775626.6666666666, ans=0.0 2023-10-02 06:02:00,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:02,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:02:02,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:03,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:02:06,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:02:08,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 06:02:08,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 06:02:09,637 INFO [train.py:1046] (3/4) Epoch 22, batch 4800, loss[loss=0.1797, simple_loss=0.2528, pruned_loss=0.0533, over 23680.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2498, pruned_loss=0.04857, over 4706141.30 frames. ], batch size: 149, lr: 4.63e-03, grad_scale: 32.0 2023-10-02 06:02:09,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 06:02:12,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:02:14,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:02:15,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 06:02:20,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:20,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:25,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:02:27,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:02:27,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:27,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 06:02:28,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:02:28,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:02:29,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:02:34,628 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.24 vs. limit=15.0 2023-10-02 06:02:35,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:02:36,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:36,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:02:37,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=775760.0, ans=0.0 2023-10-02 06:02:38,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:38,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 06:02:38,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:40,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:02:41,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:02:44,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:48,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:02:48,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:02:49,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 06:02:52,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:52,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 06:02:54,241 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.869e+02 2.092e+02 2.385e+02 4.135e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-02 06:02:54,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 06:02:54,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:02:54,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:02:55,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:02:55,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:02:55,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:02:57,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:02:57,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:03:01,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:03:04,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:04,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:08,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 06:03:10,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:03:10,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:10,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:03:10,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:03:10,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=775960.0, ans=0.5 2023-10-02 06:03:14,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:03:16,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:03:16,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:16,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:03:16,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:03:18,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:03:21,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=775960.0, ans=0.125 2023-10-02 06:03:22,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:22,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:22,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:03:24,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=775960.0, ans=0.0 2023-10-02 06:03:25,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 06:03:26,635 INFO [train.py:1046] (3/4) Epoch 22, batch 4850, loss[loss=0.1554, simple_loss=0.2305, pruned_loss=0.04019, over 19979.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.25, pruned_loss=0.04877, over 4707250.37 frames. ], batch size: 44, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 06:03:26,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 06:03:26,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:03:26,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:03:28,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:03:28,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:31,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:03:35,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 06:03:37,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=776026.6666666666, ans=0.1 2023-10-02 06:03:38,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:42,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:03:44,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:03:44,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:03:48,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:03:50,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:03:51,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:03:51,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 06:03:56,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:03:57,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:03:57,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:03:58,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=776160.0, ans=0.1 2023-10-02 06:03:59,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:03:59,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 06:04:00,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:04:00,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:05,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:05,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 06:04:06,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 06:04:07,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:04:12,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=776226.6666666666, ans=0.05 2023-10-02 06:04:14,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:04:14,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 06:04:14,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=776226.6666666666, ans=0.0 2023-10-02 06:04:15,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:04:15,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:04:18,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:04:20,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 06:04:20,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:22,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 06:04:22,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:04:22,722 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.84 vs. limit=15.0 2023-10-02 06:04:23,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:04:23,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 06:04:30,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:04:37,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:04:37,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:04:40,391 INFO [train.py:1046] (3/4) Epoch 22, batch 4900, loss[loss=0.1793, simple_loss=0.2411, pruned_loss=0.05871, over 23841.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2499, pruned_loss=0.04868, over 4708641.34 frames. ], batch size: 195, lr: 4.63e-03, grad_scale: 16.0 2023-10-02 06:04:42,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 06:04:43,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:04:49,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:04:51,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:04:51,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:04:53,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 06:04:56,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=776426.6666666666, ans=0.125 2023-10-02 06:04:58,766 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.80 vs. limit=12.0 2023-10-02 06:04:59,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 06:05:03,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 06:05:03,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 06:05:03,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:05:03,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:05:04,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:05:04,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:05:04,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:05:05,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 06:05:05,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=776426.6666666666, ans=0.125 2023-10-02 06:05:07,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 06:05:09,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:05:10,403 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.65 vs. limit=5.0 2023-10-02 06:05:10,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:05:11,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:05:13,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:05:14,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=776493.3333333334, ans=0.035 2023-10-02 06:05:15,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:05:16,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:05:16,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 06:05:18,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:05:18,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:05:18,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 06:05:18,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 06:05:24,335 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.396e+02 1.841e+02 1.995e+02 2.206e+02 2.989e+02, threshold=3.989e+02, percent-clipped=0.0 2023-10-02 06:05:24,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 06:05:26,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:05:26,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:05:27,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:05:27,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:05:27,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 06:05:29,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:05:29,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 06:05:31,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:05:33,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=776560.0, ans=0.125 2023-10-02 06:05:34,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:05:35,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:05:38,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 06:05:38,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:05:39,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 06:05:39,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 06:05:45,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:05:47,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:05:47,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=776626.6666666666, ans=0.0 2023-10-02 06:05:48,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 06:05:49,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 06:05:49,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:05:51,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:05:55,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:05:56,724 INFO [train.py:1046] (3/4) Epoch 22, batch 4950, loss[loss=0.1727, simple_loss=0.2629, pruned_loss=0.04122, over 24645.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2486, pruned_loss=0.04793, over 4707116.14 frames. ], batch size: 73, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:05:56,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:05:56,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:05:56,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 06:05:58,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:06:01,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:06:01,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 06:06:04,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 06:06:04,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=776693.3333333334, ans=0.2 2023-10-02 06:06:05,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 06:06:05,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:06:07,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 06:06:07,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:07,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:06:07,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:06:07,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:10,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:06:10,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:06:12,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:06:12,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:06:13,405 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.00 vs. limit=15.0 2023-10-02 06:06:14,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:14,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:06:18,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=776760.0, ans=0.0 2023-10-02 06:06:19,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:06:22,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=776760.0, ans=0.0 2023-10-02 06:06:24,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:24,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:06:27,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:27,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:27,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=776826.6666666666, ans=0.0 2023-10-02 06:06:28,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:06:30,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 06:06:30,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 06:06:32,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:34,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:06:34,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:06:35,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:06:36,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:06:37,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:06:39,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:06:41,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:06:43,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:06:45,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:06:46,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:47,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 06:06:47,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:06:49,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:06:52,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:06:53,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:06:53,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:06:55,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:06:55,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:06:56,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:06:58,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:06:59,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:06:59,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:07:01,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 06:07:04,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:07:07,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=776960.0, ans=0.125 2023-10-02 06:07:10,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 06:07:10,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 06:07:11,435 INFO [train.py:1046] (3/4) Epoch 22, batch 5000, loss[loss=0.1677, simple_loss=0.2489, pruned_loss=0.04322, over 24480.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2483, pruned_loss=0.04792, over 4718523.23 frames. ], batch size: 66, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:07:16,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:07:16,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:07:19,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 06:07:19,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 06:07:19,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=777026.6666666666, ans=0.1 2023-10-02 06:07:21,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:07:24,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 06:07:24,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:07:24,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:07:26,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 06:07:27,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:07:27,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:07:27,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 06:07:27,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:07:29,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:07:31,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 06:07:31,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 06:07:32,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:07:32,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 06:07:32,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:07:33,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:33,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=777093.3333333334, ans=0.125 2023-10-02 06:07:33,444 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.90 vs. limit=15.0 2023-10-02 06:07:34,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:07:34,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 06:07:34,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 06:07:36,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 06:07:37,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:07:37,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:37,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 06:07:37,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:07:40,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:41,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:07:43,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 06:07:43,974 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.73 vs. limit=12.0 2023-10-02 06:07:44,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 06:07:46,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:07:47,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:07:52,341 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 06:07:53,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:07:54,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=777160.0, ans=0.125 2023-10-02 06:07:55,098 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.890e+02 2.101e+02 2.537e+02 4.736e+02, threshold=4.203e+02, percent-clipped=2.0 2023-10-02 06:07:55,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:07:55,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:07:59,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 06:07:59,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:07:59,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:08:01,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:08:02,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 06:08:03,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:08:07,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:08:07,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:08:12,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 06:08:16,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:25,967 INFO [train.py:1046] (3/4) Epoch 22, batch 5050, loss[loss=0.1383, simple_loss=0.2184, pruned_loss=0.02914, over 24305.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2485, pruned_loss=0.04806, over 4712440.73 frames. ], batch size: 56, lr: 4.62e-03, grad_scale: 8.0 2023-10-02 06:08:26,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:08:27,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:27,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:08:28,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:08:28,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:08:28,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:08:29,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:29,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=777360.0, ans=0.0 2023-10-02 06:08:29,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=777360.0, ans=0.2 2023-10-02 06:08:34,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:08:34,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 06:08:35,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:08:38,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:08:40,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:08:40,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 06:08:40,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:08:40,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=777426.6666666666, ans=0.125 2023-10-02 06:08:41,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:08:43,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:08:44,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:08:45,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:08:46,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=777426.6666666666, ans=0.1 2023-10-02 06:08:53,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 06:08:54,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:08:54,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:08:54,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 06:08:56,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:08:57,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:08:57,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:08:57,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:08:59,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 06:08:59,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 06:09:00,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:09:01,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=777493.3333333334, ans=0.125 2023-10-02 06:09:03,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:09:05,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:09:06,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 06:09:09,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:09:11,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 06:09:13,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:09:13,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:09:13,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:09:13,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=777560.0, ans=0.2 2023-10-02 06:09:14,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:09:16,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=777560.0, ans=0.0 2023-10-02 06:09:17,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:09:18,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:09:20,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:20,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:09:20,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:09:20,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 06:09:21,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:09:21,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:09:26,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:09:26,427 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 06:09:26,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:09:27,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:09:29,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:29,124 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 06:09:29,812 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.40 vs. limit=6.0 2023-10-02 06:09:33,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:09:33,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 06:09:33,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:35,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:09:37,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:09:37,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 06:09:39,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 06:09:42,308 INFO [train.py:1046] (3/4) Epoch 22, batch 5100, loss[loss=0.1896, simple_loss=0.2612, pruned_loss=0.05896, over 23480.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2494, pruned_loss=0.04832, over 4701153.37 frames. ], batch size: 285, lr: 4.62e-03, grad_scale: 8.0 2023-10-02 06:09:42,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:09:42,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:09:43,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:09:46,523 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 06:09:47,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:09:49,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 06:09:50,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 06:09:50,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:09:52,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:09:55,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:09:56,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 06:09:56,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 06:10:01,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:10:02,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:10:04,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:10:08,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 06:10:08,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:10:10,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:10:10,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 06:10:13,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:13,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=777826.6666666666, ans=0.0 2023-10-02 06:10:15,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:15,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 06:10:16,921 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 06:10:18,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:18,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 06:10:18,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 06:10:22,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:10:26,630 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.838e+02 2.102e+02 2.485e+02 3.822e+02, threshold=4.204e+02, percent-clipped=0.0 2023-10-02 06:10:30,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=777893.3333333334, ans=0.05 2023-10-02 06:10:31,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:10:34,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 06:10:34,203 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 06:10:34,211 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 06:10:37,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 06:10:37,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:10:38,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 06:10:42,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 06:10:43,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 06:10:45,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:10:47,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 06:10:48,896 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.47 vs. limit=15.0 2023-10-02 06:10:49,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:10:51,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 06:10:55,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:10:55,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:10:55,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:10:56,380 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.34 vs. limit=6.0 2023-10-02 06:10:57,409 INFO [train.py:1046] (3/4) Epoch 22, batch 5150, loss[loss=0.1938, simple_loss=0.2624, pruned_loss=0.06261, over 22794.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2503, pruned_loss=0.04884, over 4705552.67 frames. ], batch size: 322, lr: 4.62e-03, grad_scale: 8.0 2023-10-02 06:10:57,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:10:57,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:10:57,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:10:58,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 06:10:58,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 06:11:00,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 06:11:00,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:11:00,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 06:11:02,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:11:02,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 06:11:04,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:11:05,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:11:06,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=778026.6666666666, ans=0.07 2023-10-02 06:11:11,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 06:11:12,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 06:11:13,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:11:13,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:11:15,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:11:15,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:11:15,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:11:15,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:11:16,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:11:16,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 06:11:18,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:11:19,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:11:21,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:11:21,886 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.19 vs. limit=22.5 2023-10-02 06:11:24,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 06:11:25,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:11:27,700 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.01 vs. limit=12.0 2023-10-02 06:11:31,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:11:33,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 06:11:36,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:11:42,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:11:44,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:11:46,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:11:48,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:11:51,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 06:11:53,796 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.01 vs. limit=15.0 2023-10-02 06:11:54,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:11:55,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:11:55,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:11:58,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:11:59,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:12:01,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 06:12:04,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:12:06,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:12:08,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:12:08,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:12:10,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:12:10,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:12:11,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:12:11,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:12:12,060 INFO [train.py:1046] (3/4) Epoch 22, batch 5200, loss[loss=0.1938, simple_loss=0.2567, pruned_loss=0.0655, over 23776.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2505, pruned_loss=0.0488, over 4713006.22 frames. ], batch size: 212, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:12:15,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:12:15,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:12:18,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:12:21,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 06:12:21,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=778360.0, ans=0.125 2023-10-02 06:12:23,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:12:24,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:12:27,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:12:27,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:12:27,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:12:29,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 06:12:32,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:12:32,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:12:36,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 06:12:37,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:12:38,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:12:39,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=778426.6666666666, ans=0.125 2023-10-02 06:12:40,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 06:12:40,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 06:12:43,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 06:12:43,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:12:43,543 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 06:12:43,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:12:44,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:12:46,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:12:46,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 06:12:48,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:12:51,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:12:53,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 06:12:53,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=778493.3333333334, ans=0.125 2023-10-02 06:12:54,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 06:12:54,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 06:12:57,094 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.869e+02 2.074e+02 2.412e+02 3.434e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-02 06:12:58,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 06:12:58,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:13:02,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:13:02,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:13:04,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 06:13:06,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:13:06,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:13:06,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:07,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:13:09,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:13:10,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:13:12,670 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.10 vs. limit=6.0 2023-10-02 06:13:14,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:13:15,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=778626.6666666666, ans=0.1 2023-10-02 06:13:16,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:13:16,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:16,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=778626.6666666666, ans=0.1 2023-10-02 06:13:23,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:13:23,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 06:13:24,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:13:24,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:13:25,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:13:26,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:13:27,227 INFO [train.py:1046] (3/4) Epoch 22, batch 5250, loss[loss=0.1727, simple_loss=0.2461, pruned_loss=0.04962, over 23691.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2493, pruned_loss=0.04874, over 4710968.08 frames. ], batch size: 149, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:13:27,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:13:27,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=778693.3333333334, ans=0.1 2023-10-02 06:13:31,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:13:34,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:13:35,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:13:35,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:13:41,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:13:44,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:13:45,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:13:46,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:13:49,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 06:13:49,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:13:51,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:14:16,156 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.62 vs. limit=12.0 2023-10-02 06:14:34,319 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:14:36,718 INFO [train.py:1046] (3/4) Epoch 22, batch 5300, loss[loss=0.1648, simple_loss=0.2503, pruned_loss=0.03968, over 24656.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2479, pruned_loss=0.04804, over 4708834.70 frames. ], batch size: 65, lr: 4.62e-03, grad_scale: 16.0 2023-10-02 06:14:38,791 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.05 vs. limit=15.0 2023-10-02 06:14:51,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:14:51,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 06:14:51,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 06:14:51,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:14:51,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:51,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:51,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:51,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:14:51,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:14:51,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:14:51,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:14:52,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:14:52,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 06:14:52,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 06:14:52,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 06:14:52,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:14:52,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 06:14:52,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 06:14:52,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:53,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:14:53,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:14:53,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:14:53,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:14:53,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:14:53,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:14:53,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:53,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:14:53,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:14:53,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:14:53,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:53,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:14:54,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 06:14:54,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:14:54,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:14:54,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 06:14:54,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 06:14:54,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:14:54,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:14:54,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 06:14:55,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 06:14:55,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:14:55,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:14:56,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:14:56,209 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 06:14:56,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 06:14:56,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:14:56,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:14:56,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 06:14:56,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 06:14:56,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 06:14:56,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:14:59,770 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.88 vs. limit=15.0 2023-10-02 06:15:03,814 INFO [train.py:1046] (3/4) Epoch 23, batch 0, loss[loss=0.2341, simple_loss=0.3005, pruned_loss=0.08389, over 19615.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.3005, pruned_loss=0.08389, over 19615.00 frames. ], batch size: 388, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:15:03,815 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 06:15:16,810 INFO [train.py:1078] (3/4) Epoch 23, validation: loss=0.2993, simple_loss=0.2685, pruned_loss=0.165, over 1125622.00 frames. 2023-10-02 06:15:16,811 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 06:15:19,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 06:15:19,593 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.33 vs. limit=15.0 2023-10-02 06:15:20,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:15:20,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=779106.6666666666, ans=0.125 2023-10-02 06:15:21,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:15:26,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:15:26,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:15:27,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:28,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 06:15:29,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 06:15:32,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:33,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:36,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:15:36,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:15:37,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:15:37,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:15:39,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 06:15:41,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:15:43,685 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.858e+02 2.101e+02 2.344e+02 3.915e+02, threshold=4.201e+02, percent-clipped=0.0 2023-10-02 06:15:48,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:15:48,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:15:50,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 06:15:54,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:15:54,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:15:56,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:16:00,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:16:03,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:16:06,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=779306.6666666666, ans=0.125 2023-10-02 06:16:10,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 06:16:13,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 06:16:13,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:16:13,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:16:15,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:16:17,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:16:17,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 06:16:20,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:16:22,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:16:23,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:16:27,633 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 06:16:29,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:16:31,776 INFO [train.py:1046] (3/4) Epoch 23, batch 50, loss[loss=0.1674, simple_loss=0.2383, pruned_loss=0.04825, over 23801.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2507, pruned_loss=0.04771, over 1070726.19 frames. ], batch size: 212, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:16:31,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:16:35,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:16:35,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 06:16:35,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:16:35,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:16:36,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=779440.0, ans=0.125 2023-10-02 06:16:37,201 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.56 vs. limit=15.0 2023-10-02 06:16:37,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:16:38,082 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:16:39,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:16:40,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:16:42,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 06:16:43,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:16:46,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=779506.6666666666, ans=0.0 2023-10-02 06:16:48,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=779506.6666666666, ans=0.125 2023-10-02 06:16:50,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:16:53,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 06:16:54,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 06:16:57,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:16:58,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:16:58,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:17:00,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:17:00,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:17:01,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:17:01,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:17:04,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=779573.3333333334, ans=0.0 2023-10-02 06:17:07,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:17:07,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:17:09,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:17:09,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 06:17:10,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:17:12,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:17:12,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 06:17:12,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:17:15,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 06:17:22,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:17:22,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:17:22,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:17:24,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:17:24,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:17:26,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 06:17:28,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 06:17:29,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:17:29,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:17:30,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:17:32,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:17:32,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 06:17:32,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 06:17:33,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 06:17:35,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:17:35,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:17:36,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 06:17:36,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 06:17:37,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:17:38,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:17:40,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:17:40,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:17:43,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:17:45,364 INFO [train.py:1046] (3/4) Epoch 23, batch 100, loss[loss=0.186, simple_loss=0.2533, pruned_loss=0.05939, over 23461.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2514, pruned_loss=0.04894, over 1883946.81 frames. ], batch size: 285, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:17:46,151 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.27 vs. limit=15.0 2023-10-02 06:17:47,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:17:50,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:17:51,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 06:17:51,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:17:56,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:17:56,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:17:56,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:17:56,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:17:57,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:17:59,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 06:17:59,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=779840.0, ans=0.125 2023-10-02 06:18:00,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:18:02,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:18:02,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:02,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:18:04,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 06:18:06,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:18:07,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:08,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:18:10,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:18:12,035 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.840e+02 2.037e+02 2.251e+02 3.061e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-02 06:18:13,494 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 06:18:13,508 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 06:18:16,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:18:16,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:18:20,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:18:21,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:18:22,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:28,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:28,690 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 06:18:30,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 06:18:32,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:18:34,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:18:36,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:38,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:18:41,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:18:43,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:18:45,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:45,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:48,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:18:48,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:18:48,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:18:50,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 06:18:50,357 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 06:18:51,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:18:51,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:18:52,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:18:52,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:18:53,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 06:18:53,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 06:18:54,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:18:54,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:18:54,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:18:55,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:18:55,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:18:57,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:18:59,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:00,317 INFO [train.py:1046] (3/4) Epoch 23, batch 150, loss[loss=0.1777, simple_loss=0.2474, pruned_loss=0.05399, over 23688.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.252, pruned_loss=0.04939, over 2501030.53 frames. ], batch size: 232, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:19:00,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:19:00,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:19:01,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:04,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:19:04,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:07,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:19:08,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:12,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=780106.6666666666, ans=0.125 2023-10-02 06:19:12,531 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.78 vs. limit=15.0 2023-10-02 06:19:13,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 06:19:15,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 06:19:15,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 06:19:17,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=780173.3333333334, ans=15.0 2023-10-02 06:19:18,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:19:18,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:19:18,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:19:18,995 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.11 vs. limit=10.0 2023-10-02 06:19:19,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:19:21,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:19:21,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:21,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:19:23,108 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 06:19:24,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:19:30,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:19:34,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:19:34,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 06:19:37,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:19:37,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:19:37,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:19:40,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:19:42,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:19:42,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:19:44,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:44,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 06:19:52,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:52,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:19:53,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:19:53,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:19:55,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:19:58,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 06:20:01,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:20:04,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:20:07,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:20:08,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:20:08,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 06:20:08,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:20:08,986 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 06:20:11,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:20:14,533 INFO [train.py:1046] (3/4) Epoch 23, batch 200, loss[loss=0.1887, simple_loss=0.2571, pruned_loss=0.06016, over 23600.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2518, pruned_loss=0.04945, over 2995177.48 frames. ], batch size: 256, lr: 4.51e-03, grad_scale: 32.0 2023-10-02 06:20:15,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:20:15,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:20:17,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 06:20:17,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:20:17,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:20:20,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 06:20:22,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:20:23,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:20:24,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:20:27,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=780506.6666666666, ans=0.125 2023-10-02 06:20:30,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:20:30,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:20:30,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:20:35,938 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.50 vs. limit=10.0 2023-10-02 06:20:40,945 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.789e+02 2.004e+02 2.358e+02 3.840e+02, threshold=4.009e+02, percent-clipped=0.0 2023-10-02 06:20:45,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=780573.3333333334, ans=0.125 2023-10-02 06:20:49,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:20:49,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:20:50,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:20:52,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:20:52,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 06:20:52,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:20:54,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:20:54,721 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.97 vs. limit=15.0 2023-10-02 06:20:55,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:20:57,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:20:57,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:20:58,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 06:20:58,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=780640.0, ans=0.125 2023-10-02 06:21:00,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 06:21:01,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:21:03,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=780640.0, ans=0.0 2023-10-02 06:21:05,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:21:06,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=780640.0, ans=0.0 2023-10-02 06:21:12,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:21:16,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=780706.6666666666, ans=0.0 2023-10-02 06:21:17,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:19,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:21:22,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=780706.6666666666, ans=0.0 2023-10-02 06:21:24,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:26,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 06:21:27,520 INFO [train.py:1046] (3/4) Epoch 23, batch 250, loss[loss=0.1824, simple_loss=0.271, pruned_loss=0.0469, over 23724.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2513, pruned_loss=0.04915, over 3382279.99 frames. ], batch size: 85, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:21:27,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:21:27,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:21:27,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:21:29,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:21:31,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 06:21:33,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:21:33,447 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 06:21:34,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:37,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:21:38,498 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.59 vs. limit=15.0 2023-10-02 06:21:38,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:38,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:21:41,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:21:41,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:21:43,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:21:45,983 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.01 vs. limit=15.0 2023-10-02 06:21:46,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:21:53,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:21:55,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=780840.0, ans=0.125 2023-10-02 06:21:56,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:21:56,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:21:56,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=780906.6666666666, ans=0.125 2023-10-02 06:22:01,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:22:03,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:22:03,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:22:03,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:22:05,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:22:05,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:22:05,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:22:07,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:22:09,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 06:22:09,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:22:12,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:22:12,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:22:12,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:22:13,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:22:15,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:22:15,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:22:15,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=780973.3333333334, ans=0.125 2023-10-02 06:22:16,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=780973.3333333334, ans=0.05 2023-10-02 06:22:17,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:22:19,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:22:19,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:22:22,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:22:25,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:22:28,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:22:33,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:22:34,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:22:38,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 06:22:40,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:22:40,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:22:42,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 06:22:42,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:22:43,428 INFO [train.py:1046] (3/4) Epoch 23, batch 300, loss[loss=0.1897, simple_loss=0.2541, pruned_loss=0.06265, over 23892.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2493, pruned_loss=0.04921, over 3653026.96 frames. ], batch size: 195, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:22:43,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:22:43,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 06:22:48,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:22:50,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:22:51,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:22:53,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 06:22:55,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:22:56,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:22:56,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 06:22:56,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:22:59,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=781173.3333333334, ans=0.125 2023-10-02 06:23:00,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:23:05,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:23:05,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 06:23:06,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 06:23:08,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:10,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:23:11,289 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.12 vs. limit=15.0 2023-10-02 06:23:11,714 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.930e+02 2.119e+02 2.410e+02 3.837e+02, threshold=4.239e+02, percent-clipped=0.0 2023-10-02 06:23:13,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:13,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 06:23:13,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:23:14,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:23:16,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=781240.0, ans=0.2 2023-10-02 06:23:17,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:23:17,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:23:21,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 06:23:21,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 06:23:23,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:23:26,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:27,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 06:23:28,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:23:33,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:23:35,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=781306.6666666666, ans=0.125 2023-10-02 06:23:37,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:23:37,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 06:23:41,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:41,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:23:42,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:44,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:23:44,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 06:23:44,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:23:46,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:23:47,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 06:23:49,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:23:49,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:23:50,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:23:50,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:23:50,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=781373.3333333334, ans=0.125 2023-10-02 06:23:51,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:23:56,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:23:56,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 06:23:58,138 INFO [train.py:1046] (3/4) Epoch 23, batch 350, loss[loss=0.167, simple_loss=0.2316, pruned_loss=0.05119, over 22824.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2466, pruned_loss=0.04908, over 3873693.14 frames. ], batch size: 322, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:23:59,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:02,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=781440.0, ans=0.125 2023-10-02 06:24:05,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:24:08,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:10,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:13,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 06:24:15,487 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.27 vs. limit=12.0 2023-10-02 06:24:15,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:24:15,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 06:24:19,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:19,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 06:24:20,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:24:21,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=781506.6666666666, ans=0.1 2023-10-02 06:24:22,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 06:24:23,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:24:25,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:24:26,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:24:26,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:24:28,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:24:28,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:24:28,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:28,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:24:29,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:24:29,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:37,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=781573.3333333334, ans=0.125 2023-10-02 06:24:38,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:24:38,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:24:38,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:24:38,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:40,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=781573.3333333334, ans=0.1 2023-10-02 06:24:45,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 06:24:45,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:24:49,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:24:49,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:24:49,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:24:50,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 06:24:52,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:24:52,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=781640.0, ans=0.04949747468305833 2023-10-02 06:24:53,668 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 06:24:53,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 06:24:54,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=781640.0, ans=0.2 2023-10-02 06:24:55,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:24:57,328 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:24:57,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=781706.6666666666, ans=0.05 2023-10-02 06:24:58,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:24:58,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 06:25:01,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:02,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:25:02,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:25:04,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:04,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:25:06,337 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.10 vs. limit=22.5 2023-10-02 06:25:07,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:25:10,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:25:12,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:25:12,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 06:25:12,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:13,644 INFO [train.py:1046] (3/4) Epoch 23, batch 400, loss[loss=0.1575, simple_loss=0.2313, pruned_loss=0.04189, over 24290.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2461, pruned_loss=0.04815, over 4066833.22 frames. ], batch size: 56, lr: 4.51e-03, grad_scale: 16.0 2023-10-02 06:25:13,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:25:15,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:25:15,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:17,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:25:18,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:20,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 06:25:21,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 06:25:21,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:25:22,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 06:25:24,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:28,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=781840.0, ans=0.0 2023-10-02 06:25:29,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.69 vs. limit=22.5 2023-10-02 06:25:30,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:25:30,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:25:30,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 06:25:30,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:25:30,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:25:31,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:25:31,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:25:34,636 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 06:25:35,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 06:25:41,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:25:42,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:25:42,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 06:25:43,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 06:25:45,159 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.817e+02 2.027e+02 2.516e+02 3.767e+02, threshold=4.054e+02, percent-clipped=0.0 2023-10-02 06:25:45,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:25:47,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:25:54,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 06:25:59,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:25:59,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=781973.3333333334, ans=0.2 2023-10-02 06:26:00,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 06:26:02,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:26:03,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:26:04,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 06:26:07,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:26:10,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:26:11,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:26:14,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:26:15,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 06:26:17,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:26:17,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 06:26:20,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:26:20,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:26:20,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=782040.0, ans=0.125 2023-10-02 06:26:23,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 06:26:23,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:26:24,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:26:24,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:26:26,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 06:26:26,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:26:26,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=782040.0, ans=0.125 2023-10-02 06:26:27,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:26:27,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:26:27,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 06:26:29,565 INFO [train.py:1046] (3/4) Epoch 23, batch 450, loss[loss=0.2023, simple_loss=0.2672, pruned_loss=0.0687, over 22748.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2471, pruned_loss=0.04822, over 4211603.76 frames. ], batch size: 322, lr: 4.51e-03, grad_scale: 8.0 2023-10-02 06:26:29,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:26:31,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:26:32,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:26:32,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=782106.6666666666, ans=0.125 2023-10-02 06:26:42,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:26:42,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:26:46,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 06:26:47,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 06:26:50,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:26:53,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:26:56,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:26:58,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:26:59,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:27:00,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 06:27:02,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 06:27:02,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=782240.0, ans=0.125 2023-10-02 06:27:04,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 06:27:04,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:27:04,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=782240.0, ans=0.125 2023-10-02 06:27:05,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:27:05,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:27:08,440 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 06:27:08,448 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 06:27:09,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:27:11,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:27:12,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 06:27:15,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:27:15,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:27:15,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 06:27:17,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 06:27:20,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:27:23,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:27:23,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:27:24,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=782306.6666666666, ans=0.125 2023-10-02 06:27:24,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=782306.6666666666, ans=0.0 2023-10-02 06:27:25,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 06:27:28,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:27:28,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 06:27:29,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 06:27:30,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:27:35,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:27:37,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:27:38,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:27:38,906 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 06:27:41,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=782373.3333333334, ans=0.0 2023-10-02 06:27:43,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:27:43,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:27:44,766 INFO [train.py:1046] (3/4) Epoch 23, batch 500, loss[loss=0.1845, simple_loss=0.2501, pruned_loss=0.05951, over 23773.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2483, pruned_loss=0.04828, over 4328169.84 frames. ], batch size: 179, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:27:44,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:27:44,893 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 06:27:46,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 06:27:46,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:27:47,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.50 vs. limit=15.0 2023-10-02 06:27:50,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:27:54,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 06:27:57,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:27:58,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:27:58,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:27:59,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:09,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:28:09,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:28:10,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:28:10,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:28:10,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 06:28:11,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:28:14,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:28:14,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:28:14,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:28:14,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=782573.3333333334, ans=0.1 2023-10-02 06:28:15,795 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.861e+02 2.114e+02 2.319e+02 3.215e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-02 06:28:15,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:28:15,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 06:28:20,591 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 06:28:23,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:28:23,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:25,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:25,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:27,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:28:28,042 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.92 vs. limit=12.0 2023-10-02 06:28:28,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 06:28:32,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:28:32,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:28:37,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:28:40,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:28:44,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:28:47,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 06:28:47,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:28:48,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:28:51,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 06:28:53,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:28:55,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:28:58,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=782773.3333333334, ans=0.1 2023-10-02 06:28:59,827 INFO [train.py:1046] (3/4) Epoch 23, batch 550, loss[loss=0.1746, simple_loss=0.261, pruned_loss=0.04411, over 24450.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2495, pruned_loss=0.04875, over 4415908.85 frames. ], batch size: 69, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:28:59,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 06:29:03,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 06:29:04,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:29:05,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 06:29:05,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:29:05,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:29:05,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:05,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:05,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:29:06,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:29:07,596 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.38 vs. limit=15.0 2023-10-02 06:29:11,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:29:12,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 06:29:12,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:29:16,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:16,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:18,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:29:20,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:23,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 06:29:25,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 06:29:26,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:29:31,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=782906.6666666666, ans=0.1 2023-10-02 06:29:33,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:29:34,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:29:35,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:29:37,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:37,250 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 06:29:38,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:29:40,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 06:29:40,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=782906.6666666666, ans=0.2 2023-10-02 06:29:40,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=782906.6666666666, ans=0.2 2023-10-02 06:29:41,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:29:43,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:29:43,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:29:43,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:44,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 06:29:44,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 06:29:45,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=782973.3333333334, ans=0.0 2023-10-02 06:29:46,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:29:46,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:29:46,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:29:47,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:29:50,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:29:53,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:29:54,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:29:55,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:29:56,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 06:29:58,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:29:58,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=783040.0, ans=0.125 2023-10-02 06:29:58,523 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:29:59,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:30:00,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:30:01,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=783040.0, ans=0.035 2023-10-02 06:30:02,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:30:04,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:30:04,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 06:30:11,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 06:30:13,922 INFO [train.py:1046] (3/4) Epoch 23, batch 600, loss[loss=0.1625, simple_loss=0.2363, pruned_loss=0.04436, over 24325.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2499, pruned_loss=0.0483, over 4497118.15 frames. ], batch size: 56, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:30:14,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 06:30:14,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=783106.6666666666, ans=0.125 2023-10-02 06:30:15,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:30:16,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:30:17,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:30:24,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:30:25,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:30:27,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 06:30:29,673 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.05 vs. limit=15.0 2023-10-02 06:30:30,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:30:31,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:30:33,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:30:35,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 06:30:35,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:30:41,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 06:30:44,838 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.803e+02 1.978e+02 2.195e+02 2.831e+02, threshold=3.957e+02, percent-clipped=0.0 2023-10-02 06:30:44,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:30:44,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:30:44,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:30:46,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=783240.0, ans=0.95 2023-10-02 06:30:51,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:30:51,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:30:52,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:30:59,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:31:05,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:31:05,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:31:05,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:31:11,003 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:31:12,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 06:31:16,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:31:16,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:31:21,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 06:31:21,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:31:23,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 06:31:25,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:31:25,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:31:28,453 INFO [train.py:1046] (3/4) Epoch 23, batch 650, loss[loss=0.1558, simple_loss=0.2164, pruned_loss=0.04763, over 23481.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2493, pruned_loss=0.04834, over 4545921.46 frames. ], batch size: 285, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:31:29,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 06:31:31,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:31:33,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:31:35,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:31:36,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:31:40,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 06:31:40,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:31:44,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:31:44,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:31:47,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:31:51,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 06:31:52,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:31:54,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:31:57,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:31:57,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 06:31:59,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:32:01,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:01,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:32:03,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:04,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:32:06,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:32:07,434 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 06:32:07,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:32:07,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:32:09,173 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:32:10,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:11,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:32:11,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:32:11,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:32:12,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 06:32:12,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:32:14,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:32:15,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:32:15,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:32:17,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:32:18,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 06:32:20,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 06:32:20,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:20,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:32:20,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:32:21,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:32:23,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:32:28,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:28,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:32:30,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:32:32,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=783706.6666666666, ans=0.125 2023-10-02 06:32:32,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=783706.6666666666, ans=0.125 2023-10-02 06:32:33,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:32:33,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 06:32:33,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:32:40,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:32:40,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:32:42,118 INFO [train.py:1046] (3/4) Epoch 23, batch 700, loss[loss=0.1815, simple_loss=0.2599, pruned_loss=0.05153, over 23925.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2483, pruned_loss=0.0482, over 4591224.04 frames. ], batch size: 86, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:32:42,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:32:42,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:32:43,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=783773.3333333334, ans=0.05 2023-10-02 06:32:47,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 06:32:47,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 06:32:51,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 06:32:52,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:32:52,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:32:55,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 06:32:59,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:33:01,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:33:02,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=783840.0, ans=0.0 2023-10-02 06:33:03,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:33:04,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:33:05,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:33:09,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:33:10,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 06:33:10,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:33:12,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 06:33:13,307 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.806e+02 2.034e+02 2.307e+02 3.674e+02, threshold=4.068e+02, percent-clipped=0.0 2023-10-02 06:33:13,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 06:33:17,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:33:17,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:33:20,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:33:23,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:33:23,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 06:33:27,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:33:28,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:33:28,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 06:33:35,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:33:35,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:33:38,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:33:40,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=783973.3333333334, ans=0.125 2023-10-02 06:33:42,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:33:42,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 06:33:46,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 06:33:46,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 06:33:51,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:33:53,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:33:53,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:33:55,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:33:55,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 06:33:57,198 INFO [train.py:1046] (3/4) Epoch 23, batch 750, loss[loss=0.1494, simple_loss=0.2354, pruned_loss=0.03175, over 24306.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2485, pruned_loss=0.04798, over 4617636.35 frames. ], batch size: 61, lr: 4.50e-03, grad_scale: 8.0 2023-10-02 06:33:59,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 06:33:59,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 06:34:01,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 06:34:03,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 06:34:03,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 06:34:03,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:34:04,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 06:34:06,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:34:06,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:34:09,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:34:10,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:34:10,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:34:10,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:34:13,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:34:14,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:34:17,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:34:20,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:34:20,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:34:21,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 06:34:23,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:34:23,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:34:25,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:34:28,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:34:29,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 06:34:29,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:34:31,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 06:34:31,228 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 06:34:31,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 06:34:31,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:34:31,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 06:34:34,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:34:34,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=784240.0, ans=0.0 2023-10-02 06:34:40,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:34:40,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:34:40,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:34:41,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:34:45,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:34:46,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 06:34:46,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:34:47,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 06:34:48,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:34:50,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:34:50,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 06:34:52,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:34:54,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=784306.6666666666, ans=0.125 2023-10-02 06:34:56,373 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.37 vs. limit=15.0 2023-10-02 06:34:56,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:34:57,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:34:58,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:34:59,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:35:03,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 06:35:05,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:35:05,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:35:08,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:35:08,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:35:11,283 INFO [train.py:1046] (3/4) Epoch 23, batch 800, loss[loss=0.1508, simple_loss=0.2317, pruned_loss=0.0349, over 24509.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2483, pruned_loss=0.04788, over 4642990.09 frames. ], batch size: 63, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:35:11,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:35:11,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:35:14,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=784440.0, ans=0.0 2023-10-02 06:35:20,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:35:20,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:21,244 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.73 vs. limit=5.0 2023-10-02 06:35:23,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:35:23,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:35:24,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:24,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:26,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:30,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:35:31,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:35:33,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 06:35:33,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:35,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:35:35,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:35:35,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=784506.6666666666, ans=0.125 2023-10-02 06:35:36,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:35:36,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 06:35:36,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:35:36,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 06:35:39,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:41,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:35:42,557 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.768e+02 2.043e+02 2.413e+02 3.379e+02, threshold=4.086e+02, percent-clipped=0.0 2023-10-02 06:35:42,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:35:44,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:35:47,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:47,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:35:49,073 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:35:52,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:35:52,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:35:52,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 06:35:54,659 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 06:35:55,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 06:35:55,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:35:55,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:35:57,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:35:57,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:36:03,510 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 06:36:03,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 06:36:03,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:36:04,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.81 vs. limit=15.0 2023-10-02 06:36:05,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:36:09,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:36:12,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:36:13,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.61 vs. limit=6.0 2023-10-02 06:36:13,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 06:36:13,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:36:16,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 06:36:22,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:36:26,658 INFO [train.py:1046] (3/4) Epoch 23, batch 850, loss[loss=0.1803, simple_loss=0.2657, pruned_loss=0.0475, over 24569.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2496, pruned_loss=0.04874, over 4655690.79 frames. ], batch size: 71, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:36:26,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:36:26,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 06:36:26,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:36:28,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:36:29,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 06:36:29,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:36:29,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=784773.3333333334, ans=0.1 2023-10-02 06:36:31,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:36:32,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:36:33,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:36:35,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:36:37,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 06:36:37,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 06:36:37,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 06:36:37,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=784773.3333333334, ans=0.1 2023-10-02 06:36:38,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:36:40,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:36:41,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:36:41,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:36:41,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:36:43,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=784840.0, ans=0.1 2023-10-02 06:36:43,661 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.76 vs. limit=22.5 2023-10-02 06:36:46,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:36:47,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:36:47,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 06:36:50,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=784840.0, ans=0.125 2023-10-02 06:36:51,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 06:36:54,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:36:55,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 06:36:59,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 06:37:01,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 06:37:02,527 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 06:37:02,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:37:02,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:37:03,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 06:37:07,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:37:07,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:37:08,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 06:37:11,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:37:12,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:37:12,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:37:12,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:37:14,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:37:15,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 06:37:15,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 06:37:19,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:37:19,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:37:19,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:37:19,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:37:21,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:37:24,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:37:27,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:37:29,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:37:29,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:37:31,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:37:37,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:37:38,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:37:40,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 06:37:40,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:37:40,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:37:41,650 INFO [train.py:1046] (3/4) Epoch 23, batch 900, loss[loss=0.1581, simple_loss=0.2392, pruned_loss=0.03843, over 24581.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2498, pruned_loss=0.04844, over 4688575.05 frames. ], batch size: 60, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:37:43,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 06:37:48,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:37:51,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:37:51,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 06:37:56,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:37:56,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=785173.3333333334, ans=0.1 2023-10-02 06:37:57,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 06:37:58,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 06:37:58,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:37:58,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:38:00,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 06:38:00,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:38:05,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=785173.3333333334, ans=0.1 2023-10-02 06:38:11,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:38:11,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:38:11,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:38:12,834 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.908e+02 2.046e+02 2.304e+02 2.973e+02, threshold=4.093e+02, percent-clipped=0.0 2023-10-02 06:38:15,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:38:19,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 06:38:20,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=785240.0, ans=0.1 2023-10-02 06:38:21,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:38:22,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=785240.0, ans=0.125 2023-10-02 06:38:25,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:38:25,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:38:25,564 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 06:38:28,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 06:38:31,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:38:31,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:38:31,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:38:32,103 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:38:35,047 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.15 vs. limit=22.5 2023-10-02 06:38:38,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:38:38,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:38:41,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 06:38:41,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:38:44,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 06:38:46,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:38:46,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:38:46,722 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.37 vs. limit=10.0 2023-10-02 06:38:48,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:38:48,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:38:49,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=785373.3333333334, ans=0.04949747468305833 2023-10-02 06:38:52,700 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.37 vs. limit=6.0 2023-10-02 06:38:53,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 06:38:53,158 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 06:38:55,738 INFO [train.py:1046] (3/4) Epoch 23, batch 950, loss[loss=0.1534, simple_loss=0.2336, pruned_loss=0.03666, over 24409.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.25, pruned_loss=0.04858, over 4693571.52 frames. ], batch size: 58, lr: 4.50e-03, grad_scale: 16.0 2023-10-02 06:38:55,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 06:38:55,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 06:38:56,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=785440.0, ans=0.5 2023-10-02 06:38:57,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:39:02,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 06:39:05,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:39:09,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:09,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:09,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:39:13,062 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 06:39:15,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:17,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:39:17,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:39:17,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:39:17,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 06:39:18,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:39:20,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:21,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 06:39:22,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:39:25,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:25,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:39:25,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:39:28,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 06:39:30,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 06:39:32,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:39:33,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:39:37,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:39:37,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:39:42,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 06:39:44,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 06:39:44,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:39:44,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:39:46,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:46,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:39:50,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 06:39:50,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:39:51,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:39:53,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:39:53,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 06:39:54,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:39:54,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:39:54,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 06:40:00,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:40:02,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:40:05,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.32 vs. limit=22.5 2023-10-02 06:40:06,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:40:08,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 06:40:08,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 06:40:10,920 INFO [train.py:1046] (3/4) Epoch 23, batch 1000, loss[loss=0.191, simple_loss=0.271, pruned_loss=0.05544, over 23929.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.249, pruned_loss=0.04827, over 4701945.48 frames. ], batch size: 86, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:40:12,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:40:12,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=785773.3333333334, ans=0.1 2023-10-02 06:40:14,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 06:40:14,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:40:21,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:40:21,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 06:40:21,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 06:40:25,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:40:25,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:40:27,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:40:29,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 06:40:32,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=785840.0, ans=0.07 2023-10-02 06:40:33,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 06:40:35,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 06:40:37,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:40:37,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 06:40:38,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 06:40:38,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 06:40:40,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:40:41,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:40:42,773 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.822e+02 2.055e+02 2.435e+02 3.236e+02, threshold=4.111e+02, percent-clipped=0.0 2023-10-02 06:40:50,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:40:51,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:40:51,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:40:53,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:40:53,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 06:40:54,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:40:54,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:40:55,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:40:56,030 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 06:40:56,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=785973.3333333334, ans=0.1 2023-10-02 06:40:58,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 06:41:00,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 06:41:01,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 06:41:03,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:41:09,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:11,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:41:11,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:11,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=786040.0, ans=0.2 2023-10-02 06:41:12,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:41:13,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 06:41:13,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:41:15,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 06:41:15,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 06:41:15,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=786040.0, ans=0.125 2023-10-02 06:41:17,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:41:17,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:41:18,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:41:21,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:41:24,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:41:25,585 INFO [train.py:1046] (3/4) Epoch 23, batch 1050, loss[loss=0.1585, simple_loss=0.2394, pruned_loss=0.03882, over 24331.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2479, pruned_loss=0.04848, over 4688170.70 frames. ], batch size: 61, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:41:25,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:41:27,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:41:28,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:41:30,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:32,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:41:34,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 06:41:35,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 06:41:37,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:41:38,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:41:38,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:41:39,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:41:39,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 06:41:41,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:41:41,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 06:41:41,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=786173.3333333334, ans=0.125 2023-10-02 06:41:44,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:41:44,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 06:41:45,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:41:50,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:41:50,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=786173.3333333334, ans=0.0 2023-10-02 06:41:51,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:41:51,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:41:54,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 06:41:54,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 06:41:54,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:41:58,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 06:42:02,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 06:42:03,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:42:07,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 06:42:10,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 06:42:10,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:42:11,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:42:11,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=786306.6666666666, ans=0.125 2023-10-02 06:42:14,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=786306.6666666666, ans=0.0 2023-10-02 06:42:15,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:42:16,199 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:42:18,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 06:42:20,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 06:42:20,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 06:42:20,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:42:20,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:42:22,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 06:42:27,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:42:29,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:42:29,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:42:30,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:42:30,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:42:31,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=786373.3333333334, ans=0.125 2023-10-02 06:42:34,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:42:34,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 06:42:37,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 06:42:37,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 06:42:38,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 06:42:39,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:42:42,267 INFO [train.py:1046] (3/4) Epoch 23, batch 1100, loss[loss=0.1747, simple_loss=0.2443, pruned_loss=0.05255, over 23691.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2479, pruned_loss=0.04838, over 4701297.20 frames. ], batch size: 135, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:42:43,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:42:48,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:42:51,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:42:52,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:42:52,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:42:52,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 06:42:53,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.37 vs. limit=6.0 2023-10-02 06:42:53,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.31 vs. limit=10.0 2023-10-02 06:42:55,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:42:58,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 06:42:59,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:43:01,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=786506.6666666666, ans=0.05 2023-10-02 06:43:02,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:43:02,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 06:43:03,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 06:43:05,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:43:05,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:43:09,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:43:09,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=786506.6666666666, ans=0.0 2023-10-02 06:43:11,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=786573.3333333334, ans=0.0 2023-10-02 06:43:12,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 06:43:13,580 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.798e+02 1.908e+02 2.172e+02 3.443e+02, threshold=3.816e+02, percent-clipped=0.0 2023-10-02 06:43:16,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:43:19,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 06:43:19,343 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 06:43:19,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:19,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=786573.3333333334, ans=0.0 2023-10-02 06:43:20,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=786573.3333333334, ans=0.05 2023-10-02 06:43:22,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:22,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:43:22,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=786573.3333333334, ans=0.0 2023-10-02 06:43:24,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:43:25,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 06:43:26,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:43:26,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:43:26,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:43:26,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:26,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 06:43:29,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=786640.0, ans=0.125 2023-10-02 06:43:32,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:43:32,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 06:43:34,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:43:39,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:43:42,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 06:43:42,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 06:43:44,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:43:46,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:43:48,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:43:49,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 06:43:49,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:43:49,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:43:51,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 06:43:51,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:43:52,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 06:43:52,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:43:52,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:43:54,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:43:54,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=786706.6666666666, ans=0.125 2023-10-02 06:43:57,350 INFO [train.py:1046] (3/4) Epoch 23, batch 1150, loss[loss=0.193, simple_loss=0.2794, pruned_loss=0.05327, over 24661.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2487, pruned_loss=0.04853, over 4697406.95 frames. ], batch size: 73, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:43:58,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:44:01,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:44:04,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:44:04,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:44:05,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 06:44:05,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:44:09,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 06:44:09,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:44:09,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:44:17,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 06:44:19,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:44:20,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=786840.0, ans=10.0 2023-10-02 06:44:24,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:44:24,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:44:25,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 06:44:25,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:44:25,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:44:29,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 06:44:30,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:44:31,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:44:34,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=786906.6666666666, ans=0.0 2023-10-02 06:44:41,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=786973.3333333334, ans=0.2 2023-10-02 06:44:42,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:44:45,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=786973.3333333334, ans=0.09899494936611666 2023-10-02 06:44:47,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:44:47,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=786973.3333333334, ans=0.125 2023-10-02 06:44:49,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 06:44:49,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:44:50,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:44:50,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=786973.3333333334, ans=0.125 2023-10-02 06:44:54,669 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 06:44:56,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:45:03,357 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 06:45:07,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:09,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:45:10,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:45:10,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:45:12,377 INFO [train.py:1046] (3/4) Epoch 23, batch 1200, loss[loss=0.1875, simple_loss=0.271, pruned_loss=0.052, over 24144.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2494, pruned_loss=0.04875, over 4702805.31 frames. ], batch size: 80, lr: 4.49e-03, grad_scale: 32.0 2023-10-02 06:45:12,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:45:17,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:45:17,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:45:19,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:45:19,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:19,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:45:21,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:45:23,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:45:24,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:45:24,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:45:24,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=787106.6666666666, ans=0.125 2023-10-02 06:45:27,482 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 06:45:30,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 06:45:33,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:45:36,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:45:36,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=787173.3333333334, ans=0.125 2023-10-02 06:45:37,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:45:39,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:45:40,381 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 06:45:42,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:43,664 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.418e+02 1.812e+02 2.009e+02 2.421e+02 3.393e+02, threshold=4.017e+02, percent-clipped=0.0 2023-10-02 06:45:44,450 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.49 vs. limit=22.5 2023-10-02 06:45:47,341 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.71 vs. limit=12.0 2023-10-02 06:45:49,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 06:45:49,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:45:49,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 06:45:51,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:45:54,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 06:45:58,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 06:45:58,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:45:59,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:46:01,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:46:01,696 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:46:02,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:46:02,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:46:02,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:46:04,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:46:05,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 06:46:05,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:46:07,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:46:07,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 06:46:08,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:46:08,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:46:13,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:46:16,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:46:17,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=787373.3333333334, ans=0.0 2023-10-02 06:46:19,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 06:46:20,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=787373.3333333334, ans=0.05 2023-10-02 06:46:22,557 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 06:46:24,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:46:25,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:46:27,469 INFO [train.py:1046] (3/4) Epoch 23, batch 1250, loss[loss=0.1799, simple_loss=0.2703, pruned_loss=0.04474, over 24533.00 frames. ], tot_loss[loss=0.174, simple_loss=0.25, pruned_loss=0.04903, over 4709437.28 frames. ], batch size: 71, lr: 4.49e-03, grad_scale: 32.0 2023-10-02 06:46:27,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:46:28,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:46:33,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 06:46:34,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=787440.0, ans=0.125 2023-10-02 06:46:37,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:46:37,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:46:39,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 06:46:40,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:46:40,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=787506.6666666666, ans=0.125 2023-10-02 06:46:42,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:46:44,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=787506.6666666666, ans=0.125 2023-10-02 06:46:45,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 06:46:45,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:46:46,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:46:46,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:46:48,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 06:46:53,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 06:46:53,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:46:53,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:46:54,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:46:54,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:46:57,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:46:57,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:47:03,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 06:47:03,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:47:06,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:47:08,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 06:47:09,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:47:09,445 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 06:47:09,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:09,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:13,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:47:15,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:47:15,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:47:17,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 06:47:17,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 06:47:17,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 06:47:23,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:47:24,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 06:47:24,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:26,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=787706.6666666666, ans=0.125 2023-10-02 06:47:28,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 06:47:28,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:47:30,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=787706.6666666666, ans=0.2 2023-10-02 06:47:32,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 06:47:32,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 06:47:32,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:47:32,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 06:47:32,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:47:32,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=787706.6666666666, ans=0.0 2023-10-02 06:47:34,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 06:47:35,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=787706.6666666666, ans=0.5 2023-10-02 06:47:36,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:47:38,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:47:38,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=787706.6666666666, ans=0.0 2023-10-02 06:47:39,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:47:42,161 INFO [train.py:1046] (3/4) Epoch 23, batch 1300, loss[loss=0.166, simple_loss=0.2379, pruned_loss=0.04706, over 23540.00 frames. ], tot_loss[loss=0.174, simple_loss=0.25, pruned_loss=0.04898, over 4709405.34 frames. ], batch size: 134, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:47:43,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 06:47:43,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=787773.3333333334, ans=0.125 2023-10-02 06:47:44,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:47:45,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 06:47:51,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:47:52,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 06:47:53,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:47:55,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:47:57,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:47:57,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 06:48:01,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:48:02,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.90 vs. limit=15.0 2023-10-02 06:48:02,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:48:03,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 06:48:06,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:48:06,461 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:48:09,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:48:10,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:48:12,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:48:12,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:48:13,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:48:14,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=787906.6666666666, ans=0.2 2023-10-02 06:48:15,083 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.873e+02 2.076e+02 2.335e+02 3.601e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 06:48:15,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 06:48:15,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 06:48:21,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:48:23,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 06:48:24,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 06:48:25,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 06:48:27,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:48:28,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:48:30,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 06:48:30,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:48:30,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 06:48:31,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:48:35,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:48:35,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:48:39,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 06:48:41,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 06:48:41,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 06:48:46,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:48:47,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 06:48:49,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:48:56,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 06:48:57,843 INFO [train.py:1046] (3/4) Epoch 23, batch 1350, loss[loss=0.1715, simple_loss=0.2471, pruned_loss=0.04793, over 24304.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2487, pruned_loss=0.04878, over 4714022.57 frames. ], batch size: 56, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:48:59,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:49:00,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:04,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:49:05,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:49:06,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:49:08,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:49:11,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 06:49:14,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 06:49:14,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:49:15,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:49:18,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 06:49:19,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:49:20,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:49:20,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 06:49:23,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 06:49:24,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 06:49:26,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:26,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 06:49:37,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:40,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=788240.0, ans=0.2 2023-10-02 06:49:46,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:49:46,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:49:47,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 06:49:51,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:49:52,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 06:49:52,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 06:49:53,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:49:54,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=788306.6666666666, ans=0.05 2023-10-02 06:49:55,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:49:58,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 06:49:59,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:50:04,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 06:50:06,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 06:50:13,338 INFO [train.py:1046] (3/4) Epoch 23, batch 1400, loss[loss=0.1677, simple_loss=0.2608, pruned_loss=0.0373, over 24288.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2477, pruned_loss=0.04844, over 4708553.22 frames. ], batch size: 74, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:50:13,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 06:50:14,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:50:16,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:50:17,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:50:17,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=788440.0, ans=0.1 2023-10-02 06:50:18,631 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.43 vs. limit=15.0 2023-10-02 06:50:22,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 06:50:23,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 06:50:33,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=788506.6666666666, ans=10.0 2023-10-02 06:50:35,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:50:37,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:50:37,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=788506.6666666666, ans=0.125 2023-10-02 06:50:39,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:50:39,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 06:50:44,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:50:44,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 06:50:44,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=788573.3333333334, ans=0.125 2023-10-02 06:50:46,774 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.899e+02 2.083e+02 2.387e+02 3.639e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-02 06:50:49,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.42 vs. limit=15.0 2023-10-02 06:50:53,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:50:54,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:50:57,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 06:50:57,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=788640.0, ans=0.125 2023-10-02 06:50:58,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:50:58,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:51:00,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:51:02,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:51:02,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:51:02,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:51:03,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:51:03,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 06:51:04,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:51:09,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:13,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:51:18,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 06:51:20,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 06:51:21,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:51:24,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 06:51:24,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:51:27,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:51:28,909 INFO [train.py:1046] (3/4) Epoch 23, batch 1450, loss[loss=0.1503, simple_loss=0.2271, pruned_loss=0.03678, over 24327.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2472, pruned_loss=0.04792, over 4720878.68 frames. ], batch size: 56, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:51:29,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:51:30,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:51:32,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:32,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 06:51:36,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:51:38,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 06:51:39,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:51:39,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 06:51:41,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 06:51:41,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 06:51:42,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:44,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:44,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 06:51:44,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:51:45,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 06:51:47,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 06:51:47,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:48,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:51:50,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:51,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:54,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:51:54,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:51:56,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:51:57,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:51:59,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:51:59,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:51:59,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:52:00,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:52:03,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 06:52:04,382 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.93 vs. limit=6.0 2023-10-02 06:52:05,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=788906.6666666666, ans=0.0 2023-10-02 06:52:06,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:52:06,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=788906.6666666666, ans=0.0 2023-10-02 06:52:08,646 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 06:52:09,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:52:11,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 06:52:13,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:52:14,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 06:52:17,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:52:17,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 06:52:17,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=788973.3333333334, ans=0.0 2023-10-02 06:52:19,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 06:52:19,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=788973.3333333334, ans=0.125 2023-10-02 06:52:20,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:52:25,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:52:25,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:52:26,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 06:52:26,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=788973.3333333334, ans=0.05 2023-10-02 06:52:30,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 06:52:30,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 06:52:31,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:52:33,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 06:52:42,248 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.06 vs. limit=15.0 2023-10-02 06:52:44,875 INFO [train.py:1046] (3/4) Epoch 23, batch 1500, loss[loss=0.1751, simple_loss=0.2501, pruned_loss=0.05006, over 23576.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2482, pruned_loss=0.04796, over 4726034.00 frames. ], batch size: 256, lr: 4.49e-03, grad_scale: 16.0 2023-10-02 06:52:44,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 06:52:44,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 06:52:44,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:52:45,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=789106.6666666666, ans=0.125 2023-10-02 06:52:46,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:52:47,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:52:47,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 06:52:49,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 06:52:50,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 06:52:50,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:52:51,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:52:51,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:52:53,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:52:55,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:52:59,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:52:59,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 06:53:00,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:53:00,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:53:01,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=789173.3333333334, ans=0.5 2023-10-02 06:53:02,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:53:03,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 06:53:05,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=789173.3333333334, ans=0.125 2023-10-02 06:53:07,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=789173.3333333334, ans=0.0 2023-10-02 06:53:08,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 06:53:09,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:53:10,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 06:53:12,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 06:53:15,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:53:17,159 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.895e+02 2.081e+02 2.533e+02 4.073e+02, threshold=4.161e+02, percent-clipped=0.0 2023-10-02 06:53:17,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:53:17,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:53:18,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=789240.0, ans=0.125 2023-10-02 06:53:19,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 06:53:19,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:53:19,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:53:21,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 06:53:21,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:53:25,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 06:53:25,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 06:53:31,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=789306.6666666666, ans=0.2 2023-10-02 06:53:33,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 06:53:35,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 06:53:39,522 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 06:53:39,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:39,582 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 06:53:40,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:53:42,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:53:42,902 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 06:53:44,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 06:53:47,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 06:53:49,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:52,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:53:52,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:53,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:53:53,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:53:54,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 06:53:56,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 06:53:56,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 06:53:57,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:53:57,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 06:53:57,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 06:53:59,024 INFO [train.py:1046] (3/4) Epoch 23, batch 1550, loss[loss=0.1811, simple_loss=0.2626, pruned_loss=0.04981, over 23481.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2481, pruned_loss=0.04752, over 4734186.89 frames. ], batch size: 93, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:54:01,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:54:02,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:54:02,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:54:02,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:54:02,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:54:03,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:54:07,282 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 06:54:07,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:08,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 06:54:08,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 06:54:11,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 06:54:11,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 06:54:12,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:54:12,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 06:54:14,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 06:54:14,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 06:54:16,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:16,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:54:18,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.34 vs. limit=15.0 2023-10-02 06:54:20,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:54:22,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 06:54:22,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 06:54:29,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:54:31,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.12 vs. limit=15.0 2023-10-02 06:54:33,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:54:33,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 06:54:35,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:54:35,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 06:54:42,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 06:54:43,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:47,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:54:49,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:54:50,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:54:50,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 06:54:50,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:54:52,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:54:52,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:54:52,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 06:54:52,401 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 06:54:54,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:55:00,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 06:55:03,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:55:04,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=789706.6666666666, ans=0.1 2023-10-02 06:55:05,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:55:06,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 06:55:08,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:55:08,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:55:08,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:55:10,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:55:10,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:55:14,644 INFO [train.py:1046] (3/4) Epoch 23, batch 1600, loss[loss=0.1587, simple_loss=0.2388, pruned_loss=0.03932, over 24592.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2483, pruned_loss=0.04723, over 4741705.70 frames. ], batch size: 60, lr: 4.48e-03, grad_scale: 32.0 2023-10-02 06:55:14,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:55:14,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 06:55:16,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 06:55:17,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 06:55:20,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:55:20,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 06:55:22,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:55:24,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:55:29,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:55:32,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 06:55:33,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=789840.0, ans=0.1 2023-10-02 06:55:33,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=789840.0, ans=0.2 2023-10-02 06:55:34,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:55:35,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 06:55:35,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:55:35,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 06:55:38,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.13 vs. limit=10.0 2023-10-02 06:55:43,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 06:55:46,672 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.856e+02 2.043e+02 2.277e+02 4.874e+02, threshold=4.086e+02, percent-clipped=2.0 2023-10-02 06:55:48,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=789906.6666666666, ans=0.125 2023-10-02 06:55:51,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:55:51,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 06:55:52,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:55:52,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:55:52,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:55:54,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=789906.6666666666, ans=0.125 2023-10-02 06:55:56,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 06:56:00,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 06:56:01,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:56:01,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:02,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:04,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 06:56:05,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 06:56:07,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 06:56:08,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 06:56:14,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:15,726 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.03 vs. limit=15.0 2023-10-02 06:56:16,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:56:18,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 06:56:18,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:56:18,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 06:56:24,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:56:25,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:56:25,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:56:27,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 06:56:27,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 06:56:27,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 06:56:27,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 06:56:27,800 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.51 vs. limit=15.0 2023-10-02 06:56:28,498 INFO [train.py:1046] (3/4) Epoch 23, batch 1650, loss[loss=0.1768, simple_loss=0.2396, pruned_loss=0.057, over 23331.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.249, pruned_loss=0.04762, over 4730446.60 frames. ], batch size: 285, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:56:31,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:56:31,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:56:31,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:56:31,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 06:56:34,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:56:35,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 06:56:37,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:56:38,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:56:38,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:56:38,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 06:56:40,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 06:56:40,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 06:56:46,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 06:56:47,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 06:56:48,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=790173.3333333334, ans=0.0 2023-10-02 06:56:50,033 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.95 vs. limit=6.0 2023-10-02 06:57:00,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 06:57:02,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:03,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 06:57:05,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:57:06,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:57:08,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:57:08,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:57:11,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 06:57:11,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:12,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:57:14,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:14,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:57:15,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:57:15,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:57:15,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 06:57:19,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 06:57:19,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 06:57:21,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:57:21,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 06:57:23,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 06:57:24,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 06:57:24,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:57:24,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 06:57:24,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:57:26,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:57:26,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 06:57:31,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:57:33,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:57:34,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:57:36,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 06:57:40,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:57:40,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 06:57:40,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 06:57:41,921 INFO [train.py:1046] (3/4) Epoch 23, batch 1700, loss[loss=0.1845, simple_loss=0.2596, pruned_loss=0.05475, over 23339.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2483, pruned_loss=0.04764, over 4733225.89 frames. ], batch size: 93, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:57:42,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:57:42,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:57:42,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:57:45,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 06:57:45,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 06:57:47,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 06:57:50,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 06:57:55,404 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.07 vs. limit=15.0 2023-10-02 06:57:57,559 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:57:58,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:58:00,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 06:58:04,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 06:58:06,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:58:06,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 06:58:06,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:58:09,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 06:58:10,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 06:58:10,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:11,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 06:58:13,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 06:58:15,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 06:58:16,796 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.870e+02 2.002e+02 2.327e+02 4.481e+02, threshold=4.004e+02, percent-clipped=3.0 2023-10-02 06:58:16,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 06:58:18,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:20,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 06:58:20,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 06:58:30,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:58:30,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:58:30,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 06:58:31,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.33 vs. limit=10.0 2023-10-02 06:58:33,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 06:58:33,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 06:58:33,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 06:58:35,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:35,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 06:58:37,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:58:37,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:58:37,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:58:37,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:58:40,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:58:40,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 06:58:42,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:58:42,698 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 06:58:42,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=790706.6666666666, ans=0.1 2023-10-02 06:58:43,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 06:58:43,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:58:48,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:58:49,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 06:58:51,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:58:52,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:58:54,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 06:58:57,209 INFO [train.py:1046] (3/4) Epoch 23, batch 1750, loss[loss=0.161, simple_loss=0.2368, pruned_loss=0.04256, over 23441.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2478, pruned_loss=0.0477, over 4731197.38 frames. ], batch size: 119, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 06:59:00,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:01,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:59:01,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 06:59:02,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 06:59:02,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 06:59:05,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 06:59:05,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:12,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 06:59:15,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:59:16,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 06:59:16,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:59:18,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 06:59:18,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=790840.0, ans=0.1 2023-10-02 06:59:19,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 06:59:21,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 06:59:22,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 06:59:23,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 06:59:31,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 06:59:32,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 06:59:32,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:59:34,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=790906.6666666666, ans=0.0 2023-10-02 06:59:35,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:35,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 06:59:38,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 06:59:38,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 06:59:41,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:59:41,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 06:59:43,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 06:59:44,080 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.64 vs. limit=6.0 2023-10-02 06:59:46,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 06:59:48,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 06:59:48,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 06:59:50,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 06:59:51,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 06:59:56,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 06:59:57,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 06:59:59,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:00:00,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:00:04,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:00:06,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:00:07,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:00:08,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 07:00:08,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:00:09,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:00:09,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:09,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:00:09,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:00:09,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=791106.6666666666, ans=0.1 2023-10-02 07:00:10,946 INFO [train.py:1046] (3/4) Epoch 23, batch 1800, loss[loss=0.175, simple_loss=0.2599, pruned_loss=0.04505, over 24028.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2475, pruned_loss=0.04749, over 4729804.67 frames. ], batch size: 80, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 07:00:11,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:00:13,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:00:15,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:00:18,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:00:19,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:00:21,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=791106.6666666666, ans=0.125 2023-10-02 07:00:22,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:00:22,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:00:25,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:00:28,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:29,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:31,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:00:32,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:00:32,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 07:00:32,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:00:34,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=791173.3333333334, ans=0.1 2023-10-02 07:00:35,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:00:39,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 07:00:41,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 07:00:43,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 07:00:43,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:00:45,214 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.867e+02 2.054e+02 2.379e+02 4.412e+02, threshold=4.108e+02, percent-clipped=1.0 2023-10-02 07:00:45,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:00:45,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:00:45,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:00:52,821 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 07:00:54,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:00:57,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:00:58,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 07:01:00,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 07:01:00,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:01:00,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=791306.6666666666, ans=0.1 2023-10-02 07:01:01,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:01:01,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:01:04,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 07:01:11,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:01:11,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 07:01:11,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:01:11,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:01:13,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:01:13,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 07:01:15,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:01:15,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:01:20,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 07:01:20,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:01:22,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:01:22,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:01:23,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:01:24,780 INFO [train.py:1046] (3/4) Epoch 23, batch 1850, loss[loss=0.1737, simple_loss=0.2494, pruned_loss=0.04896, over 23933.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2481, pruned_loss=0.04793, over 4724349.39 frames. ], batch size: 180, lr: 4.48e-03, grad_scale: 8.0 2023-10-02 07:01:24,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:01:24,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:01:27,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:01:27,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:01:31,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:01:32,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:01:38,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=791506.6666666666, ans=0.0 2023-10-02 07:01:39,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:01:39,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 07:01:41,788 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.28 vs. limit=6.0 2023-10-02 07:01:42,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 07:01:43,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 07:01:47,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:01:47,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 07:01:47,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 07:01:57,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:01:59,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 07:02:00,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:02:00,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:02:00,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=791573.3333333334, ans=0.1 2023-10-02 07:02:06,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 07:02:06,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:06,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:02:07,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:02:10,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:02:11,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:02:14,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=791640.0, ans=0.125 2023-10-02 07:02:16,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:02:16,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:16,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 07:02:16,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:02:18,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=791640.0, ans=0.125 2023-10-02 07:02:19,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:02:20,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:02:24,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 07:02:24,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:02:24,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=791706.6666666666, ans=0.5 2023-10-02 07:02:27,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:02:27,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:02:27,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 07:02:28,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 07:02:30,396 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 07:02:31,798 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 07:02:33,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:02:34,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:02:34,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:02:34,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:35,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 07:02:35,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:02:35,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:37,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:02:37,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:02:38,711 INFO [train.py:1046] (3/4) Epoch 23, batch 1900, loss[loss=0.1553, simple_loss=0.2311, pruned_loss=0.03974, over 24324.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2487, pruned_loss=0.04808, over 4732876.94 frames. ], batch size: 61, lr: 4.48e-03, grad_scale: 8.0 2023-10-02 07:02:38,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:02:38,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 07:02:40,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:02:41,343 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 07:02:41,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:02:42,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:02:48,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:02:50,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:02:51,034 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.39 vs. limit=15.0 2023-10-02 07:02:52,261 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 07:02:52,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 07:02:53,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:02:55,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:02:55,079 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 07:02:55,115 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 07:02:58,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 07:03:01,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:03:05,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 07:03:05,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 07:03:13,810 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.937e+02 2.220e+02 2.601e+02 3.701e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-02 07:03:13,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 07:03:17,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 07:03:18,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:03:20,168 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 07:03:20,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 07:03:20,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 07:03:22,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 07:03:22,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:03:24,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 07:03:29,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:03:30,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:03:30,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 07:03:32,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:03:35,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 07:03:35,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:03:41,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:03:41,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:03:42,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:03:42,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:03:45,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:03:45,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:03:47,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:03:51,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:03:51,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:03:52,943 INFO [train.py:1046] (3/4) Epoch 23, batch 1950, loss[loss=0.1534, simple_loss=0.2301, pruned_loss=0.03832, over 24620.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2493, pruned_loss=0.04822, over 4732548.67 frames. ], batch size: 60, lr: 4.48e-03, grad_scale: 8.0 2023-10-02 07:03:54,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:03:54,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:03:54,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:03:56,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:03:59,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:04:01,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:04:01,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:01,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:04:03,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 07:04:05,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 07:04:05,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:05,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:09,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:04:09,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:09,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=792173.3333333334, ans=0.125 2023-10-02 07:04:10,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:12,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:04:13,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:04:13,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:04:13,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:04:13,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:17,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:21,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:04:21,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:21,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:04:21,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 07:04:21,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:04:21,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:04:22,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:26,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:04:29,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:04:30,290 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.32 vs. limit=22.5 2023-10-02 07:04:34,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:04:37,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:04:37,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:04:37,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 07:04:37,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:04:41,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:04:41,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:04:41,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=792306.6666666666, ans=0.125 2023-10-02 07:04:42,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:04:43,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=792306.6666666666, ans=0.125 2023-10-02 07:04:44,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=792306.6666666666, ans=0.125 2023-10-02 07:04:50,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:50,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:50,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=792306.6666666666, ans=0.125 2023-10-02 07:04:52,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:04:54,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:57,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:04:58,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:04:59,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 07:04:59,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:05:00,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:05:00,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 07:05:03,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:05:06,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:05:07,951 INFO [train.py:1046] (3/4) Epoch 23, batch 2000, loss[loss=0.1711, simple_loss=0.2594, pruned_loss=0.04146, over 24434.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.25, pruned_loss=0.04849, over 4740146.21 frames. ], batch size: 69, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 07:05:07,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:05:08,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:05:08,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=792440.0, ans=0.0 2023-10-02 07:05:09,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:05:11,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=792440.0, ans=0.025 2023-10-02 07:05:12,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:13,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=792440.0, ans=0.125 2023-10-02 07:05:14,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 07:05:14,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:05:17,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:05:19,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 07:05:21,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:05:21,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:05:22,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:05:22,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=792506.6666666666, ans=0.1 2023-10-02 07:05:26,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 07:05:27,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:28,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=792506.6666666666, ans=0.125 2023-10-02 07:05:30,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:30,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:30,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 07:05:31,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:05:33,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 07:05:33,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:05:34,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=792506.6666666666, ans=0.5 2023-10-02 07:05:37,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:05:39,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 07:05:39,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:39,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:05:40,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:05:40,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 07:05:43,150 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.945e+02 2.106e+02 2.300e+02 3.133e+02, threshold=4.213e+02, percent-clipped=0.0 2023-10-02 07:05:44,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 07:05:44,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:05:44,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:05:50,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:50,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:05:50,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:05:51,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:05:53,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:05:55,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:55,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:05:55,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:05:56,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:05:59,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:06:01,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 07:06:04,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:06:06,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:06,852 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.95 vs. limit=12.0 2023-10-02 07:06:09,572 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=12.0 2023-10-02 07:06:10,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:10,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:06:13,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:14,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:06:14,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:16,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:06:17,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:06:18,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:20,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:21,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.18 vs. limit=22.5 2023-10-02 07:06:21,946 INFO [train.py:1046] (3/4) Epoch 23, batch 2050, loss[loss=0.1645, simple_loss=0.2481, pruned_loss=0.0405, over 24458.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2492, pruned_loss=0.04834, over 4737954.54 frames. ], batch size: 66, lr: 4.48e-03, grad_scale: 16.0 2023-10-02 07:06:23,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:06:24,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:29,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:06:31,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:06:32,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:06:34,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:06:35,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 07:06:35,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:06:37,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:06:37,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:06:45,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:06:45,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:49,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.04 vs. limit=22.5 2023-10-02 07:06:49,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 07:06:49,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=792906.6666666666, ans=0.0 2023-10-02 07:06:49,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=792906.6666666666, ans=0.1 2023-10-02 07:06:50,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:06:50,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 07:06:51,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:06:55,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:06:58,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:07:00,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:07:02,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:07:03,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:07:04,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:07:04,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:07:06,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:07:09,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:07:11,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:07:11,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:07:15,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:07:19,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:07:21,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 07:07:24,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.16 vs. limit=15.0 2023-10-02 07:07:27,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:07:28,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:07:32,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:07:33,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 07:07:36,277 INFO [train.py:1046] (3/4) Epoch 23, batch 2100, loss[loss=0.1766, simple_loss=0.2432, pruned_loss=0.05505, over 21861.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2475, pruned_loss=0.04781, over 4739013.14 frames. ], batch size: 48, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:07:36,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=793106.6666666666, ans=0.1 2023-10-02 07:07:38,167 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 07:07:38,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:07:38,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:07:39,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:07:42,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:07:42,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 07:07:42,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 07:07:43,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:07:45,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:07:45,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:07:47,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:07:49,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:07:49,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 07:07:49,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:07:49,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 07:07:49,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 07:07:50,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:07:50,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:07:51,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 07:07:51,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 07:07:56,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 07:07:56,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:08:02,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:08:02,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:08:04,118 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.50 vs. limit=15.0 2023-10-02 07:08:06,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:08:06,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 07:08:07,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:07,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 07:08:09,432 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.66 vs. limit=12.0 2023-10-02 07:08:09,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 07:08:09,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:09,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 07:08:09,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 07:08:11,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 07:08:11,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:08:12,673 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.417e+02 1.829e+02 2.013e+02 2.459e+02 4.112e+02, threshold=4.025e+02, percent-clipped=0.0 2023-10-02 07:08:12,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:08:15,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:08:16,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:08:18,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:08:19,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:19,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 07:08:19,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:19,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:21,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:08:21,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 07:08:22,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=793306.6666666666, ans=0.0 2023-10-02 07:08:23,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 07:08:23,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 07:08:28,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:08:32,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:08:32,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 07:08:39,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:41,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:08:42,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:08:42,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:08:42,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 07:08:43,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:08:43,855 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.02 vs. limit=12.0 2023-10-02 07:08:44,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:08:44,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:08:45,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:08:47,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:08:47,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=793373.3333333334, ans=0.0 2023-10-02 07:08:48,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 07:08:50,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 07:08:50,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:08:51,397 INFO [train.py:1046] (3/4) Epoch 23, batch 2150, loss[loss=0.1745, simple_loss=0.2509, pruned_loss=0.04906, over 23683.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2473, pruned_loss=0.04753, over 4744943.10 frames. ], batch size: 149, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:08:53,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:08:53,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:08:53,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:08:54,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:08:56,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=793440.0, ans=0.0 2023-10-02 07:08:58,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 07:09:01,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:09:02,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:05,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:09:05,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:06,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:09:11,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:11,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:09:11,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:09:15,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:15,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 07:09:20,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:09:20,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:09:21,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:21,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:09:22,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:23,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:09:24,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:09:24,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:09:24,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:09:25,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 07:09:27,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:09:28,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:28,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:09:28,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=793573.3333333334, ans=0.125 2023-10-02 07:09:29,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:09:30,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=793573.3333333334, ans=0.1 2023-10-02 07:09:31,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:09:34,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:09:34,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:09:34,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:09:34,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 07:09:34,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:09:38,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:09:40,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:41,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:09:42,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:09:42,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:44,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:44,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 07:09:45,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 07:09:45,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:09:46,986 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 07:09:48,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:48,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:09:48,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 07:09:48,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:09:48,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 07:09:48,465 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 07:09:48,465 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 07:09:48,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 07:09:48,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=793640.0, ans=0.125 2023-10-02 07:09:51,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:51,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:09:51,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:09:51,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:52,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:09:52,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:09:54,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:09:59,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=793706.6666666666, ans=0.125 2023-10-02 07:10:01,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:10:02,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 07:10:06,056 INFO [train.py:1046] (3/4) Epoch 23, batch 2200, loss[loss=0.2025, simple_loss=0.2637, pruned_loss=0.0706, over 19545.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2475, pruned_loss=0.04755, over 4747562.66 frames. ], batch size: 388, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:10:08,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:10:12,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:10:12,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:10:14,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:10:14,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:10:17,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:10:17,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:10:17,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 07:10:22,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 07:10:24,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:10:28,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 07:10:31,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:10:32,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:10:34,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:10:37,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:10:37,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 07:10:42,796 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.834e+02 1.960e+02 2.207e+02 4.164e+02, threshold=3.921e+02, percent-clipped=1.0 2023-10-02 07:10:42,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:10:44,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:10:44,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 07:10:45,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=793906.6666666666, ans=0.125 2023-10-02 07:10:48,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:10:49,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:10:51,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:10:52,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:10:54,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 07:10:55,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:10:58,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 07:10:59,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:10:59,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:11:00,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:11:02,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:11:02,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:11:02,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:11:02,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:11:03,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:11:05,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:11:07,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:11:10,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 07:11:10,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:11:13,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:11:14,937 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 07:11:16,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:11:16,442 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 07:11:17,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:11:19,185 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 07:11:20,456 INFO [train.py:1046] (3/4) Epoch 23, batch 2250, loss[loss=0.1912, simple_loss=0.2696, pruned_loss=0.05643, over 24288.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2481, pruned_loss=0.04804, over 4731695.41 frames. ], batch size: 77, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:11:21,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:11:21,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:11:23,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:11:24,724 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 07:11:26,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:11:28,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:11:34,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:11:34,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:11:37,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:11:37,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=794173.3333333334, ans=0.5 2023-10-02 07:11:39,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:11:40,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:11:42,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 07:11:42,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:11:42,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:11:44,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 07:11:44,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:11:44,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:11:46,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:11:49,468 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.34 vs. limit=10.0 2023-10-02 07:11:51,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:11:51,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:11:52,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:11:54,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 07:11:55,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:11:57,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:11:57,871 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.45 vs. limit=15.0 2023-10-02 07:12:00,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:12:02,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:12:04,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:12:04,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:12:07,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:12:09,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:12:11,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=794306.6666666666, ans=10.0 2023-10-02 07:12:13,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:12:17,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:12:22,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:12:22,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:12:22,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:12:27,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:12:29,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:12:29,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 07:12:29,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:12:31,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:12:33,767 INFO [train.py:1046] (3/4) Epoch 23, batch 2300, loss[loss=0.1733, simple_loss=0.2427, pruned_loss=0.05192, over 23581.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2484, pruned_loss=0.04784, over 4731461.45 frames. ], batch size: 134, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:12:33,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 07:12:36,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:12:36,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:12:42,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:12:44,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:12:45,748 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 07:12:47,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:12:55,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:12:55,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 07:12:55,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:12:55,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=794506.6666666666, ans=0.0 2023-10-02 07:12:55,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.76 vs. limit=6.0 2023-10-02 07:12:56,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:12:56,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 07:12:56,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:12:59,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:13:00,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:13:03,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:13:04,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:13:07,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:13:07,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=794573.3333333334, ans=0.125 2023-10-02 07:13:09,518 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.881e+02 2.032e+02 2.329e+02 3.115e+02, threshold=4.065e+02, percent-clipped=0.0 2023-10-02 07:13:11,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=794573.3333333334, ans=0.0 2023-10-02 07:13:14,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:13:14,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:13:17,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:13:20,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:13:23,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:13:23,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:13:25,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:13:25,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 07:13:28,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:13:28,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:13:29,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:13:29,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:13:29,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:13:30,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 07:13:30,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:13:32,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 07:13:32,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:13:32,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:13:33,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 07:13:37,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:13:40,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:13:43,443 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.55 vs. limit=12.0 2023-10-02 07:13:45,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:13:47,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:13:47,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:13:47,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=794773.3333333334, ans=0.0 2023-10-02 07:13:48,592 INFO [train.py:1046] (3/4) Epoch 23, batch 2350, loss[loss=0.1668, simple_loss=0.2579, pruned_loss=0.03787, over 24422.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2499, pruned_loss=0.04842, over 4724226.63 frames. ], batch size: 69, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:13:48,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:13:48,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:13:48,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:13:50,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 07:13:56,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:13:56,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 07:13:59,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=794773.3333333334, ans=0.0 2023-10-02 07:14:01,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 07:14:04,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:14:08,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:14:08,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:14:08,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:14:08,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:14:10,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 07:14:12,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:14:12,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=794840.0, ans=0.1 2023-10-02 07:14:17,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 07:14:18,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:14:21,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:14:21,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:14:24,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:14:26,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 07:14:26,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:14:27,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:14:27,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:14:27,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:14:27,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=794906.6666666666, ans=0.0 2023-10-02 07:14:30,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:14:31,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 07:14:33,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:14:34,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:14:34,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:14:38,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 07:14:39,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:14:39,799 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.38 vs. limit=15.0 2023-10-02 07:14:43,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 07:14:43,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:14:44,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=794973.3333333334, ans=0.0 2023-10-02 07:14:47,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 07:14:51,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 07:14:51,837 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:14:52,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:14:52,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:14:52,835 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 07:14:52,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 07:14:54,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 07:14:57,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:15:01,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:15:02,811 INFO [train.py:1046] (3/4) Epoch 23, batch 2400, loss[loss=0.1806, simple_loss=0.2644, pruned_loss=0.04844, over 24411.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2497, pruned_loss=0.04878, over 4700579.32 frames. ], batch size: 77, lr: 4.47e-03, grad_scale: 32.0 2023-10-02 07:15:05,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:15:07,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:15:08,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 07:15:09,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 07:15:16,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:15:16,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:15:18,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 07:15:20,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:15:20,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:15:21,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 07:15:27,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:15:29,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 07:15:33,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:15:36,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 07:15:38,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:15:39,453 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.41 vs. limit=15.0 2023-10-02 07:15:39,697 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.839e+02 2.014e+02 2.319e+02 3.519e+02, threshold=4.028e+02, percent-clipped=0.0 2023-10-02 07:15:41,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:15:42,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:15:44,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 07:15:44,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:15:51,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:15:53,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=795306.6666666666, ans=0.1 2023-10-02 07:15:54,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:15:57,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:15:59,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:15:59,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 07:15:59,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:15:59,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:15:59,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:16:00,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:16:03,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:16:05,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:16:05,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 07:16:06,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 07:16:08,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:16:08,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:16:09,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 07:16:09,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 07:16:09,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 07:16:09,874 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 07:16:11,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 07:16:13,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:16:14,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:16:14,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:16:15,974 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 07:16:17,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:16:17,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=795440.0, ans=0.07 2023-10-02 07:16:18,765 INFO [train.py:1046] (3/4) Epoch 23, batch 2450, loss[loss=0.1759, simple_loss=0.2606, pruned_loss=0.04558, over 24645.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2489, pruned_loss=0.04883, over 4693929.45 frames. ], batch size: 65, lr: 4.47e-03, grad_scale: 32.0 2023-10-02 07:16:18,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:16:22,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:16:22,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:16:26,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:16:26,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:16:27,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 07:16:32,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:16:32,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:16:36,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:16:36,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:16:36,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:16:37,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 07:16:38,255 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:16:41,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:16:44,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:16:44,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:16:46,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=795506.6666666666, ans=0.2 2023-10-02 07:16:47,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:16:47,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:16:49,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:16:49,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:16:52,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 07:16:53,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:16:55,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=795573.3333333334, ans=0.125 2023-10-02 07:17:01,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:17:03,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:17:03,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:17:05,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:17:05,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:17:06,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:17:06,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 07:17:09,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:17:11,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:17:14,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:17:14,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:17:17,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:17:19,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 07:17:19,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:17:20,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:17:20,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 07:17:22,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:17:23,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:17:25,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:17:27,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:17:27,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:17:29,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=795706.6666666666, ans=0.125 2023-10-02 07:17:29,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=795706.6666666666, ans=0.125 2023-10-02 07:17:30,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 07:17:31,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.67 vs. limit=6.0 2023-10-02 07:17:32,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:17:33,345 INFO [train.py:1046] (3/4) Epoch 23, batch 2500, loss[loss=0.1869, simple_loss=0.2558, pruned_loss=0.05897, over 23829.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2477, pruned_loss=0.04828, over 4707332.10 frames. ], batch size: 195, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:17:38,585 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:17:39,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:17:44,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=795773.3333333334, ans=0.125 2023-10-02 07:17:47,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:17:49,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:17:50,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:17:50,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 07:17:56,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:17:58,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:17:58,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 07:17:58,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:17:59,095 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.41 vs. limit=15.0 2023-10-02 07:17:59,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 07:18:00,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:02,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:18:03,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 07:18:03,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:03,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 07:18:04,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:18:09,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:18:09,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:18:11,067 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.892e+02 2.068e+02 2.319e+02 3.851e+02, threshold=4.136e+02, percent-clipped=0.0 2023-10-02 07:18:12,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:18:14,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 07:18:14,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:18:14,928 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:18:16,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:19,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=795973.3333333334, ans=0.0 2023-10-02 07:18:20,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:18:22,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:18:23,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=795973.3333333334, ans=0.125 2023-10-02 07:18:26,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:18:27,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=795973.3333333334, ans=0.1 2023-10-02 07:18:32,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:18:35,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 07:18:35,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:18:35,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:18:36,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:18:36,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:18:38,205 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 07:18:38,206 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 07:18:38,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 07:18:40,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:18:42,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 07:18:42,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 07:18:44,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:18:46,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 07:18:49,413 INFO [train.py:1046] (3/4) Epoch 23, batch 2550, loss[loss=0.1815, simple_loss=0.2494, pruned_loss=0.05678, over 23753.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2481, pruned_loss=0.04808, over 4711001.26 frames. ], batch size: 164, lr: 4.47e-03, grad_scale: 16.0 2023-10-02 07:18:50,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 07:18:52,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:18:53,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:18:55,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:18:56,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:18:58,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 07:18:58,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:19:02,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 07:19:04,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:19:05,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:06,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:19:06,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 07:19:06,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:19:06,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:19:08,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:19:10,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:19:10,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 07:19:11,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=796173.3333333334, ans=0.0 2023-10-02 07:19:12,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:19:12,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:12,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 07:19:24,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:19:30,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:19:30,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:30,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:19:30,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:19:37,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:19:39,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:19:39,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:19:39,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:19:39,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:19:40,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:19:43,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:19:43,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:50,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:19:50,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 07:19:50,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:19:51,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:19:53,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:19:53,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:19:55,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:00,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:20:00,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=796373.3333333334, ans=0.1 2023-10-02 07:20:03,554 INFO [train.py:1046] (3/4) Epoch 23, batch 2600, loss[loss=0.1668, simple_loss=0.2465, pruned_loss=0.04356, over 23615.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2494, pruned_loss=0.04844, over 4719056.75 frames. ], batch size: 134, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:20:03,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:03,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=796440.0, ans=0.125 2023-10-02 07:20:04,257 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.49 vs. limit=15.0 2023-10-02 07:20:06,410 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 07:20:09,143 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 07:20:09,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:20:09,186 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 07:20:09,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 07:20:09,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=796440.0, ans=0.2 2023-10-02 07:20:10,550 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 07:20:12,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:20:12,081 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 07:20:13,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 07:20:15,336 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 07:20:17,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:20:20,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 07:20:22,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 07:20:22,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:20:22,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 07:20:24,961 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 07:20:24,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 07:20:32,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:20:32,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:32,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:20:32,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 07:20:35,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:20:38,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=796573.3333333334, ans=0.2 2023-10-02 07:20:38,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=796573.3333333334, ans=0.1 2023-10-02 07:20:41,116 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.855e+02 2.030e+02 2.251e+02 3.622e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 07:20:41,269 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 07:20:43,652 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.94 vs. limit=6.0 2023-10-02 07:20:44,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=796573.3333333334, ans=10.0 2023-10-02 07:20:47,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:20:48,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:20:50,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 07:20:50,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:20:50,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:20:52,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 07:20:55,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:20:56,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:20:58,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:21:01,142 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 07:21:01,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:21:02,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:21:06,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:21:06,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:21:06,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 07:21:07,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:21:08,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:21:09,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:21:11,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=796706.6666666666, ans=0.125 2023-10-02 07:21:15,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 07:21:15,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:21:17,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:21:18,618 INFO [train.py:1046] (3/4) Epoch 23, batch 2650, loss[loss=0.1701, simple_loss=0.2507, pruned_loss=0.04477, over 24637.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2499, pruned_loss=0.04851, over 4709090.85 frames. ], batch size: 65, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:21:22,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 07:21:22,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:21:24,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:21:24,215 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 07:21:24,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:21:26,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:21:28,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:21:29,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:21:32,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:21:32,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 07:21:32,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:21:32,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:21:37,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 07:21:38,632 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 07:21:41,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:21:44,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 07:21:44,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:21:44,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 07:21:48,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:21:48,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:21:48,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:21:48,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:21:55,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 07:21:55,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 07:21:56,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:21:57,233 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.02 vs. limit=22.5 2023-10-02 07:22:01,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 07:22:01,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:22:03,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:03,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:22:04,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:22:04,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:22:06,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:22:08,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:22:10,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:22:10,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:22:12,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:22:13,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:14,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:22:15,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:17,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:22:17,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:22:20,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:22,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:22:22,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:22,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 07:22:27,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=797040.0, ans=0.1 2023-10-02 07:22:28,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:22:29,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:29,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:29,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=797040.0, ans=0.125 2023-10-02 07:22:30,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:30,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:22:32,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:33,651 INFO [train.py:1046] (3/4) Epoch 23, batch 2700, loss[loss=0.1761, simple_loss=0.2616, pruned_loss=0.04534, over 23969.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.251, pruned_loss=0.04855, over 4707921.22 frames. ], batch size: 80, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:22:35,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:22:35,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 07:22:37,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:22:38,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=797106.6666666666, ans=0.2 2023-10-02 07:22:39,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 07:22:41,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:22:41,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:41,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:22:44,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:22:44,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:22:44,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:22:44,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 07:22:44,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 07:22:44,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:22:45,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:22:46,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:22:46,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:22:47,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=797173.3333333334, ans=0.125 2023-10-02 07:22:47,958 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.87 vs. limit=8.0 2023-10-02 07:22:48,663 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.40 vs. limit=12.0 2023-10-02 07:22:50,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:22:51,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 07:22:52,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:22:59,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:22:59,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:23:04,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:23:04,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:23:04,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:23:04,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=797240.0, ans=0.125 2023-10-02 07:23:05,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:23:08,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:23:09,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:23:09,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:23:09,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:23:11,501 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.838e+02 2.027e+02 2.219e+02 3.329e+02, threshold=4.053e+02, percent-clipped=0.0 2023-10-02 07:23:14,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:23:14,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:23:23,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:23:23,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:23:27,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:23:27,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:23:30,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:23:30,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=797306.6666666666, ans=0.0 2023-10-02 07:23:32,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:23:32,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:23:33,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:23:35,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:23:36,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:23:39,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:23:39,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:23:39,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:23:43,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 07:23:43,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:23:44,842 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.52 vs. limit=15.0 2023-10-02 07:23:45,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=797373.3333333334, ans=0.04949747468305833 2023-10-02 07:23:46,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:23:46,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 07:23:48,024 INFO [train.py:1046] (3/4) Epoch 23, batch 2750, loss[loss=0.1667, simple_loss=0.2519, pruned_loss=0.04077, over 24584.00 frames. ], tot_loss[loss=0.174, simple_loss=0.251, pruned_loss=0.0485, over 4698783.74 frames. ], batch size: 71, lr: 4.46e-03, grad_scale: 16.0 2023-10-02 07:23:49,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 07:23:49,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:23:52,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:23:53,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:23:55,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:23:57,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:23:57,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:00,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:24:00,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:24:00,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:24:00,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:00,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 07:24:00,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:24:02,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:24:02,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=797506.6666666666, ans=0.2 2023-10-02 07:24:06,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 07:24:09,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:24:09,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:09,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:24:09,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:24:10,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:24:12,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:24:12,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:24:14,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:24:16,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:24:16,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:24:18,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:24:19,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:20,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:24:26,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:24:30,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:24:30,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:24:34,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:24:34,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:24:35,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:24:40,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:24:40,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:24:40,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 07:24:45,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:24:47,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 07:24:51,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 07:24:53,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:24:55,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 07:24:55,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:24:57,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:24:58,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 07:24:58,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:25:01,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 07:25:02,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:25:03,240 INFO [train.py:1046] (3/4) Epoch 23, batch 2800, loss[loss=0.1972, simple_loss=0.2784, pruned_loss=0.05799, over 24064.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2494, pruned_loss=0.04784, over 4699239.73 frames. ], batch size: 80, lr: 4.46e-03, grad_scale: 32.0 2023-10-02 07:25:03,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:25:03,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 07:25:03,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:04,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:25:04,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:06,121 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 07:25:06,122 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 07:25:08,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:25:10,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:25:10,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:25:14,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:25:15,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=797773.3333333334, ans=0.1 2023-10-02 07:25:16,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 07:25:17,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 07:25:17,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 07:25:20,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:25:20,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:25:20,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:25:25,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:25:25,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:25:25,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:25:26,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:25:36,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:25:37,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:25:40,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:40,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:25:41,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:25:43,044 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.426e+02 1.852e+02 2.120e+02 2.350e+02 3.658e+02, threshold=4.240e+02, percent-clipped=0.0 2023-10-02 07:25:47,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:25:47,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 07:25:49,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:25:50,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:25:50,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:25:54,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:25:56,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:25:56,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=797973.3333333334, ans=0.1 2023-10-02 07:25:57,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:26:00,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:26:00,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:26:00,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:26:01,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:26:01,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:26:03,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:26:03,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 07:26:03,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:05,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:26:06,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:07,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 07:26:07,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:26:07,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:26:09,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:26:09,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 07:26:15,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:26:16,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:26:16,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:26:18,367 INFO [train.py:1046] (3/4) Epoch 23, batch 2850, loss[loss=0.1586, simple_loss=0.2335, pruned_loss=0.04184, over 24253.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2479, pruned_loss=0.04761, over 4695180.97 frames. ], batch size: 56, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:26:18,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:26:22,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:26:22,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:26:22,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:26:25,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:26:25,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:26:28,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:26:28,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 07:26:29,324 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.48 vs. limit=22.5 2023-10-02 07:26:35,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 07:26:35,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:26:35,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=798173.3333333334, ans=0.1 2023-10-02 07:26:35,975 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.14 vs. limit=15.0 2023-10-02 07:26:37,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 07:26:37,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:40,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 07:26:40,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 07:26:40,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=798173.3333333334, ans=0.125 2023-10-02 07:26:43,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:26:55,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:26:56,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:26:56,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:26:57,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:26:57,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:26:57,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:27:00,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:27:00,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 07:27:02,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:27:04,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:27:04,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:27:06,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:07,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:27:08,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:27:08,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:11,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:27:11,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:27:13,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:14,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:17,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:27:17,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=798373.3333333334, ans=0.125 2023-10-02 07:27:17,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.15 vs. limit=10.0 2023-10-02 07:27:20,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:27:22,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 07:27:23,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 07:27:23,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:27:24,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:27:24,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 07:27:25,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:27:26,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:27:26,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:27:26,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:27:26,374 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 07:27:27,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.79 vs. limit=10.0 2023-10-02 07:27:27,681 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 07:27:27,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:27:27,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:32,441 INFO [train.py:1046] (3/4) Epoch 23, batch 2900, loss[loss=0.1605, simple_loss=0.229, pruned_loss=0.04604, over 23359.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2486, pruned_loss=0.04804, over 4696757.39 frames. ], batch size: 119, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:27:34,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:27:34,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:27:34,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:27:36,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 07:27:38,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:40,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 07:27:40,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 07:27:42,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:27:42,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:27:44,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:27:47,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:27:48,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=798506.6666666666, ans=0.125 2023-10-02 07:27:51,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:27:51,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:27:54,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:27:54,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 07:27:54,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:27:56,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:27:58,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 07:27:58,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 07:28:03,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:28:03,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 07:28:03,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:28:05,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=798573.3333333334, ans=0.125 2023-10-02 07:28:05,955 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.32 vs. limit=15.0 2023-10-02 07:28:06,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:28:06,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 07:28:08,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:28:10,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:28:13,037 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.877e+02 2.119e+02 2.424e+02 3.198e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-02 07:28:13,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:28:14,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:28:16,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=798640.0, ans=0.025 2023-10-02 07:28:17,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 07:28:17,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 07:28:17,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:28:20,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:28:23,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 07:28:24,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:28:26,055 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.17 vs. limit=22.5 2023-10-02 07:28:28,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:28:29,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=798640.0, ans=0.0 2023-10-02 07:28:31,687 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.77 vs. limit=10.0 2023-10-02 07:28:37,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:28:37,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:28:37,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=798706.6666666666, ans=0.0 2023-10-02 07:28:39,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 07:28:39,788 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.28 vs. limit=6.0 2023-10-02 07:28:43,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:28:43,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 07:28:43,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:28:44,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:28:47,425 INFO [train.py:1046] (3/4) Epoch 23, batch 2950, loss[loss=0.1706, simple_loss=0.2466, pruned_loss=0.04724, over 23399.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2499, pruned_loss=0.04829, over 4703200.09 frames. ], batch size: 105, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:28:50,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:28:52,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 07:28:52,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:28:52,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:28:56,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:28:57,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:28:57,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 07:28:59,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 07:28:59,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:28:59,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:29:04,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=798840.0, ans=0.125 2023-10-02 07:29:05,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:29:06,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:29:07,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:29:09,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:29:12,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:29:12,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:29:14,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:29:15,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:29:15,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:29:17,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=798906.6666666666, ans=0.0 2023-10-02 07:29:17,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=798906.6666666666, ans=0.1 2023-10-02 07:29:18,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 07:29:22,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 07:29:22,353 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 07:29:23,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:29:25,702 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 07:29:27,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 07:29:28,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:29:28,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:29:28,508 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 07:29:28,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:29:31,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 07:29:31,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:29:31,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=798973.3333333334, ans=0.0 2023-10-02 07:29:32,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:29:35,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:29:35,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:29:35,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:29:36,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=798973.3333333334, ans=0.0 2023-10-02 07:29:37,484 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 07:29:37,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:29:37,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 07:29:44,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:29:44,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=798973.3333333334, ans=0.1 2023-10-02 07:29:45,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:29:46,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 07:29:46,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:29:48,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 07:29:50,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:29:52,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:29:52,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:29:53,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:29:53,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 07:29:54,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:29:55,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:29:55,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:29:56,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:29:56,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:29:57,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:29:58,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:29:58,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 07:30:00,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:30:01,202 INFO [train.py:1046] (3/4) Epoch 23, batch 3000, loss[loss=0.1642, simple_loss=0.2353, pruned_loss=0.04661, over 23519.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2497, pruned_loss=0.04814, over 4721309.71 frames. ], batch size: 134, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:30:01,202 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 07:30:13,760 INFO [train.py:1078] (3/4) Epoch 23, validation: loss=0.3132, simple_loss=0.2719, pruned_loss=0.1772, over 1125622.00 frames. 2023-10-02 07:30:13,760 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 07:30:14,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=799106.6666666666, ans=0.0 2023-10-02 07:30:15,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:30:16,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:30:19,311 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 07:30:19,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 07:30:20,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=799106.6666666666, ans=0.125 2023-10-02 07:30:22,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:30:22,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:30:23,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 07:30:23,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:30:28,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=799173.3333333334, ans=0.0 2023-10-02 07:30:30,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:30:33,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=15.0 2023-10-02 07:30:34,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=799173.3333333334, ans=0.1 2023-10-02 07:30:40,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:30:46,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 07:30:49,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:30:52,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:30:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:30:52,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:30:54,302 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.810e+02 1.967e+02 2.227e+02 3.390e+02, threshold=3.934e+02, percent-clipped=0.0 2023-10-02 07:30:54,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:30:55,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 07:30:55,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 07:30:57,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:30:57,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:30:59,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:31:00,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:31:00,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:00,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:31:04,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:31:04,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:31:04,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:31:05,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:31:08,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 07:31:10,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:31:10,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=799306.6666666666, ans=0.125 2023-10-02 07:31:11,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:11,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:31:18,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:18,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:18,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=799373.3333333334, ans=0.1 2023-10-02 07:31:19,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 07:31:21,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 07:31:21,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:31:21,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 07:31:22,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:31:22,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 07:31:25,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:31:27,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:31:28,390 INFO [train.py:1046] (3/4) Epoch 23, batch 3050, loss[loss=0.1543, simple_loss=0.2328, pruned_loss=0.03791, over 24573.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2497, pruned_loss=0.04753, over 4741986.82 frames. ], batch size: 60, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:31:28,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 07:31:28,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 07:31:28,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:31:29,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:31:31,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:31:31,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:31:31,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:32,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:31:34,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 07:31:35,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:31:35,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=799440.0, ans=0.125 2023-10-02 07:31:37,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:31:37,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:31:37,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=799440.0, ans=0.125 2023-10-02 07:31:40,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:42,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 07:31:42,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=799506.6666666666, ans=0.0 2023-10-02 07:31:49,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 07:31:49,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 07:31:51,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:31:53,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:31:57,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:31:57,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:31:58,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:32:01,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:32:01,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:32:02,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:32:02,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:32:02,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:32:02,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:32:05,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:08,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:32:08,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 07:32:08,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=799573.3333333334, ans=0.125 2023-10-02 07:32:09,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:32:09,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:32:11,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:32:12,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:32:14,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:32:14,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:20,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:32:20,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:26,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:26,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:32:26,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:32:29,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:32:29,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:32:31,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:32:32,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 07:32:33,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:32:33,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:33,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 07:32:36,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:40,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:32:41,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=799773.3333333334, ans=0.125 2023-10-02 07:32:42,357 INFO [train.py:1046] (3/4) Epoch 23, batch 3100, loss[loss=0.1485, simple_loss=0.2284, pruned_loss=0.03427, over 20526.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2495, pruned_loss=0.04809, over 4722014.81 frames. ], batch size: 45, lr: 4.46e-03, grad_scale: 8.0 2023-10-02 07:32:42,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:32:46,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 07:32:47,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 07:32:50,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 07:32:51,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 07:32:53,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:32:57,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:32:57,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:32:58,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 07:32:58,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=799840.0, ans=0.125 2023-10-02 07:32:59,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=799840.0, ans=0.0 2023-10-02 07:33:02,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:33:07,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 07:33:12,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 07:33:14,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:14,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:33:14,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:33:15,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 07:33:17,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:33:19,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 07:33:19,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:33:19,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:33:20,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 07:33:22,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:33:23,451 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.429e+02 1.736e+02 1.938e+02 2.240e+02 3.003e+02, threshold=3.876e+02, percent-clipped=0.0 2023-10-02 07:33:23,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:33:25,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 07:33:26,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 07:33:27,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:28,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:33:30,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:33:30,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:30,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:33:31,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:33:31,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:33:37,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:33:37,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:33:37,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:37,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 07:33:42,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:33:44,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 07:33:45,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:33:46,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 07:33:46,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:33:48,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:33:48,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 07:33:59,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=800040.0, ans=0.125 2023-10-02 07:34:00,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 07:34:01,945 INFO [train.py:1046] (3/4) Epoch 23, batch 3150, loss[loss=0.188, simple_loss=0.2654, pruned_loss=0.05529, over 23576.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2488, pruned_loss=0.04785, over 4728793.85 frames. ], batch size: 106, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:34:03,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:03,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:34:04,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:34:04,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:34:05,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=800106.6666666666, ans=0.0 2023-10-02 07:34:06,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 07:34:07,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:07,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 07:34:09,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 07:34:11,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:34:13,312 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 07:34:14,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 07:34:16,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:34:16,206 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 07:34:17,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 07:34:17,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 07:34:19,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 07:34:19,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 07:34:21,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:34:21,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:34:21,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:34:21,322 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:34:24,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 07:34:25,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:26,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:34:27,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:34:28,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:34:29,266 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:34:32,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 07:34:33,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:34:36,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:34:37,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:34:39,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 07:34:39,990 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.66 vs. limit=15.0 2023-10-02 07:34:41,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 07:34:42,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:34:43,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 07:34:43,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 07:34:43,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:34:43,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:34:44,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:34:44,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:34:46,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 07:34:46,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:34:46,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:34:47,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:34:47,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:34:49,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 07:34:49,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:34:51,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 07:34:51,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:34:52,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 07:34:52,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 07:34:52,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=800306.6666666666, ans=0.125 2023-10-02 07:34:54,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:34:55,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:34:57,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 07:34:58,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 07:34:58,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:35:02,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:35:02,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:02,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:35:07,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:35:09,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:10,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 07:35:16,088 INFO [train.py:1046] (3/4) Epoch 23, batch 3200, loss[loss=0.165, simple_loss=0.2443, pruned_loss=0.04288, over 23683.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2471, pruned_loss=0.04748, over 4714895.60 frames. ], batch size: 135, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:35:16,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:35:16,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 07:35:21,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:22,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:35:22,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 07:35:25,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:35:29,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:35:31,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=800506.6666666666, ans=0.125 2023-10-02 07:35:31,635 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.89 vs. limit=15.0 2023-10-02 07:35:32,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:35:38,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:35:49,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 07:35:50,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:35:54,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 07:35:55,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:35:56,716 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.871e+02 2.114e+02 2.432e+02 3.533e+02, threshold=4.228e+02, percent-clipped=0.0 2023-10-02 07:35:58,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:35:58,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:35:58,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:36:04,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 07:36:05,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 07:36:06,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 07:36:10,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 07:36:11,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:36:18,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:36:18,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 07:36:18,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:36:18,754 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 07:36:18,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:36:19,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=800706.6666666666, ans=0.2 2023-10-02 07:36:19,544 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.14 vs. limit=22.5 2023-10-02 07:36:23,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:36:25,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 07:36:26,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 07:36:26,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 07:36:28,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 07:36:29,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:36:30,770 INFO [train.py:1046] (3/4) Epoch 23, batch 3250, loss[loss=0.158, simple_loss=0.2384, pruned_loss=0.03878, over 24617.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2473, pruned_loss=0.04743, over 4718979.97 frames. ], batch size: 60, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:36:31,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=800773.3333333334, ans=0.0 2023-10-02 07:36:32,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:36:32,276 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 07:36:32,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:36:32,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:32,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=800773.3333333334, ans=0.2 2023-10-02 07:36:35,350 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 07:36:36,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:36:39,382 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.22 vs. limit=22.5 2023-10-02 07:36:40,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:36:46,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=800840.0, ans=0.04949747468305833 2023-10-02 07:36:47,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:36:47,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 07:36:48,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:36:48,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:36:48,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:36:50,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:36:50,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=800840.0, ans=0.2 2023-10-02 07:36:52,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:36:52,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=800840.0, ans=0.2 2023-10-02 07:36:55,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:55,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:36:56,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:36:56,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:56,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:36:56,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:36:59,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:36:59,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:37:02,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:37:02,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:37:03,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:37:03,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:37:04,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=800906.6666666666, ans=0.0 2023-10-02 07:37:05,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:37:05,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=800906.6666666666, ans=0.2 2023-10-02 07:37:10,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 07:37:10,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:37:10,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:37:11,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:37:11,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:37:17,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:37:25,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:37:25,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:37:25,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 07:37:25,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:37:25,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 07:37:26,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:37:29,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 07:37:29,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 07:37:30,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:37:30,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:37:30,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:37:32,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 07:37:33,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:37:37,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:37:37,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:37:39,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 07:37:39,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:37:39,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=801040.0, ans=0.125 2023-10-02 07:37:42,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:37:42,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 07:37:45,279 INFO [train.py:1046] (3/4) Epoch 23, batch 3300, loss[loss=0.2194, simple_loss=0.2785, pruned_loss=0.08018, over 19409.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2482, pruned_loss=0.04785, over 4714159.32 frames. ], batch size: 388, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:37:45,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:37:45,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 07:37:48,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 07:37:48,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 07:37:49,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:37:52,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.32 vs. limit=10.0 2023-10-02 07:37:52,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:37:53,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:37:53,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:37:55,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 07:37:55,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:37:58,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:01,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:38:02,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 07:38:03,398 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.86 vs. limit=15.0 2023-10-02 07:38:04,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:38:04,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:07,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:07,344 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 07:38:08,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:38:08,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:38:10,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:38:10,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:38:10,265 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 07:38:14,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:38:14,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:38:15,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:17,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 07:38:19,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 07:38:19,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:20,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:38:20,636 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 07:38:22,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 07:38:22,358 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:38:23,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:38:24,716 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.780e+02 1.961e+02 2.106e+02 2.870e+02, threshold=3.921e+02, percent-clipped=0.0 2023-10-02 07:38:26,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 07:38:28,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:38:29,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 07:38:31,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:38:33,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:38:33,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:33,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:38:35,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:38:35,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=801306.6666666666, ans=0.2 2023-10-02 07:38:37,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:38:37,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:38,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:38:41,379 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 07:38:41,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 07:38:44,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:38:44,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:38:44,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:38:45,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:38:45,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:38:47,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:38:47,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:38:48,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:38:48,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:38:51,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:38:54,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 07:38:56,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:38:56,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:38:59,503 INFO [train.py:1046] (3/4) Epoch 23, batch 3350, loss[loss=0.1956, simple_loss=0.2584, pruned_loss=0.06642, over 23847.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2496, pruned_loss=0.04841, over 4708146.40 frames. ], batch size: 179, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:38:59,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:38:59,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:39:00,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:39:03,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:39:03,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:05,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:39:05,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:05,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=801440.0, ans=0.125 2023-10-02 07:39:06,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:39:09,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:11,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:39:12,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:39:12,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:39:13,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 07:39:16,813 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 07:39:16,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:39:19,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 07:39:19,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 07:39:20,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:39:20,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:39:22,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:22,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 07:39:23,711 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.36 vs. limit=15.0 2023-10-02 07:39:24,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:24,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:39:27,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:29,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:29,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:29,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:39:29,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=801573.3333333334, ans=0.125 2023-10-02 07:39:32,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:39:33,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:34,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:39:37,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:39:39,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:39:42,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:42,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:44,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:39:47,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 07:39:47,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:39:47,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 07:39:49,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:39:49,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 07:39:50,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:39:52,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:39:59,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:40:00,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 07:40:01,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:40:03,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:40:03,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:40:03,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=801706.6666666666, ans=0.95 2023-10-02 07:40:06,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=801706.6666666666, ans=0.125 2023-10-02 07:40:08,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:40:09,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 07:40:11,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:40:11,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:40:12,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:40:12,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 07:40:12,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:40:12,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 07:40:13,875 INFO [train.py:1046] (3/4) Epoch 23, batch 3400, loss[loss=0.1628, simple_loss=0.2503, pruned_loss=0.03768, over 24645.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2501, pruned_loss=0.04832, over 4718685.95 frames. ], batch size: 68, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:40:15,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:40:15,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:40:15,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 07:40:17,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:40:17,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 07:40:20,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 07:40:20,909 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 07:40:20,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:40:25,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:40:26,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:40:26,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:40:28,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:40:34,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:40:34,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 07:40:38,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:40:42,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:40:42,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:40:43,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 07:40:49,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:40:49,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=801906.6666666666, ans=0.2 2023-10-02 07:40:54,708 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.893e+02 2.076e+02 2.373e+02 3.275e+02, threshold=4.152e+02, percent-clipped=0.0 2023-10-02 07:40:54,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 07:41:00,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:41:01,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:41:01,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=801973.3333333334, ans=0.1 2023-10-02 07:41:02,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 07:41:02,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:41:02,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:41:03,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=801973.3333333334, ans=0.125 2023-10-02 07:41:04,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:41:04,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:41:05,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=801973.3333333334, ans=0.0 2023-10-02 07:41:07,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:41:08,788 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:41:11,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:41:11,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:41:17,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:41:18,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 07:41:23,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:41:27,212 INFO [train.py:1046] (3/4) Epoch 23, batch 3450, loss[loss=0.1595, simple_loss=0.2162, pruned_loss=0.05134, over 22763.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2499, pruned_loss=0.04809, over 4716992.87 frames. ], batch size: 322, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:41:27,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 07:41:30,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 07:41:30,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:41:31,377 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.30 vs. limit=10.0 2023-10-02 07:41:32,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:41:34,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 07:41:34,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:41:38,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:41:43,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:41:45,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:41:45,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:41:45,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:41:47,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:41:52,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 07:41:58,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 07:41:58,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 07:41:59,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:42:00,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:42:00,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=802240.0, ans=0.1 2023-10-02 07:42:06,437 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.53 vs. limit=22.5 2023-10-02 07:42:06,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 07:42:07,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:42:11,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:42:12,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:42:12,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 07:42:14,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:42:17,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 07:42:17,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:42:17,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:42:20,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:42:21,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=802306.6666666666, ans=0.125 2023-10-02 07:42:23,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 07:42:25,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:42:30,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:42:32,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:42:35,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:42:38,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=802373.3333333334, ans=0.125 2023-10-02 07:42:40,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:42:40,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:42:40,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:42:41,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:42:43,278 INFO [train.py:1046] (3/4) Epoch 23, batch 3500, loss[loss=0.1802, simple_loss=0.2488, pruned_loss=0.05575, over 23249.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2485, pruned_loss=0.04775, over 4712220.20 frames. ], batch size: 105, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:42:44,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:42:47,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:42:48,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 07:42:49,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 07:42:49,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=802440.0, ans=0.0 2023-10-02 07:42:52,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=802440.0, ans=0.125 2023-10-02 07:42:52,625 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.99 vs. limit=22.5 2023-10-02 07:42:53,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 07:42:56,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=802506.6666666666, ans=0.1 2023-10-02 07:42:57,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:42:57,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 07:43:01,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:43:01,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:43:03,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:43:03,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:43:03,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:43:04,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:04,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:43:05,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 07:43:06,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:06,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:43:06,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=802506.6666666666, ans=0.1 2023-10-02 07:43:09,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:43:09,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=802506.6666666666, ans=0.0 2023-10-02 07:43:13,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:13,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 07:43:14,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:43:16,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:43:17,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=802573.3333333334, ans=0.125 2023-10-02 07:43:18,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:43:19,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:19,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:43:19,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:43:21,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 07:43:21,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=802573.3333333334, ans=0.125 2023-10-02 07:43:22,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 07:43:22,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 07:43:22,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=802573.3333333334, ans=0.125 2023-10-02 07:43:23,844 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.940e+02 2.168e+02 2.529e+02 4.138e+02, threshold=4.336e+02, percent-clipped=0.0 2023-10-02 07:43:23,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:43:24,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:25,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:43:25,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:43:28,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:43:28,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:43:33,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:43:34,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 07:43:34,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 07:43:34,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:43:37,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:43:39,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:43:40,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:43,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 07:43:44,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=802706.6666666666, ans=0.1 2023-10-02 07:43:45,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:43:46,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:43:47,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 07:43:50,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 07:43:51,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:43:53,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:43:53,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:43:53,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:43:56,091 INFO [train.py:1046] (3/4) Epoch 23, batch 3550, loss[loss=0.1924, simple_loss=0.2682, pruned_loss=0.05824, over 23727.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.247, pruned_loss=0.04725, over 4719591.69 frames. ], batch size: 149, lr: 4.45e-03, grad_scale: 8.0 2023-10-02 07:43:56,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=802773.3333333334, ans=0.1 2023-10-02 07:43:57,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:44:05,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:44:05,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=802773.3333333334, ans=0.125 2023-10-02 07:44:07,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 07:44:10,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:44:11,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:44:12,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:44:12,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:44:14,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:44:16,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:44:17,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:44:17,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:44:18,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:44:20,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:44:25,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:44:25,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:44:25,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=802906.6666666666, ans=0.125 2023-10-02 07:44:26,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:44:26,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:44:28,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:44:28,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 07:44:28,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:44:30,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:44:31,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 07:44:35,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:44:36,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:44:37,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:44:40,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 07:44:40,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:44:41,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=802973.3333333334, ans=0.0 2023-10-02 07:44:42,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 07:44:42,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:44:43,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=802973.3333333334, ans=0.125 2023-10-02 07:44:44,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:44:44,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:44:46,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=802973.3333333334, ans=0.2 2023-10-02 07:44:47,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 07:44:49,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:44:49,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=802973.3333333334, ans=0.125 2023-10-02 07:44:55,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:44:56,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 07:44:57,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:45:01,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:45:02,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 07:45:07,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 07:45:07,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:45:09,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:45:10,512 INFO [train.py:1046] (3/4) Epoch 23, batch 3600, loss[loss=0.1614, simple_loss=0.2474, pruned_loss=0.03768, over 24648.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2465, pruned_loss=0.04721, over 4724794.57 frames. ], batch size: 68, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:45:10,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:45:10,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:45:12,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:45:14,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:45:15,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=803106.6666666666, ans=0.125 2023-10-02 07:45:16,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:18,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:45:18,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:45:19,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:19,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 07:45:23,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:45:24,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:26,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:45:29,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:45:30,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=803173.3333333334, ans=0.0 2023-10-02 07:45:32,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:45:32,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:45:32,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 07:45:33,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:45:36,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:45:36,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:45:37,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:45:39,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:45:40,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:45:42,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 07:45:50,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:45:52,130 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.380e+02 1.763e+02 1.994e+02 2.308e+02 3.230e+02, threshold=3.989e+02, percent-clipped=0.0 2023-10-02 07:45:52,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:45:53,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 07:45:55,189 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:45:57,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:46:02,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:46:03,043 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.33 vs. limit=15.0 2023-10-02 07:46:04,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:46:05,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=803306.6666666666, ans=0.0 2023-10-02 07:46:10,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:46:11,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:46:11,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 07:46:12,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 07:46:13,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 07:46:15,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:46:15,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:46:17,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 07:46:17,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:46:18,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:46:18,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:46:20,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 07:46:20,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 07:46:24,919 INFO [train.py:1046] (3/4) Epoch 23, batch 3650, loss[loss=0.1821, simple_loss=0.2516, pruned_loss=0.05628, over 23662.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2474, pruned_loss=0.04752, over 4719178.85 frames. ], batch size: 232, lr: 4.45e-03, grad_scale: 16.0 2023-10-02 07:46:24,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:46:25,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 07:46:27,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 07:46:29,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:46:35,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 07:46:37,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 07:46:40,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:46:40,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 07:46:42,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:46:43,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 07:46:43,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:46:43,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 07:46:44,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 07:46:46,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:46:46,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 07:46:46,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:46:47,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:46:47,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:46:50,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:46:51,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 07:46:53,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 07:46:53,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:46:56,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 07:46:56,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:46:56,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:46:56,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=803573.3333333334, ans=0.2 2023-10-02 07:47:02,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:47:06,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:47:06,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:47:07,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:47:07,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:47:07,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=803573.3333333334, ans=0.125 2023-10-02 07:47:08,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:47:12,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:47:13,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.75 vs. limit=5.0 2023-10-02 07:47:14,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:14,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:47:14,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:47:16,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:47:17,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:47:23,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=803706.6666666666, ans=0.1 2023-10-02 07:47:24,811 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 07:47:29,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:47:29,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:47:30,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:47:31,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:47:33,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 07:47:35,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:36,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 07:47:36,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:47:39,223 INFO [train.py:1046] (3/4) Epoch 23, batch 3700, loss[loss=0.1948, simple_loss=0.2752, pruned_loss=0.0572, over 24059.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2486, pruned_loss=0.04794, over 4716957.39 frames. ], batch size: 86, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:47:39,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:47:42,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.37 vs. limit=15.0 2023-10-02 07:47:42,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:47:42,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:47:45,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:45,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 07:47:45,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:47:46,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 07:47:46,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 07:47:50,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 07:47:52,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:47:52,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:47:54,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 07:47:55,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:47:56,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:47:59,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:48:00,428 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 07:48:08,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:48:08,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 07:48:10,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:48:10,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 07:48:11,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:48:15,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:16,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 07:48:16,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:17,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=803906.6666666666, ans=0.125 2023-10-02 07:48:18,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:48:18,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=803906.6666666666, ans=0.125 2023-10-02 07:48:21,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:21,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:48:22,660 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.421e+02 1.851e+02 2.016e+02 2.342e+02 4.002e+02, threshold=4.032e+02, percent-clipped=1.0 2023-10-02 07:48:24,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 07:48:28,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:48:28,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 07:48:29,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:48:29,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 07:48:36,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:48:36,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:48:39,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:48:39,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 07:48:42,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:48:42,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 07:48:42,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:48:42,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:48:45,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:48:46,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 07:48:47,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 07:48:48,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:48:48,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:48:49,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:48:51,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:48:52,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:48:53,408 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.26 vs. limit=15.0 2023-10-02 07:48:54,812 INFO [train.py:1046] (3/4) Epoch 23, batch 3750, loss[loss=0.1679, simple_loss=0.2344, pruned_loss=0.05067, over 23813.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2488, pruned_loss=0.0479, over 4725066.83 frames. ], batch size: 164, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:48:54,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:48:56,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:48:56,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=804106.6666666666, ans=0.1 2023-10-02 07:48:57,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 07:48:59,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 07:49:02,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 07:49:02,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 07:49:02,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=804106.6666666666, ans=0.035 2023-10-02 07:49:02,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=804106.6666666666, ans=0.125 2023-10-02 07:49:03,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:49:05,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:49:06,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:49:07,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:49:11,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:49:14,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 07:49:15,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:49:18,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:49:21,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:49:23,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 07:49:23,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:49:23,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=804240.0, ans=0.125 2023-10-02 07:49:24,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:49:24,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:49:29,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 07:49:32,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 07:49:33,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:49:33,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:49:35,831 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.75 vs. limit=22.5 2023-10-02 07:49:36,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:49:39,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:49:41,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 07:49:44,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 07:49:46,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=804306.6666666666, ans=0.125 2023-10-02 07:49:48,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:49:51,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:49:52,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:49:55,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:49:59,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 07:50:00,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 07:50:02,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:50:02,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=804373.3333333334, ans=0.125 2023-10-02 07:50:03,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:50:07,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 07:50:08,915 INFO [train.py:1046] (3/4) Epoch 23, batch 3800, loss[loss=0.174, simple_loss=0.2604, pruned_loss=0.04382, over 24086.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2495, pruned_loss=0.04809, over 4730859.41 frames. ], batch size: 80, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:50:14,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:50:17,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:50:19,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 07:50:20,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 07:50:21,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:50:24,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:50:24,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:50:27,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 07:50:27,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:50:28,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:50:29,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:50:29,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:50:30,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:50:32,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 07:50:35,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 07:50:36,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:50:38,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:50:41,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:50:41,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:50:42,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:50:42,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:50:44,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:50:45,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:50:49,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:50:49,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 07:50:52,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:50:53,401 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.883e+02 2.089e+02 2.519e+02 3.424e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 07:50:59,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:51:04,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:51:05,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 07:51:08,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 07:51:08,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:51:10,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:51:10,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:11,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 07:51:14,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=804706.6666666666, ans=0.125 2023-10-02 07:51:16,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 07:51:16,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 07:51:16,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:18,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:51:19,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=804706.6666666666, ans=0.02 2023-10-02 07:51:23,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:51:23,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:51:25,062 INFO [train.py:1046] (3/4) Epoch 23, batch 3850, loss[loss=0.162, simple_loss=0.2443, pruned_loss=0.03981, over 24650.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2479, pruned_loss=0.04798, over 4719684.10 frames. ], batch size: 65, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:51:28,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:51:29,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 07:51:31,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:51:31,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:35,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 07:51:37,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:51:40,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 07:51:41,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 07:51:41,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=804840.0, ans=0.125 2023-10-02 07:51:42,426 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.09 vs. limit=22.5 2023-10-02 07:51:48,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:51:50,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:51:51,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:51:51,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:51:55,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:51:57,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:51:57,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:51:57,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:51:58,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:01,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:01,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:03,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:52:03,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 07:52:05,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 07:52:05,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:52:06,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:09,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:10,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:10,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 07:52:12,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.52 vs. limit=15.0 2023-10-02 07:52:13,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 07:52:14,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:16,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 07:52:17,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 07:52:23,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:23,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=804973.3333333334, ans=0.1 2023-10-02 07:52:24,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:52:28,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:28,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 07:52:30,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 07:52:33,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:33,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:37,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=805040.0, ans=0.1 2023-10-02 07:52:38,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 07:52:38,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 07:52:38,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:39,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:39,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:52:39,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 07:52:39,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:52:40,983 INFO [train.py:1046] (3/4) Epoch 23, batch 3900, loss[loss=0.1828, simple_loss=0.2556, pruned_loss=0.05494, over 23329.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.247, pruned_loss=0.0476, over 4715765.55 frames. ], batch size: 93, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:52:41,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 07:52:41,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:41,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:42,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:52:42,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:42,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=805106.6666666666, ans=0.125 2023-10-02 07:52:43,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:52:45,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:52:45,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:52:45,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:52:45,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 07:52:46,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:52:53,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:52:53,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:52:53,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:52:56,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:52:59,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 07:52:59,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:53:00,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:53:01,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 07:53:01,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:53:01,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 07:53:03,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:53:03,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 07:53:04,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 07:53:04,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=805173.3333333334, ans=0.125 2023-10-02 07:53:09,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:53:11,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:53:11,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:53:11,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:53:12,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=805240.0, ans=0.1 2023-10-02 07:53:14,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:53:16,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:53:19,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:53:19,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:53:21,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:53:24,060 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.850e+02 2.030e+02 2.348e+02 4.332e+02, threshold=4.060e+02, percent-clipped=1.0 2023-10-02 07:53:27,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:53:28,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:53:34,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 07:53:36,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:53:45,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:53:45,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=805373.3333333334, ans=0.125 2023-10-02 07:53:48,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:53:48,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 07:53:48,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=805373.3333333334, ans=15.0 2023-10-02 07:53:49,176 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-10-02 07:53:49,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 07:53:49,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 07:53:51,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 07:53:51,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:53:52,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 07:53:56,308 INFO [train.py:1046] (3/4) Epoch 23, batch 3950, loss[loss=0.1876, simple_loss=0.2618, pruned_loss=0.05672, over 23905.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2472, pruned_loss=0.04743, over 4717195.31 frames. ], batch size: 179, lr: 4.44e-03, grad_scale: 8.0 2023-10-02 07:53:57,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:53:59,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 07:53:59,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:54:02,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:54:04,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:54:06,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=805440.0, ans=0.125 2023-10-02 07:54:10,859 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 07:54:10,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:54:12,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 07:54:12,284 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 07:54:13,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:54:16,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:54:16,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:54:16,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:54:16,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=805506.6666666666, ans=0.2 2023-10-02 07:54:19,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 07:54:22,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:54:23,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 07:54:23,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 07:54:25,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 07:54:26,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 07:54:29,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=805573.3333333334, ans=0.125 2023-10-02 07:54:36,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=805573.3333333334, ans=0.2 2023-10-02 07:54:37,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:54:37,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:54:40,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 07:54:46,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 07:54:46,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 07:54:46,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:54:48,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:54:54,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 07:54:54,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 07:54:55,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:54:55,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:54:55,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 07:54:59,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:54:59,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=805706.6666666666, ans=0.125 2023-10-02 07:55:01,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:55:05,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 07:55:11,791 INFO [train.py:1046] (3/4) Epoch 23, batch 4000, loss[loss=0.1473, simple_loss=0.2299, pruned_loss=0.03231, over 24612.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2474, pruned_loss=0.04732, over 4723534.10 frames. ], batch size: 60, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:55:13,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:55:14,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=805773.3333333334, ans=10.0 2023-10-02 07:55:16,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=805773.3333333334, ans=0.125 2023-10-02 07:55:17,621 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:55:19,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:55:24,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:55:24,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:55:25,355 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.27 vs. limit=15.0 2023-10-02 07:55:26,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:55:26,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 07:55:28,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:55:29,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 07:55:29,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:55:29,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 07:55:30,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:55:33,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 07:55:33,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:55:33,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:55:33,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:55:33,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 07:55:35,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:55:36,525 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 07:55:36,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:55:36,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:55:40,151 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 07:55:40,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=805906.6666666666, ans=0.125 2023-10-02 07:55:41,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 07:55:42,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:55:46,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 07:55:47,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:55:50,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:55:51,001 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 07:55:52,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:55:52,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 07:55:52,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:55:53,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:55:55,527 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.792e+02 1.999e+02 2.170e+02 4.369e+02, threshold=3.999e+02, percent-clipped=1.0 2023-10-02 07:55:55,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 07:55:56,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=805973.3333333334, ans=0.0 2023-10-02 07:55:57,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 07:55:57,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 07:55:57,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:55:59,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 07:55:59,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:56:01,318 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 07:56:05,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=805973.3333333334, ans=0.125 2023-10-02 07:56:06,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 07:56:09,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 07:56:11,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 07:56:13,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:56:14,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:56:14,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:56:19,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:56:20,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 07:56:20,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 07:56:21,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=806040.0, ans=0.0 2023-10-02 07:56:23,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:56:23,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:56:25,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 07:56:26,964 INFO [train.py:1046] (3/4) Epoch 23, batch 4050, loss[loss=0.1821, simple_loss=0.2588, pruned_loss=0.05273, over 23816.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2485, pruned_loss=0.04807, over 4717070.02 frames. ], batch size: 85, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:56:27,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:56:28,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:56:31,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 07:56:32,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:56:34,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 07:56:36,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 07:56:36,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:56:42,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:56:44,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 07:56:46,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=806173.3333333334, ans=0.125 2023-10-02 07:56:47,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 07:56:49,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 07:56:49,424 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 07:56:52,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:56:57,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 07:56:58,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:57:01,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:57:01,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=806240.0, ans=0.125 2023-10-02 07:57:03,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=806240.0, ans=0.1 2023-10-02 07:57:05,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:57:05,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:57:05,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:57:07,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 07:57:10,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 07:57:10,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 07:57:10,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=806306.6666666666, ans=0.0 2023-10-02 07:57:12,016 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.82 vs. limit=15.0 2023-10-02 07:57:12,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:57:14,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 07:57:19,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:57:26,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 07:57:27,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:57:27,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 07:57:28,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=806373.3333333334, ans=0.2 2023-10-02 07:57:29,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 07:57:29,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 07:57:29,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:57:31,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:57:34,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:57:34,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 07:57:35,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=806373.3333333334, ans=0.1 2023-10-02 07:57:39,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.61 vs. limit=15.0 2023-10-02 07:57:39,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 07:57:41,173 INFO [train.py:1046] (3/4) Epoch 23, batch 4100, loss[loss=0.2275, simple_loss=0.2898, pruned_loss=0.08255, over 19793.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2495, pruned_loss=0.04878, over 4706406.95 frames. ], batch size: 388, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:57:42,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 07:57:44,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 07:57:45,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 07:57:45,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:57:46,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:57:46,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:57:47,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:57:47,484 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 07:57:49,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=806440.0, ans=0.125 2023-10-02 07:57:49,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=806440.0, ans=0.07 2023-10-02 07:57:52,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:57:52,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 07:57:52,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:57:54,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 07:57:58,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 07:58:00,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:58:00,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:58:01,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 07:58:01,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:58:01,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=806506.6666666666, ans=0.2 2023-10-02 07:58:03,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 07:58:03,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:58:03,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 07:58:03,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 07:58:05,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:58:07,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 07:58:08,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:58:11,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:58:11,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 07:58:12,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 07:58:13,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 07:58:13,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 07:58:16,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 07:58:18,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 07:58:18,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 07:58:20,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 07:58:20,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:58:22,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:58:25,107 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.962e+02 2.255e+02 2.740e+02 4.048e+02, threshold=4.511e+02, percent-clipped=1.0 2023-10-02 07:58:25,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:58:29,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:58:32,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:58:34,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 07:58:41,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:58:41,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 07:58:41,515 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:58:43,520 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.75 vs. limit=15.0 2023-10-02 07:58:45,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 07:58:46,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 07:58:48,366 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:58:51,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 07:58:51,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=806706.6666666666, ans=0.0 2023-10-02 07:58:53,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 07:58:54,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 07:58:54,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:58:56,620 INFO [train.py:1046] (3/4) Epoch 23, batch 4150, loss[loss=0.1712, simple_loss=0.234, pruned_loss=0.05421, over 23654.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2485, pruned_loss=0.04772, over 4721205.36 frames. ], batch size: 232, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 07:58:57,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 07:58:59,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:58:59,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 07:58:59,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 07:59:00,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 07:59:02,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 07:59:08,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 07:59:08,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:59:10,046 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.49 vs. limit=6.0 2023-10-02 07:59:12,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:59:12,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:59:13,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 07:59:16,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 07:59:16,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 07:59:16,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 07:59:17,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=806840.0, ans=0.125 2023-10-02 07:59:21,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 07:59:26,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:59:27,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 07:59:29,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 07:59:30,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 07:59:30,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 07:59:30,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 07:59:30,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:59:33,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:59:35,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 07:59:38,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 07:59:41,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 07:59:41,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=806973.3333333334, ans=10.0 2023-10-02 07:59:42,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 07:59:44,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 07:59:44,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 07:59:45,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 07:59:47,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 07:59:48,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 07:59:48,642 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 07:59:49,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:59:52,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 07:59:52,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 07:59:52,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 07:59:53,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 07:59:57,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 07:59:58,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 07:59:58,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 07:59:58,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 08:00:00,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 08:00:00,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:00:00,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 08:00:00,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:00:01,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:00:01,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 08:00:03,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:00:08,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:00:09,590 INFO [train.py:1046] (3/4) Epoch 23, batch 4200, loss[loss=0.1637, simple_loss=0.2485, pruned_loss=0.03944, over 24477.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2472, pruned_loss=0.04755, over 4724536.31 frames. ], batch size: 63, lr: 4.44e-03, grad_scale: 16.0 2023-10-02 08:00:09,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 08:00:12,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:00:12,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=807106.6666666666, ans=0.125 2023-10-02 08:00:13,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:00:13,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:00:15,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:00:15,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:00:19,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 08:00:21,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 08:00:23,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:25,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:00:27,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:00:31,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 08:00:31,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:00:32,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:32,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 08:00:32,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:00:34,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:35,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:00:35,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:00:35,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=807173.3333333334, ans=0.0 2023-10-02 08:00:37,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:00:39,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 08:00:39,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:00:43,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:00:44,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:00:47,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:00:48,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:00:51,054 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.783e+02 1.946e+02 2.154e+02 3.104e+02, threshold=3.892e+02, percent-clipped=0.0 2023-10-02 08:00:51,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:00:51,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 08:00:51,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:00:52,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:00:57,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:00:58,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:01:05,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:01:07,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 08:01:11,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:01:14,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:01:14,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:01:16,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 08:01:22,252 INFO [train.py:1046] (3/4) Epoch 23, batch 4250, loss[loss=0.1612, simple_loss=0.2351, pruned_loss=0.0436, over 23924.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2459, pruned_loss=0.047, over 4713834.75 frames. ], batch size: 195, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:01:22,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:01:25,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:01:25,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:01:27,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=807440.0, ans=0.125 2023-10-02 08:01:28,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:01:33,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:01:33,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 08:01:33,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:01:36,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:01:40,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:01:45,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:01:46,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:01:47,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:01:47,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:01:49,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:01:49,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=807506.6666666666, ans=0.0 2023-10-02 08:01:50,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:01:50,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:01:52,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:01:54,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:01:54,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 08:01:59,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 08:01:59,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:01:59,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:02:00,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:02:00,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:02:00,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:02:02,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:02:02,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=807573.3333333334, ans=0.0 2023-10-02 08:02:05,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:02:07,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:02:13,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:02:14,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:02:14,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 08:02:14,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:02:15,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 08:02:17,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:02:18,148 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.41 vs. limit=15.0 2023-10-02 08:02:18,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:02:18,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:02:19,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:02:21,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 08:02:22,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:02:24,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:02:27,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:02:29,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:02:31,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:02:31,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:02:32,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:02:34,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:02:34,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=807773.3333333334, ans=0.125 2023-10-02 08:02:35,935 INFO [train.py:1046] (3/4) Epoch 23, batch 4300, loss[loss=0.1733, simple_loss=0.2426, pruned_loss=0.05202, over 23879.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2464, pruned_loss=0.04702, over 4727579.60 frames. ], batch size: 195, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:02:36,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:02:36,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 08:02:37,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:02:38,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=807773.3333333334, ans=0.125 2023-10-02 08:02:44,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:02:44,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:02:47,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:02:54,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:02:54,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 08:02:56,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:02:57,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:02:58,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:02:58,860 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 08:03:01,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:03:02,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:03:04,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.66 vs. limit=15.0 2023-10-02 08:03:06,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 08:03:06,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:03:06,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 08:03:09,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 08:03:10,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:03:12,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:03:12,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:03:12,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:03:15,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:03:15,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:03:15,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 08:03:17,963 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.923e+02 2.251e+02 2.726e+02 4.370e+02, threshold=4.501e+02, percent-clipped=2.0 2023-10-02 08:03:18,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 08:03:19,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:03:20,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:20,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 08:03:20,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:21,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=807973.3333333334, ans=0.125 2023-10-02 08:03:22,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:03:22,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 08:03:22,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 08:03:22,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 08:03:23,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:03:23,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 08:03:23,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 08:03:27,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:03:28,636 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 08:03:29,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:03:31,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:03:31,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:03:34,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 08:03:35,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=808040.0, ans=0.125 2023-10-02 08:03:36,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:03:36,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:36,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:03:36,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=808040.0, ans=0.2 2023-10-02 08:03:38,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:03:38,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:03:40,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:03:42,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:03:43,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:03:43,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:03:45,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=808040.0, ans=0.125 2023-10-02 08:03:47,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 08:03:47,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:03:49,204 INFO [train.py:1046] (3/4) Epoch 23, batch 4350, loss[loss=0.1613, simple_loss=0.245, pruned_loss=0.03877, over 24491.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2472, pruned_loss=0.04744, over 4712640.38 frames. ], batch size: 63, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:03:53,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:03:56,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:03:59,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:03:59,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:04:01,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=808106.6666666666, ans=0.1 2023-10-02 08:04:02,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:04:04,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:04:04,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=808173.3333333334, ans=0.0 2023-10-02 08:04:07,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:04:07,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:04:10,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:04:11,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:04:11,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=808173.3333333334, ans=0.0 2023-10-02 08:04:14,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:04:17,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 08:04:18,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:04:19,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:22,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:26,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 08:04:27,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=808240.0, ans=0.125 2023-10-02 08:04:28,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:04:32,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:04:33,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=808306.6666666666, ans=0.1 2023-10-02 08:04:37,107 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 08:04:38,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:04:38,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:04:39,870 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 08:04:41,189 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 08:04:41,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:04:41,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:04:42,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:04:42,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:04:43,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:04:44,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:04:47,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 08:04:47,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:47,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:04:48,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:04:48,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=808373.3333333334, ans=15.0 2023-10-02 08:04:49,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 08:04:50,670 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 08:04:50,680 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 08:04:50,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 08:04:54,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:04:54,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:04:54,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:04:55,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:04:57,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 08:05:00,600 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 08:05:00,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:01,934 INFO [train.py:1046] (3/4) Epoch 23, batch 4400, loss[loss=0.1698, simple_loss=0.2568, pruned_loss=0.04143, over 24341.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.247, pruned_loss=0.04723, over 4711843.06 frames. ], batch size: 77, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:05:03,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=808440.0, ans=0.1 2023-10-02 08:05:04,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:05:04,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:06,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:05:09,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 08:05:09,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 08:05:09,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 08:05:09,964 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 08:05:10,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:05:10,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:05:11,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 08:05:11,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=808440.0, ans=0.2 2023-10-02 08:05:13,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:15,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:15,243 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 08:05:19,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:05:19,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 08:05:19,200 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 08:05:21,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 08:05:21,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 08:05:22,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 08:05:22,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:23,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:05:24,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:05:24,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:05:27,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 08:05:27,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 08:05:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:05:28,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:05:28,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:05:30,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:30,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:05:30,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 08:05:32,002 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 08:05:35,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:05:42,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:05:43,848 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.961e+02 2.160e+02 2.507e+02 3.743e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-02 08:05:45,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 08:05:49,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:05:52,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:05:53,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=808640.0, ans=0.125 2023-10-02 08:05:54,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:05:54,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 08:05:54,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:05:54,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:05:54,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:05:56,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:06:00,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 08:06:04,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 08:06:05,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 08:06:05,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:06:05,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 08:06:05,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:06:11,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:06:14,543 INFO [train.py:1046] (3/4) Epoch 23, batch 4450, loss[loss=0.1559, simple_loss=0.2329, pruned_loss=0.0395, over 24655.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2476, pruned_loss=0.04759, over 4713755.58 frames. ], batch size: 60, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:06:14,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 08:06:17,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:06:20,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:06:20,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:06:25,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:06:25,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:06:28,018 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.56 vs. limit=15.0 2023-10-02 08:06:28,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:06:31,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:06:34,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:06:34,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:06:35,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 08:06:35,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:06:36,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:06:37,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:06:37,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:06:39,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:06:45,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:06:45,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:06:46,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:06:47,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:06:47,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:06:48,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=808906.6666666666, ans=0.1 2023-10-02 08:06:52,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 08:06:52,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=808906.6666666666, ans=0.125 2023-10-02 08:06:53,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 08:06:54,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 08:06:54,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:06:56,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:06:56,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 08:06:59,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:07:03,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:07:03,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 08:07:03,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:07:03,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:07:03,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:07:03,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:07:08,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:07:10,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:07:10,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 08:07:13,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:07:15,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:07:16,274 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.21 vs. limit=15.0 2023-10-02 08:07:16,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:07:18,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:07:18,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 08:07:19,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:07:22,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 08:07:23,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:07:24,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=809040.0, ans=0.1 2023-10-02 08:07:26,433 INFO [train.py:1046] (3/4) Epoch 23, batch 4500, loss[loss=0.1748, simple_loss=0.2497, pruned_loss=0.04996, over 23220.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2481, pruned_loss=0.04728, over 4724508.66 frames. ], batch size: 105, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:07:27,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:07:29,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 08:07:29,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 08:07:30,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:07:35,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:07:37,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:07:38,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:07:38,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:07:40,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:07:40,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:07:51,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:07:52,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:07:54,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:07:54,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:07:55,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:08:01,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:08:01,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=809240.0, ans=0.0 2023-10-02 08:08:04,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:08:07,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:08:09,150 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.850e+02 2.045e+02 2.410e+02 3.743e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-02 08:08:10,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:08:10,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 08:08:10,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:10,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=809306.6666666666, ans=0.07 2023-10-02 08:08:12,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:08:15,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:08:15,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:08:17,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:08:17,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 08:08:17,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:08:17,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:23,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:08:24,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:08:26,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:26,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=809373.3333333334, ans=0.125 2023-10-02 08:08:28,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:08:28,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:08:31,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 08:08:32,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 08:08:32,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 08:08:38,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 08:08:39,849 INFO [train.py:1046] (3/4) Epoch 23, batch 4550, loss[loss=0.1639, simple_loss=0.2445, pruned_loss=0.04167, over 24621.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2478, pruned_loss=0.04701, over 4735589.90 frames. ], batch size: 60, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:08:39,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 08:08:41,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:08:41,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=809440.0, ans=0.125 2023-10-02 08:08:44,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:08:45,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:08:45,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=809440.0, ans=0.2 2023-10-02 08:08:47,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:08:48,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=809440.0, ans=0.2 2023-10-02 08:08:50,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:08:50,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=809440.0, ans=0.125 2023-10-02 08:08:52,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:08:54,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:08:54,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:08:54,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:08:57,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:08:58,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:08:59,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:09:03,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 08:09:03,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 08:09:06,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:09:07,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 08:09:09,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=809573.3333333334, ans=0.125 2023-10-02 08:09:10,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 08:09:10,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:09:13,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 08:09:15,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:09:17,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=809573.3333333334, ans=0.125 2023-10-02 08:09:17,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=809573.3333333334, ans=0.125 2023-10-02 08:09:19,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:19,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:20,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:09:23,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 08:09:25,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:09:27,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:27,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:09:29,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:09:29,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 08:09:30,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 08:09:30,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:09:32,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 08:09:34,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 08:09:34,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:09:37,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:09:37,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:09:37,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=809706.6666666666, ans=0.125 2023-10-02 08:09:38,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:38,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:09:38,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:09:39,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 08:09:41,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:09:41,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 08:09:41,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 08:09:41,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:09:41,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 08:09:44,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:09:44,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:09:47,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:09:47,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:09:48,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=809706.6666666666, ans=0.1 2023-10-02 08:09:49,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:09:50,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:09:51,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:09:52,687 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.96 vs. limit=22.5 2023-10-02 08:09:53,333 INFO [train.py:1046] (3/4) Epoch 23, batch 4600, loss[loss=0.1633, simple_loss=0.239, pruned_loss=0.04381, over 24314.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.247, pruned_loss=0.04683, over 4729113.17 frames. ], batch size: 61, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:09:54,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:09:56,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:09:57,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:09:57,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:09:59,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:00,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 08:10:02,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:10:08,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:10:08,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:09,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:16,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 08:10:16,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:20,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:20,745 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=4.68 vs. limit=12.0 2023-10-02 08:10:22,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:10:22,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:25,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=809906.6666666666, ans=0.0 2023-10-02 08:10:29,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 08:10:29,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:10:29,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:10:34,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:36,008 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.884e+02 2.130e+02 2.615e+02 3.495e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-02 08:10:36,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:10:36,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=809973.3333333334, ans=0.1 2023-10-02 08:10:37,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:10:41,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 08:10:42,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:10:47,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:48,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:10:50,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:50,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 08:10:50,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:10:50,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=810040.0, ans=0.0 2023-10-02 08:10:52,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 08:10:52,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:52,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:10:53,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:10:54,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:10:54,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:10:56,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 08:10:56,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 08:10:56,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 08:10:57,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:10:58,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:11:00,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:11:00,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:11:05,883 INFO [train.py:1046] (3/4) Epoch 23, batch 4650, loss[loss=0.1882, simple_loss=0.2694, pruned_loss=0.05349, over 23250.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2465, pruned_loss=0.04659, over 4721217.88 frames. ], batch size: 105, lr: 4.43e-03, grad_scale: 32.0 2023-10-02 08:11:09,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=810106.6666666666, ans=0.0 2023-10-02 08:11:10,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:11:12,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:11:13,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:11:13,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:11:13,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:11:13,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:11:14,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:11:18,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 08:11:20,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=810173.3333333334, ans=0.125 2023-10-02 08:11:21,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:11:23,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 08:11:23,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:11:24,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 08:11:24,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:11:26,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 08:11:26,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 08:11:26,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:27,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:11:28,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=810173.3333333334, ans=0.1 2023-10-02 08:11:31,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:11:31,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:11:31,626 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 08:11:34,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:11:36,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 08:11:38,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:38,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:11:39,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 08:11:41,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:11:42,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:11:45,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=810240.0, ans=0.2 2023-10-02 08:11:46,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:11:50,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:52,545 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.79 vs. limit=15.0 2023-10-02 08:11:52,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:11:54,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:11:55,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:11:58,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 08:11:58,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 08:11:59,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 08:11:59,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 08:12:01,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:12:05,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=810373.3333333334, ans=0.125 2023-10-02 08:12:08,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:12:09,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:12:09,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 08:12:09,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:09,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:12:09,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:12:12,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:12:15,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:12:15,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:12:15,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=810373.3333333334, ans=0.125 2023-10-02 08:12:17,766 INFO [train.py:1046] (3/4) Epoch 23, batch 4700, loss[loss=0.1685, simple_loss=0.261, pruned_loss=0.03805, over 24623.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2475, pruned_loss=0.04685, over 4722462.78 frames. ], batch size: 68, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:12:17,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:12:21,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:12:21,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:12:21,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:12:21,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 08:12:23,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:12:24,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 08:12:31,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:31,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=810506.6666666666, ans=0.125 2023-10-02 08:12:32,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:12:32,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:12:32,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=810506.6666666666, ans=0.0 2023-10-02 08:12:34,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:12:35,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:12:39,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=810506.6666666666, ans=0.0 2023-10-02 08:12:40,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 08:12:41,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 08:12:43,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:44,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:12:45,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:12:48,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:12:48,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=810573.3333333334, ans=0.1 2023-10-02 08:12:52,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:12:54,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 08:12:56,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:12:59,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=810573.3333333334, ans=0.2 2023-10-02 08:13:01,990 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.784e+02 2.018e+02 2.230e+02 2.934e+02, threshold=4.036e+02, percent-clipped=0.0 2023-10-02 08:13:02,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 08:13:03,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:13:06,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:09,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 08:13:10,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:13:13,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:13:15,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 08:13:16,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:16,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:13:16,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=810706.6666666666, ans=0.0 2023-10-02 08:13:19,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:13:19,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:13:19,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 08:13:21,889 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 08:13:24,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:13:25,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:25,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:25,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 08:13:27,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:13:29,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 08:13:31,250 INFO [train.py:1046] (3/4) Epoch 23, batch 4750, loss[loss=0.1857, simple_loss=0.272, pruned_loss=0.04967, over 24566.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2483, pruned_loss=0.04695, over 4719253.50 frames. ], batch size: 71, lr: 4.43e-03, grad_scale: 16.0 2023-10-02 08:13:32,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:13:34,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:13:37,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:13:37,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:13:39,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 08:13:39,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:13:42,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 08:13:43,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:13:43,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:13:44,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:13:50,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 08:13:53,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:13:55,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 08:13:55,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=810840.0, ans=0.0 2023-10-02 08:13:56,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:13:59,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:13:59,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:13:59,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:13:59,322 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 08:14:00,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 08:14:05,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 08:14:08,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:14:10,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:14:12,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:14:12,738 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 08:14:12,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:14:15,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:14:16,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:14:17,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=810973.3333333334, ans=0.2 2023-10-02 08:14:19,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 08:14:21,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 08:14:21,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:14:22,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:14:22,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:14:22,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 08:14:23,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 08:14:26,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 08:14:28,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:14:28,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=811040.0, ans=0.07 2023-10-02 08:14:29,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:14:29,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 08:14:29,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:14:30,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:14:32,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:14:34,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:14:35,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:14:35,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=811040.0, ans=0.125 2023-10-02 08:14:39,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:14:39,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 08:14:41,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 08:14:42,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 08:14:43,876 INFO [train.py:1046] (3/4) Epoch 23, batch 4800, loss[loss=0.1577, simple_loss=0.2278, pruned_loss=0.04377, over 24410.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2504, pruned_loss=0.04795, over 4695011.89 frames. ], batch size: 58, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:14:45,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:14:45,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:14:46,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 08:14:51,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:14:52,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:14:58,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:15:00,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:01,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:01,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 08:15:01,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:15:01,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:15:03,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:15:07,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:07,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:09,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:15:09,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=811173.3333333334, ans=0.125 2023-10-02 08:15:10,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:10,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 08:15:10,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:15:10,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:11,460 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.87 vs. limit=15.0 2023-10-02 08:15:12,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:14,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:15:16,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:15:16,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:15:17,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 08:15:19,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:20,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 08:15:20,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 08:15:22,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:22,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:15:23,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:15:23,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:15:23,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:15:26,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:15:26,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:15:28,385 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.842e+02 1.977e+02 2.264e+02 3.634e+02, threshold=3.953e+02, percent-clipped=0.0 2023-10-02 08:15:31,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:15:32,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:34,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:15:39,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 08:15:39,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:39,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.91 vs. limit=15.0 2023-10-02 08:15:40,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:40,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:15:41,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:15:44,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:15:45,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:15:45,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:46,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:15:46,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:15:47,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:15:51,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:15:51,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:51,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:15:53,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 08:15:56,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 08:15:56,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:56,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:15:56,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:15:56,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:15:57,958 INFO [train.py:1046] (3/4) Epoch 23, batch 4850, loss[loss=0.1492, simple_loss=0.2279, pruned_loss=0.03528, over 24277.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2507, pruned_loss=0.04796, over 4689277.87 frames. ], batch size: 56, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:15:59,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:16:02,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=811440.0, ans=0.125 2023-10-02 08:16:05,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=811440.0, ans=0.125 2023-10-02 08:16:07,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 08:16:09,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:16:15,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:16:15,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:16:16,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:16:16,822 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:16:19,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:16:20,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:16:20,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:16:20,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 08:16:25,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=811573.3333333334, ans=0.1 2023-10-02 08:16:26,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:16:28,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:16:28,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:16:28,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:16:28,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 08:16:32,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:16:32,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:16:35,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:16:35,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 08:16:37,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 08:16:40,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:16:47,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:16:47,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 08:16:48,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:16:48,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:16:50,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:16:52,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 08:16:52,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:16:52,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 08:16:52,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:16:54,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:16:54,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 08:17:02,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:17:08,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:17:08,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:17:11,414 INFO [train.py:1046] (3/4) Epoch 23, batch 4900, loss[loss=0.1555, simple_loss=0.2305, pruned_loss=0.04026, over 21458.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2501, pruned_loss=0.04746, over 4702250.02 frames. ], batch size: 47, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:17:11,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=811773.3333333334, ans=0.125 2023-10-02 08:17:12,360 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.55 vs. limit=15.0 2023-10-02 08:17:14,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 08:17:14,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:17:18,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:17:19,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:17:20,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=811773.3333333334, ans=0.2 2023-10-02 08:17:21,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:17:22,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 08:17:27,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 08:17:31,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 08:17:31,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 08:17:31,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=811840.0, ans=0.125 2023-10-02 08:17:33,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:17:33,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:17:33,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:17:33,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:17:33,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:17:33,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 08:17:38,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 08:17:38,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:17:41,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:17:41,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:17:43,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:17:43,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:17:45,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:17:45,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 08:17:46,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:17:46,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:17:48,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 08:17:48,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 08:17:50,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 08:17:52,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:17:52,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=811906.6666666666, ans=0.05 2023-10-02 08:17:53,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:17:53,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:17:53,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:17:54,802 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.777e+02 1.982e+02 2.157e+02 3.751e+02, threshold=3.964e+02, percent-clipped=0.0 2023-10-02 08:17:55,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 08:17:55,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:17:55,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 08:17:58,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:18:00,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:18:02,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:18:06,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 08:18:06,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:18:07,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 08:18:08,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 08:18:13,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:18:14,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:18:15,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 08:18:15,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:18:15,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:18:15,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=812040.0, ans=0.1 2023-10-02 08:18:17,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:18:21,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:18:21,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:18:21,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:18:21,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 08:18:22,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:18:24,011 INFO [train.py:1046] (3/4) Epoch 23, batch 4950, loss[loss=0.1762, simple_loss=0.2633, pruned_loss=0.04461, over 24340.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2482, pruned_loss=0.04684, over 4707431.74 frames. ], batch size: 74, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:18:25,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:18:25,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:18:28,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 08:18:30,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 08:18:30,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:18:31,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 08:18:31,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:31,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:18:31,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:18:33,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:18:35,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:18:35,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:18:36,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:18:37,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:18:40,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:40,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:18:42,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:18:45,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:47,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:18:48,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:18:49,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:18:51,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:18:52,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 08:18:53,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 08:18:57,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:18:58,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:18:59,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:18:59,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:18:59,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:19:01,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:19:04,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:19:07,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:19:10,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:19:11,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:19:11,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:19:11,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=812306.6666666666, ans=0.09899494936611666 2023-10-02 08:19:13,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 08:19:13,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:19:13,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=812306.6666666666, ans=0.1 2023-10-02 08:19:15,091 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.89 vs. limit=15.0 2023-10-02 08:19:15,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:19:17,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=812306.6666666666, ans=0.125 2023-10-02 08:19:18,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:19:19,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:19:19,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:19:19,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:19:20,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:19:21,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:19:22,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:19:22,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:19:22,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:19:24,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 08:19:30,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:19:31,637 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.70 vs. limit=22.5 2023-10-02 08:19:34,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 08:19:34,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 08:19:36,744 INFO [train.py:1046] (3/4) Epoch 23, batch 5000, loss[loss=0.1586, simple_loss=0.2401, pruned_loss=0.03856, over 24337.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2477, pruned_loss=0.04664, over 4715284.19 frames. ], batch size: 61, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:19:41,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:19:41,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:19:43,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 08:19:43,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 08:19:46,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:19:48,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 08:19:48,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:19:48,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:19:49,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 08:19:50,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:19:52,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:19:53,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 08:19:53,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:19:53,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:19:54,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 08:19:56,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 08:19:57,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:19:58,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 08:19:58,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:19:59,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:19:59,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:19:59,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 08:19:59,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 08:20:02,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 08:20:02,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:20:02,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:20:05,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 08:20:05,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:20:05,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:20:06,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:20:08,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 08:20:10,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 08:20:11,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:20:12,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:20:15,605 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 08:20:19,581 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.897e+02 2.081e+02 2.436e+02 3.526e+02, threshold=4.162e+02, percent-clipped=0.0 2023-10-02 08:20:19,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:20:19,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:20:19,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:22,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 08:20:22,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:20:22,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=812640.0, ans=0.125 2023-10-02 08:20:23,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:20:23,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:20:24,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=812640.0, ans=0.125 2023-10-02 08:20:26,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=812640.0, ans=0.2 2023-10-02 08:20:27,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 08:20:27,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:20:29,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:20:29,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:20:32,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=812706.6666666666, ans=0.0 2023-10-02 08:20:35,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=812706.6666666666, ans=0.125 2023-10-02 08:20:37,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 08:20:42,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:48,077 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.49 vs. limit=22.5 2023-10-02 08:20:48,597 INFO [train.py:1046] (3/4) Epoch 23, batch 5050, loss[loss=0.2016, simple_loss=0.2659, pruned_loss=0.06867, over 23803.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2481, pruned_loss=0.04719, over 4718789.93 frames. ], batch size: 212, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:20:48,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:20:50,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:50,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:20:50,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:20:51,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:20:51,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:20:51,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:55,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:20:55,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 08:20:57,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:20:58,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:20:59,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:21:00,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 08:21:01,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:21:01,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:21:02,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:21:04,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:21:04,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:21:14,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 08:21:14,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:21:15,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:21:16,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 08:21:16,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=812906.6666666666, ans=0.125 2023-10-02 08:21:17,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:21:18,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:21:18,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:21:18,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:21:18,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 08:21:20,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 08:21:21,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:21:23,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:21:27,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:21:29,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 08:21:30,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:21:33,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 08:21:34,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:21:34,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:21:34,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:21:36,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:21:37,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:21:38,497 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.81 vs. limit=15.0 2023-10-02 08:21:39,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:21:41,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:41,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:21:41,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:21:41,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 08:21:43,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:21:44,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:21:47,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:21:47,427 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 08:21:47,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:21:48,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:21:49,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:50,005 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 08:21:52,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:21:52,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 08:21:52,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:54,786 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.31 vs. limit=12.0 2023-10-02 08:21:56,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:21:56,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:21:58,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 08:21:58,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 08:22:01,340 INFO [train.py:1046] (3/4) Epoch 23, batch 5100, loss[loss=0.1664, simple_loss=0.246, pruned_loss=0.04339, over 23522.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2486, pruned_loss=0.0473, over 4725907.96 frames. ], batch size: 134, lr: 4.42e-03, grad_scale: 16.0 2023-10-02 08:22:01,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:22:01,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:22:01,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:22:05,489 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 08:22:07,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:22:10,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 08:22:10,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 08:22:12,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:22:12,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=813106.6666666666, ans=0.125 2023-10-02 08:22:13,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:22:16,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:22:16,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 08:22:16,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 08:22:20,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:22:20,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:22:24,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:22:27,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 08:22:27,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:22:30,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:22:30,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 08:22:32,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:33,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:33,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 08:22:36,146 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 08:22:36,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:36,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 08:22:36,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=813240.0, ans=0.125 2023-10-02 08:22:37,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 08:22:42,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:22:47,181 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.431e+02 1.831e+02 1.978e+02 2.257e+02 3.540e+02, threshold=3.956e+02, percent-clipped=0.0 2023-10-02 08:22:48,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:22:49,414 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.96 vs. limit=15.0 2023-10-02 08:22:51,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 08:22:52,925 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 08:22:52,946 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 08:22:54,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 08:22:54,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:22:55,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 08:22:59,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 08:23:00,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 08:23:00,710 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:23:03,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:23:04,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 08:23:07,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:23:07,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 08:23:13,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:23:14,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:23:14,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:23:15,859 INFO [train.py:1046] (3/4) Epoch 23, batch 5150, loss[loss=0.182, simple_loss=0.2564, pruned_loss=0.05385, over 23508.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2498, pruned_loss=0.04832, over 4708059.81 frames. ], batch size: 285, lr: 4.42e-03, grad_scale: 16.0 2023-10-02 08:23:15,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:23:15,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:23:17,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:23:17,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 08:23:17,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 08:23:17,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 08:23:18,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:23:18,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 08:23:20,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:23:20,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 08:23:21,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:23:24,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:23:28,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:23:28,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 08:23:31,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:23:31,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:23:34,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:23:34,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:23:34,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:23:35,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:23:35,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:23:35,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 08:23:37,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:23:38,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:23:38,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=813506.6666666666, ans=0.07 2023-10-02 08:23:39,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:23:41,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 08:23:43,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:23:49,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:23:52,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 08:23:54,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:23:58,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=813640.0, ans=0.125 2023-10-02 08:23:59,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:24:01,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:24:05,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:06,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:24:08,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 08:24:12,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:24:13,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:24:13,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:24:16,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:18,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:24:19,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 08:24:21,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=813706.6666666666, ans=0.125 2023-10-02 08:24:23,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:24:25,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:24:26,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:24:26,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:24:26,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:24:28,069 INFO [train.py:1046] (3/4) Epoch 23, batch 5200, loss[loss=0.1357, simple_loss=0.2137, pruned_loss=0.02885, over 24435.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2502, pruned_loss=0.04802, over 4714987.93 frames. ], batch size: 58, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:24:28,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:24:28,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:24:28,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:24:30,671 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.26 vs. limit=10.0 2023-10-02 08:24:31,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:24:32,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:24:34,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:24:35,727 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:24:38,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 08:24:39,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:24:39,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:24:40,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=813773.3333333334, ans=0.2 2023-10-02 08:24:42,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:24:42,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:24:44,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:24:44,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 08:24:47,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:24:47,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:49,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 08:24:50,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.25 vs. limit=6.0 2023-10-02 08:24:52,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:24:52,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:24:52,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 08:24:53,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 08:24:54,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=813840.0, ans=0.125 2023-10-02 08:24:56,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 08:24:56,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:24:56,759 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 08:24:56,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:24:59,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:24:59,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:25:00,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 08:25:01,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:25:01,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=813906.6666666666, ans=0.0 2023-10-02 08:25:02,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:25:06,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 08:25:08,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 08:25:08,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 08:25:08,573 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.99 vs. limit=15.0 2023-10-02 08:25:11,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 08:25:11,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:25:13,378 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.905e+02 1.999e+02 2.249e+02 3.561e+02, threshold=3.998e+02, percent-clipped=0.0 2023-10-02 08:25:15,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=813973.3333333334, ans=0.1 2023-10-02 08:25:19,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:25:19,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:25:20,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 08:25:22,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:25:22,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:25:22,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:25:22,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:25:26,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:25:26,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:25:30,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:25:32,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:25:32,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:25:33,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=814040.0, ans=0.2 2023-10-02 08:25:36,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:25:37,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.19 vs. limit=15.0 2023-10-02 08:25:37,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 08:25:37,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:25:37,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:25:39,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:25:39,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:25:40,417 INFO [train.py:1046] (3/4) Epoch 23, batch 5250, loss[loss=0.1729, simple_loss=0.2214, pruned_loss=0.06218, over 19194.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2499, pruned_loss=0.04798, over 4705182.14 frames. ], batch size: 389, lr: 4.42e-03, grad_scale: 32.0 2023-10-02 08:25:40,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:25:43,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:25:44,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=814106.6666666666, ans=0.1 2023-10-02 08:25:45,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:25:47,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:25:48,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:25:52,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:25:54,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:25:55,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:25:56,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:26:00,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 08:26:00,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:26:00,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:26:23,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=814306.6666666666, ans=0.125 2023-10-02 08:26:33,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=814306.6666666666, ans=0.0 2023-10-02 08:26:42,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=814373.3333333334, ans=0.1 2023-10-02 08:26:45,980 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.95 vs. limit=10.0 2023-10-02 08:26:48,279 INFO [train.py:1046] (3/4) Epoch 23, batch 5300, loss[loss=0.1587, simple_loss=0.2379, pruned_loss=0.03975, over 24305.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2488, pruned_loss=0.04771, over 4703659.39 frames. ], batch size: 61, lr: 4.42e-03, grad_scale: 16.0 2023-10-02 08:26:51,448 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.52 vs. limit=15.0 2023-10-02 08:26:53,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=814440.0, ans=0.05 2023-10-02 08:27:02,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:27:02,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 08:27:02,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 08:27:02,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:27:02,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:03,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:03,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:03,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:27:03,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:03,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:27:03,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:27:03,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:27:03,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 08:27:03,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 08:27:03,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 08:27:03,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:27:03,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 08:27:03,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 08:27:03,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:04,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:27:04,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:27:04,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:27:04,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:27:04,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:27:04,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:27:05,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:05,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:27:05,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:27:05,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:27:05,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:05,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:27:05,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 08:27:05,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:27:05,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:27:05,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 08:27:05,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 08:27:06,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:27:06,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:27:06,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 08:27:06,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 08:27:06,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:27:07,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:27:07,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:27:07,344 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 08:27:07,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 08:27:07,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:27:07,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:27:07,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 08:27:07,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 08:27:07,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 08:27:07,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:27:14,102 INFO [train.py:1046] (3/4) Epoch 24, batch 0, loss[loss=0.144, simple_loss=0.2175, pruned_loss=0.03522, over 24310.00 frames. ], tot_loss[loss=0.144, simple_loss=0.2175, pruned_loss=0.03522, over 24310.00 frames. ], batch size: 56, lr: 4.32e-03, grad_scale: 32.0 2023-10-02 08:27:14,103 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 08:27:27,309 INFO [train.py:1078] (3/4) Epoch 24, validation: loss=0.3245, simple_loss=0.2712, pruned_loss=0.1889, over 1125622.00 frames. 2023-10-02 08:27:27,310 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 08:27:28,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 08:27:28,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:27:30,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:27:34,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=814520.0, ans=0.025 2023-10-02 08:27:35,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:27:35,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:27:35,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:36,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 08:27:40,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 08:27:41,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:43,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:47,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:27:47,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:27:48,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:27:48,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:27:49,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 08:27:49,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=814586.6666666666, ans=0.1 2023-10-02 08:27:49,257 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:27:52,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:27:54,450 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.10 vs. limit=15.0 2023-10-02 08:27:56,504 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 1.867e+02 2.102e+02 2.500e+02 3.375e+02, threshold=4.204e+02, percent-clipped=0.0 2023-10-02 08:27:56,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=814653.3333333334, ans=0.125 2023-10-02 08:27:58,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:28:00,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:28:01,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 08:28:04,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:28:04,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:28:05,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:28:09,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:28:13,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:28:18,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 08:28:21,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 08:28:21,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:28:21,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:28:23,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:28:23,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=814720.0, ans=0.125 2023-10-02 08:28:24,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:28:27,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 08:28:28,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:28:29,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:28:33,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:28:35,916 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 08:28:37,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:28:40,398 INFO [train.py:1046] (3/4) Epoch 24, batch 50, loss[loss=0.171, simple_loss=0.2534, pruned_loss=0.0443, over 24443.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2511, pruned_loss=0.04798, over 1070958.15 frames. ], batch size: 69, lr: 4.32e-03, grad_scale: 32.0 2023-10-02 08:28:40,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:28:41,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:28:41,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 08:28:43,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:28:43,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:28:46,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:28:47,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:28:47,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=814853.3333333334, ans=0.0 2023-10-02 08:28:49,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:28:53,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 08:28:53,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:28:59,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:29:00,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 08:29:03,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 08:29:03,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=814920.0, ans=0.0 2023-10-02 08:29:04,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:29:04,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:29:04,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:29:06,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:29:07,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:29:07,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:29:07,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:29:15,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:29:16,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:29:16,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:29:16,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 08:29:18,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:29:19,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:29:19,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 08:29:20,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:29:22,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 08:29:29,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:29:30,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:29:31,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:29:31,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=815053.3333333334, ans=0.125 2023-10-02 08:29:32,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:29:32,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:29:35,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 08:29:37,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 08:29:38,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:29:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:29:39,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:29:41,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:29:41,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 08:29:41,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 08:29:42,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 08:29:46,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:29:46,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:29:46,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 08:29:47,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 08:29:48,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:29:49,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:29:50,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:29:51,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:29:53,011 INFO [train.py:1046] (3/4) Epoch 24, batch 100, loss[loss=0.1864, simple_loss=0.2599, pruned_loss=0.05646, over 23808.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2513, pruned_loss=0.0483, over 1875985.55 frames. ], batch size: 164, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:29:53,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:29:56,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:29:59,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:30:02,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 08:30:02,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:30:05,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:30:05,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:30:06,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:30:06,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:30:06,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:30:07,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 08:30:10,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:30:11,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:30:11,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:30:11,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:30:14,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 08:30:15,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:30:17,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:30:17,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:30:19,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:30:23,759 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.848e+02 2.049e+02 2.242e+02 3.447e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-02 08:30:23,850 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 08:30:23,864 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 08:30:25,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:30:25,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:30:28,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:30:30,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:30:30,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:31,446 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.63 vs. limit=15.0 2023-10-02 08:30:33,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=815320.0, ans=0.0 2023-10-02 08:30:34,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:36,181 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 08:30:37,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 08:30:40,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:30:41,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:30:43,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:44,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=815386.6666666666, ans=0.125 2023-10-02 08:30:47,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:30:50,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:30:51,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:30:54,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:54,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:30:57,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:30:57,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:30:57,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:30:57,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 08:30:59,242 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 08:30:59,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:31:00,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:31:02,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:02,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:02,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 08:31:02,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:31:02,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:31:02,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:02,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:31:03,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:03,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:31:05,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:31:06,142 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.30 vs. limit=15.0 2023-10-02 08:31:06,732 INFO [train.py:1046] (3/4) Epoch 24, batch 150, loss[loss=0.1909, simple_loss=0.2729, pruned_loss=0.05449, over 23709.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.252, pruned_loss=0.04884, over 2504568.48 frames. ], batch size: 85, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:31:08,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:10,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:31:10,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:31:12,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:13,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:31:13,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:16,232 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.13 vs. limit=15.0 2023-10-02 08:31:16,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:31:18,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:24,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 08:31:24,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 08:31:24,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 08:31:26,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:31:26,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:31:28,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:31:30,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:31:30,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:31:30,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:31,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:31:31,653 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 08:31:34,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:31:40,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:31:42,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:31:44,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 08:31:48,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:31:48,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:31:49,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:31:51,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:31:53,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:31:54,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:31:54,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=815720.0, ans=0.125 2023-10-02 08:31:55,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:31:56,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 08:31:56,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=815720.0, ans=0.125 2023-10-02 08:32:01,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:32:02,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:02,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:32:02,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:32:03,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=815786.6666666666, ans=0.1 2023-10-02 08:32:04,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:32:06,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 08:32:09,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:32:10,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:32:11,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=815786.6666666666, ans=0.125 2023-10-02 08:32:12,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:32:13,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:32:13,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 08:32:13,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:32:13,742 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 08:32:14,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=815786.6666666666, ans=0.125 2023-10-02 08:32:15,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:32:19,663 INFO [train.py:1046] (3/4) Epoch 24, batch 200, loss[loss=0.2534, simple_loss=0.3113, pruned_loss=0.09773, over 19607.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.252, pruned_loss=0.0486, over 3006251.62 frames. ], batch size: 388, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:32:19,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:32:19,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:32:21,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 08:32:23,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:32:23,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:23,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=815853.3333333334, ans=0.0 2023-10-02 08:32:25,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 08:32:25,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:32:27,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:28,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:32:33,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:32:34,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:32:34,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:32:49,985 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.435e+02 1.916e+02 2.110e+02 2.574e+02 4.556e+02, threshold=4.220e+02, percent-clipped=1.0 2023-10-02 08:32:54,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:32:54,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:32:56,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:32:56,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:32:57,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 08:32:57,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:32:58,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:00,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:33:00,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:33:00,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:33:02,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 08:33:02,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 08:33:04,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:33:08,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:33:13,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:33:22,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:22,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:33:30,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:32,203 INFO [train.py:1046] (3/4) Epoch 24, batch 250, loss[loss=0.1637, simple_loss=0.2475, pruned_loss=0.03998, over 24475.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2514, pruned_loss=0.04764, over 3383050.48 frames. ], batch size: 63, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:33:32,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 08:33:33,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:33:33,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:33:33,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:33:35,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:33:35,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 08:33:36,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:33:36,608 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 08:33:38,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:41,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:33:42,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:42,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:33:45,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:33:45,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:33:47,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:33:49,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:33:58,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:34:01,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:34:01,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:34:08,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:34:08,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:34:10,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:34:10,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:34:12,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:34:12,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:34:13,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:34:14,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:34:17,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 08:34:17,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:34:19,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:34:19,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:34:19,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:34:20,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:34:22,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:34:22,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:34:23,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:34:24,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:34:24,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:34:27,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:34:32,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:34:33,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:34:39,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:34:39,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=816453.3333333334, ans=0.2 2023-10-02 08:34:41,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:34:44,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 08:34:45,739 INFO [train.py:1046] (3/4) Epoch 24, batch 300, loss[loss=0.1785, simple_loss=0.2697, pruned_loss=0.04362, over 24558.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2501, pruned_loss=0.04729, over 3676902.73 frames. ], batch size: 71, lr: 4.32e-03, grad_scale: 16.0 2023-10-02 08:34:45,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:34:45,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:34:45,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 08:34:45,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:34:47,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:34:47,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 08:34:51,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:34:52,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:34:56,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:34:57,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 08:34:58,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:34:58,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:34:58,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 08:35:00,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:35:02,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=816586.6666666666, ans=0.0 2023-10-02 08:35:04,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:35:09,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:35:09,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 08:35:14,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 08:35:14,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:14,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=816653.3333333334, ans=0.0 2023-10-02 08:35:15,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:35:16,751 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.883e+02 2.136e+02 2.437e+02 4.219e+02, threshold=4.271e+02, percent-clipped=0.0 2023-10-02 08:35:18,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:18,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 08:35:18,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:35:18,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=816653.3333333334, ans=0.125 2023-10-02 08:35:20,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:35:22,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.66 vs. limit=15.0 2023-10-02 08:35:22,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:35:22,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:35:24,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=816653.3333333334, ans=0.125 2023-10-02 08:35:25,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 08:35:25,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 08:35:25,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=816653.3333333334, ans=0.125 2023-10-02 08:35:27,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:35:30,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:30,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=816720.0, ans=0.2 2023-10-02 08:35:31,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 08:35:32,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:35:36,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:35:41,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:35:41,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 08:35:42,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=816720.0, ans=0.0 2023-10-02 08:35:43,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:43,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:35:46,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:48,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:35:48,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 08:35:48,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=816786.6666666666, ans=0.125 2023-10-02 08:35:49,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:35:49,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:35:49,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=816786.6666666666, ans=0.125 2023-10-02 08:35:50,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 08:35:50,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:35:52,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:35:53,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:35:53,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:35:55,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:35:59,417 INFO [train.py:1046] (3/4) Epoch 24, batch 350, loss[loss=0.1859, simple_loss=0.2598, pruned_loss=0.05602, over 23419.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2485, pruned_loss=0.04695, over 3907459.16 frames. ], batch size: 120, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:35:59,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:35:59,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 08:36:02,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:02,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=816853.3333333334, ans=0.0 2023-10-02 08:36:07,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:36:10,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:10,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:13,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 08:36:15,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:36:15,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 08:36:18,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:18,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 08:36:19,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:36:21,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 08:36:22,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:36:24,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:36:25,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:36:26,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:36:26,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:36:28,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:36:28,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:28,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:36:28,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=816986.6666666666, ans=0.125 2023-10-02 08:36:31,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:36:31,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:31,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=816986.6666666666, ans=0.1 2023-10-02 08:36:38,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:36:38,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:36:38,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:36:38,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:44,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 08:36:44,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:36:46,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=817053.3333333334, ans=0.125 2023-10-02 08:36:47,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:36:47,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:36:49,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:36:49,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 08:36:50,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:36:52,096 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 08:36:53,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 08:36:54,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:36:55,652 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.16 vs. limit=15.0 2023-10-02 08:36:57,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:36:57,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 08:37:00,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:03,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:37:03,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:37:03,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=817120.0, ans=0.0 2023-10-02 08:37:05,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:05,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:37:07,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:37:08,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:37:10,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:37:11,553 INFO [train.py:1046] (3/4) Epoch 24, batch 400, loss[loss=0.1774, simple_loss=0.2644, pruned_loss=0.04516, over 24550.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2479, pruned_loss=0.04619, over 4099841.43 frames. ], batch size: 71, lr: 4.31e-03, grad_scale: 32.0 2023-10-02 08:37:11,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 08:37:11,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:11,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:37:15,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:37:15,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:17,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:37:18,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:18,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 08:37:19,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 08:37:19,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:37:21,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 08:37:21,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:21,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=817186.6666666666, ans=0.125 2023-10-02 08:37:25,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:37:25,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:37:25,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 08:37:25,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:37:27,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:37:27,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:37:27,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:37:28,546 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 08:37:29,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 08:37:35,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:37:38,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:37:38,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 08:37:39,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 08:37:42,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:37:44,457 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.820e+02 2.038e+02 2.400e+02 4.228e+02, threshold=4.076e+02, percent-clipped=0.0 2023-10-02 08:37:47,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:37:53,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 08:37:57,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:37:59,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=817386.6666666666, ans=0.2 2023-10-02 08:38:00,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 08:38:00,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=817386.6666666666, ans=0.125 2023-10-02 08:38:01,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:38:03,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:38:04,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 08:38:05,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:38:08,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:38:09,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:38:13,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:38:13,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 08:38:14,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:38:14,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 08:38:18,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:38:18,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:38:19,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 08:38:22,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:38:22,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:38:22,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 08:38:22,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=817453.3333333334, ans=0.1 2023-10-02 08:38:23,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 08:38:24,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:38:24,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:38:25,679 INFO [train.py:1046] (3/4) Epoch 24, batch 450, loss[loss=0.189, simple_loss=0.275, pruned_loss=0.05153, over 24578.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2491, pruned_loss=0.04642, over 4242459.05 frames. ], batch size: 71, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:38:25,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:38:25,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 08:38:25,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:38:27,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:38:29,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:38:38,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:38:38,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:38:38,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 08:38:39,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 08:38:42,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:38:45,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:38:46,555 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.21 vs. limit=22.5 2023-10-02 08:38:47,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:38:50,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:38:52,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:38:55,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 08:38:55,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 08:38:57,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 08:38:58,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:38:59,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:38:59,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:39:02,029 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 08:39:02,036 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 08:39:02,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:39:03,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:39:04,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 08:39:07,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:39:07,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:39:08,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 08:39:10,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 08:39:12,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:39:14,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:39:14,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:39:16,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 08:39:16,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=817720.0, ans=0.125 2023-10-02 08:39:20,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:39:20,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 08:39:21,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 08:39:21,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=817720.0, ans=0.0 2023-10-02 08:39:23,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 08:39:28,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:39:29,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:39:31,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:39:31,386 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 08:39:35,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=817786.6666666666, ans=0.125 2023-10-02 08:39:36,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:39:38,130 INFO [train.py:1046] (3/4) Epoch 24, batch 500, loss[loss=0.1771, simple_loss=0.265, pruned_loss=0.0446, over 24548.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2495, pruned_loss=0.04683, over 4351939.87 frames. ], batch size: 71, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:39:38,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:39:39,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:39:39,590 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 08:39:40,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 08:39:40,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:39:43,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:39:48,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 08:39:50,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:39:50,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=817853.3333333334, ans=0.125 2023-10-02 08:39:52,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:39:52,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:39:53,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:39:55,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=817920.0, ans=0.125 2023-10-02 08:40:02,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:02,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:40:02,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:40:02,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:02,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 08:40:02,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=817920.0, ans=0.125 2023-10-02 08:40:02,822 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.73 vs. limit=15.0 2023-10-02 08:40:03,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 08:40:06,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:40:07,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:40:07,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:40:07,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:08,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 08:40:10,158 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.880e+02 2.084e+02 2.383e+02 5.306e+02, threshold=4.168e+02, percent-clipped=1.0 2023-10-02 08:40:11,693 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 08:40:14,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:40:15,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:17,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:18,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:18,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:40:21,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=818053.3333333334, ans=0.2 2023-10-02 08:40:22,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 08:40:25,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:40:25,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=818053.3333333334, ans=0.125 2023-10-02 08:40:26,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:40:31,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:40:32,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:40:38,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:40:38,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=818120.0, ans=0.1 2023-10-02 08:40:41,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 08:40:41,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:40:41,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:40:45,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 08:40:45,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 08:40:45,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:40:51,341 INFO [train.py:1046] (3/4) Epoch 24, batch 550, loss[loss=0.1632, simple_loss=0.2337, pruned_loss=0.04637, over 23482.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2502, pruned_loss=0.04716, over 4442112.57 frames. ], batch size: 134, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:40:51,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 08:40:52,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 08:40:54,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:40:54,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 08:40:56,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:40:56,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:40:56,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:56,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:40:56,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:40:58,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:40:59,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:41:00,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 08:41:00,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:41:06,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:06,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:41:07,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:41:07,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:41:08,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=818253.3333333334, ans=0.125 2023-10-02 08:41:13,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 08:41:14,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 08:41:14,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:41:18,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.65 vs. limit=15.0 2023-10-02 08:41:20,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:41:20,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:41:22,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:41:25,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:27,423 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 08:41:27,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:41:28,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 08:41:31,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:41:31,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 08:41:31,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:41:33,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:34,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 08:41:35,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 08:41:37,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:41:37,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:41:38,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:41:38,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:41:42,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:41:43,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:41:45,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:41:46,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:48,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 08:41:49,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:41:52,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:41:52,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:41:52,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:41:54,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:41:54,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 08:42:00,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 08:42:02,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=818453.3333333334, ans=0.2 2023-10-02 08:42:03,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 08:42:04,495 INFO [train.py:1046] (3/4) Epoch 24, batch 600, loss[loss=0.1731, simple_loss=0.2576, pruned_loss=0.04427, over 24575.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2503, pruned_loss=0.04742, over 4501081.64 frames. ], batch size: 71, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:42:04,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:42:04,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:42:04,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:42:04,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=818520.0, ans=0.125 2023-10-02 08:42:10,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:42:10,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=818520.0, ans=0.125 2023-10-02 08:42:11,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:42:12,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 08:42:14,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:42:14,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=818520.0, ans=0.125 2023-10-02 08:42:15,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:42:17,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:42:20,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 08:42:20,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:42:28,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 08:42:32,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:42:32,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:42:32,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:42:37,053 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.807e+02 2.027e+02 2.301e+02 3.422e+02, threshold=4.053e+02, percent-clipped=0.0 2023-10-02 08:42:38,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:42:38,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:42:38,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:42:45,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:42:47,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=818720.0, ans=0.125 2023-10-02 08:42:49,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:42:50,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:42:50,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:42:58,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 08:43:03,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:43:03,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:43:06,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 08:43:06,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:43:08,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=818786.6666666666, ans=0.125 2023-10-02 08:43:09,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 08:43:10,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:43:12,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:43:17,585 INFO [train.py:1046] (3/4) Epoch 24, batch 650, loss[loss=0.1755, simple_loss=0.2544, pruned_loss=0.04835, over 23740.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2493, pruned_loss=0.04704, over 4547104.22 frames. ], batch size: 85, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:43:17,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 08:43:19,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 08:43:21,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:43:23,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:43:25,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:26,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 08:43:27,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:43:33,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:43:33,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:43:37,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:43:39,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 08:43:42,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:43:42,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:43:43,284 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.14 vs. limit=15.0 2023-10-02 08:43:46,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:43:46,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 08:43:48,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:43:49,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:50,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:43:50,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:51,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:43:54,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:43:54,548 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 08:43:55,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:43:55,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:43:57,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:43:59,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:44:00,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:44:01,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:44:02,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 08:44:03,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:44:03,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:44:05,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:44:05,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:44:06,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:44:06,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 08:44:07,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 08:44:09,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:09,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:44:09,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:44:09,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:44:11,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:44:12,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=819053.3333333334, ans=0.0 2023-10-02 08:44:17,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:17,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:44:17,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=819120.0, ans=15.0 2023-10-02 08:44:18,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:44:22,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:44:22,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 08:44:23,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:44:28,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:44:28,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:44:28,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=819120.0, ans=0.125 2023-10-02 08:44:29,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:44:29,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:44:31,244 INFO [train.py:1046] (3/4) Epoch 24, batch 700, loss[loss=0.1369, simple_loss=0.1851, pruned_loss=0.04436, over 18960.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2477, pruned_loss=0.04665, over 4594824.84 frames. ], batch size: 388, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:44:34,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 08:44:34,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 08:44:36,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 08:44:36,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:38,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:44:40,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 08:44:42,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:44:45,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:44:47,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:48,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:44:48,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:44:51,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:44:54,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 08:44:54,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:44:56,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 08:44:56,958 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.70 vs. limit=15.0 2023-10-02 08:44:59,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 08:45:03,640 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.854e+02 2.026e+02 2.248e+02 3.229e+02, threshold=4.051e+02, percent-clipped=0.0 2023-10-02 08:45:03,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:45:03,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:45:05,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:45:09,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:45:09,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 08:45:13,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:45:13,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:45:13,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 08:45:19,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:45:20,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:45:22,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=819386.6666666666, ans=0.0 2023-10-02 08:45:23,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:45:27,371 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.03 vs. limit=6.0 2023-10-02 08:45:30,473 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.98 vs. limit=10.0 2023-10-02 08:45:31,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:45:31,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 08:45:34,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 08:45:35,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 08:45:37,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:45:38,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:45:38,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:45:40,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=819453.3333333334, ans=0.2 2023-10-02 08:45:41,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:45:41,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 08:45:44,210 INFO [train.py:1046] (3/4) Epoch 24, batch 750, loss[loss=0.1769, simple_loss=0.2479, pruned_loss=0.05294, over 23604.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2469, pruned_loss=0.04649, over 4630033.71 frames. ], batch size: 256, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:45:45,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 08:45:46,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 08:45:46,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 08:45:48,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 08:45:48,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 08:45:50,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:45:51,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 08:45:52,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:45:52,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:45:54,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:45:56,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:45:56,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:45:56,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:46:00,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:46:01,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:46:03,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:46:06,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:46:07,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:46:07,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 08:46:09,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:46:09,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:46:10,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:46:13,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 08:46:14,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 08:46:14,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:46:16,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 08:46:16,106 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 08:46:16,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 08:46:17,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:46:17,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 08:46:18,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:46:23,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=819653.3333333334, ans=0.1 2023-10-02 08:46:26,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:46:26,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:46:26,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:46:29,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:46:30,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:46:30,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 08:46:31,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:46:31,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=819720.0, ans=0.125 2023-10-02 08:46:32,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 08:46:34,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:46:35,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:46:36,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=819720.0, ans=0.125 2023-10-02 08:46:37,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 08:46:37,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:46:41,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:46:44,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:46:44,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:46:46,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:46:49,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 08:46:49,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:46:51,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:46:53,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:46:53,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:46:55,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:46:55,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 08:46:56,924 INFO [train.py:1046] (3/4) Epoch 24, batch 800, loss[loss=0.1796, simple_loss=0.251, pruned_loss=0.0541, over 23712.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2471, pruned_loss=0.04679, over 4651092.73 frames. ], batch size: 150, lr: 4.31e-03, grad_scale: 32.0 2023-10-02 08:47:03,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:47:03,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:05,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:47:05,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:47:06,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:08,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:08,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:12,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:47:13,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:47:14,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=819920.0, ans=15.0 2023-10-02 08:47:16,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 08:47:17,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:17,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:47:18,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:47:18,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:47:18,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 08:47:18,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:47:18,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 08:47:22,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:22,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=819920.0, ans=0.125 2023-10-02 08:47:24,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:47:27,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:47:27,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:47:30,994 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.828e+02 1.971e+02 2.242e+02 3.409e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-02 08:47:31,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:31,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:32,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=819986.6666666666, ans=0.0 2023-10-02 08:47:37,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:47:38,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:47:38,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 08:47:40,252 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 08:47:40,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 08:47:40,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:47:40,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:47:42,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:47:42,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:47:43,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=820053.3333333334, ans=0.125 2023-10-02 08:47:47,016 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 08:47:47,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 08:47:48,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:47:49,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 08:47:54,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:47:55,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:47:57,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 08:47:57,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=820120.0, ans=0.2 2023-10-02 08:47:58,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:48:01,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 08:48:09,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:48:10,751 INFO [train.py:1046] (3/4) Epoch 24, batch 850, loss[loss=0.1576, simple_loss=0.2405, pruned_loss=0.03737, over 24493.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2479, pruned_loss=0.04693, over 4674937.25 frames. ], batch size: 66, lr: 4.31e-03, grad_scale: 16.0 2023-10-02 08:48:11,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=820186.6666666666, ans=0.0 2023-10-02 08:48:12,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:48:12,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 08:48:13,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:48:13,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:48:15,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 08:48:15,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:48:16,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:48:17,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:18,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=820186.6666666666, ans=0.2 2023-10-02 08:48:19,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:48:19,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:48:20,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 08:48:21,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 08:48:21,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 08:48:23,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:48:23,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:48:26,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:26,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=820253.3333333334, ans=0.0 2023-10-02 08:48:27,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:48:27,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:48:28,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.51 vs. limit=15.0 2023-10-02 08:48:30,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:48:32,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:48:32,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 08:48:34,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=820253.3333333334, ans=0.0 2023-10-02 08:48:35,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=820253.3333333334, ans=0.125 2023-10-02 08:48:37,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 08:48:40,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:48:41,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 08:48:43,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=820320.0, ans=0.1 2023-10-02 08:48:44,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 08:48:44,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 08:48:47,446 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 08:48:47,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:48:47,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:48:47,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 08:48:50,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:51,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:48:51,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 08:48:51,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=820320.0, ans=0.125 2023-10-02 08:48:51,791 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:48:54,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 08:48:55,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:48:57,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:48:57,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:48:58,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:49:00,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 08:49:00,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 08:49:01,999 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.82 vs. limit=15.0 2023-10-02 08:49:03,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=820386.6666666666, ans=0.125 2023-10-02 08:49:04,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:49:04,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:49:04,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:49:04,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:49:06,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:49:10,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:49:11,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 08:49:13,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:49:15,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:49:15,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:49:20,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 08:49:22,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:49:22,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 08:49:23,572 INFO [train.py:1046] (3/4) Epoch 24, batch 900, loss[loss=0.1854, simple_loss=0.2574, pruned_loss=0.0567, over 22739.00 frames. ], tot_loss[loss=0.172, simple_loss=0.249, pruned_loss=0.04752, over 4681652.33 frames. ], batch size: 322, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:49:23,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:49:23,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:49:24,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 08:49:29,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=820520.0, ans=0.2 2023-10-02 08:49:31,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:49:34,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:49:34,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 08:49:37,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:49:37,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 08:49:40,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 08:49:41,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:49:41,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:49:41,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 08:49:41,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:49:47,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=820586.6666666666, ans=0.125 2023-10-02 08:49:51,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:49:51,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:49:52,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:49:55,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:49:55,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=820653.3333333334, ans=0.125 2023-10-02 08:49:57,928 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.25 vs. limit=15.0 2023-10-02 08:49:58,565 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.862e+02 2.065e+02 2.310e+02 4.708e+02, threshold=4.130e+02, percent-clipped=1.0 2023-10-02 08:49:59,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 08:50:02,082 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.07 vs. limit=22.5 2023-10-02 08:50:02,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:50:06,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:50:07,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 08:50:08,583 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 08:50:10,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 08:50:15,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 08:50:15,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:50:16,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:50:22,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:50:22,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:50:22,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 08:50:22,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:50:25,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 08:50:26,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:50:26,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:50:30,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:50:30,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:50:33,458 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.02 vs. limit=12.0 2023-10-02 08:50:34,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 08:50:34,584 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 08:50:35,955 INFO [train.py:1046] (3/4) Epoch 24, batch 950, loss[loss=0.1674, simple_loss=0.2539, pruned_loss=0.04046, over 24449.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2488, pruned_loss=0.04702, over 4701160.16 frames. ], batch size: 69, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:50:36,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 08:50:36,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 08:50:37,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:50:40,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 08:50:46,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:50:48,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:50:48,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:50:50,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:50:51,596 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 08:50:55,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:50:56,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:50:56,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:50:57,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:50:57,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 08:50:58,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:51:00,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:01,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 08:51:01,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:51:07,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:08,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:51:08,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:51:08,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 08:51:09,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 08:51:11,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:51:11,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=820986.6666666666, ans=0.0 2023-10-02 08:51:12,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:51:15,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:51:15,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:51:18,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 08:51:19,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 08:51:19,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 08:51:21,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:51:22,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:22,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:51:26,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 08:51:28,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:51:29,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:51:30,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:30,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 08:51:30,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:51:30,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:51:31,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 08:51:34,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=821120.0, ans=0.125 2023-10-02 08:51:37,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:51:38,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:51:43,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:51:45,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 08:51:45,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 08:51:47,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:51:49,060 INFO [train.py:1046] (3/4) Epoch 24, batch 1000, loss[loss=0.1657, simple_loss=0.252, pruned_loss=0.03965, over 24326.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2481, pruned_loss=0.04697, over 4693359.11 frames. ], batch size: 74, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:51:51,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 08:51:51,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:51:56,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:51:59,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 08:51:59,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 08:52:03,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:03,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:52:03,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=821253.3333333334, ans=0.125 2023-10-02 08:52:05,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:52:09,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 08:52:11,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 08:52:12,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 08:52:12,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:52:14,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 08:52:15,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 08:52:17,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 08:52:18,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:18,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:23,907 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.855e+02 2.029e+02 2.321e+02 3.051e+02, threshold=4.057e+02, percent-clipped=0.0 2023-10-02 08:52:25,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:52:25,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:52:27,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:27,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:27,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 08:52:27,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:52:28,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 08:52:28,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:52:30,582 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 08:52:33,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 08:52:35,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 08:52:35,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.67 vs. limit=6.0 2023-10-02 08:52:37,242 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.87 vs. limit=15.0 2023-10-02 08:52:39,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 08:52:40,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:52:46,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:46,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:52:48,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:52:48,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:52:50,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 08:52:50,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:52:50,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 08:52:52,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 08:52:52,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:52:52,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:52:54,427 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.78 vs. limit=15.0 2023-10-02 08:52:54,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:52:58,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:52:59,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:53:02,889 INFO [train.py:1046] (3/4) Epoch 24, batch 1050, loss[loss=0.175, simple_loss=0.2532, pruned_loss=0.0484, over 24647.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2474, pruned_loss=0.04679, over 4711752.44 frames. ], batch size: 65, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:53:04,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:53:04,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:53:05,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 08:53:07,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:53:09,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:53:11,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 08:53:13,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 08:53:15,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:53:16,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=821586.6666666666, ans=0.0 2023-10-02 08:53:17,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:53:18,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 08:53:18,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 08:53:19,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 08:53:19,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:53:20,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 08:53:21,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:53:21,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 08:53:21,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 08:53:24,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=821586.6666666666, ans=0.02 2023-10-02 08:53:25,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=821586.6666666666, ans=0.2 2023-10-02 08:53:25,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=821586.6666666666, ans=0.0 2023-10-02 08:53:30,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:53:31,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:53:31,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:53:35,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 08:53:35,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 08:53:35,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:53:36,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 08:53:36,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=821653.3333333334, ans=0.035 2023-10-02 08:53:39,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 08:53:41,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:53:44,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_na.min_abs, batch_count=821653.3333333334, ans=0.02 2023-10-02 08:53:45,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 08:53:46,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 08:53:46,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:53:46,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:53:46,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=821720.0, ans=0.0 2023-10-02 08:53:50,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:53:53,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 08:53:56,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 08:53:57,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 08:53:57,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:53:57,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 08:53:59,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 08:54:04,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:54:04,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 08:54:04,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:54:06,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:54:06,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:10,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:10,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 08:54:12,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 08:54:12,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 08:54:12,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 08:54:12,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:54:16,460 INFO [train.py:1046] (3/4) Epoch 24, batch 1100, loss[loss=0.1486, simple_loss=0.2285, pruned_loss=0.03429, over 24429.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2467, pruned_loss=0.04701, over 4703499.49 frames. ], batch size: 58, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:54:16,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:54:21,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:54:25,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 08:54:26,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=821853.3333333334, ans=0.125 2023-10-02 08:54:27,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 08:54:27,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:54:27,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 08:54:27,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=821853.3333333334, ans=0.1 2023-10-02 08:54:29,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:54:30,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=821920.0, ans=0.125 2023-10-02 08:54:32,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 08:54:34,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:54:35,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 08:54:35,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 08:54:37,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 08:54:38,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:54:38,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:54:41,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:54:44,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 08:54:48,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:54:51,384 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.803e+02 1.981e+02 2.331e+02 3.257e+02, threshold=3.962e+02, percent-clipped=0.0 2023-10-02 08:54:51,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 08:54:51,533 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 08:54:51,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:53,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:54,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:54:54,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:54:55,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 08:54:57,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:54:57,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 08:54:57,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:54:58,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:54:58,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 08:55:05,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 08:55:06,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 08:55:07,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:55:14,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 08:55:16,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 08:55:16,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 08:55:18,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:55:21,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:55:21,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:55:22,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 08:55:22,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:55:23,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:55:23,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 08:55:25,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:55:25,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 08:55:27,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:55:27,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:55:28,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:55:29,869 INFO [train.py:1046] (3/4) Epoch 24, batch 1150, loss[loss=0.1889, simple_loss=0.2715, pruned_loss=0.05312, over 24357.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2479, pruned_loss=0.04749, over 4704071.49 frames. ], batch size: 77, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 08:55:33,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:55:35,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:55:37,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:55:37,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:55:37,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 08:55:37,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:55:42,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 08:55:43,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:55:44,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:55:50,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 08:55:52,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:55:54,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:55:56,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:55:56,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 08:55:56,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 08:55:56,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:56:00,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 08:56:02,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:56:03,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:56:14,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:56:20,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:56:20,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 08:56:21,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:21,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:21,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=822386.6666666666, ans=0.1 2023-10-02 08:56:25,969 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 08:56:28,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:33,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=822453.3333333334, ans=0.2 2023-10-02 08:56:34,816 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 08:56:36,017 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 08:56:40,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:56:40,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:56:40,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 08:56:41,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:56:43,606 INFO [train.py:1046] (3/4) Epoch 24, batch 1200, loss[loss=0.1696, simple_loss=0.2475, pruned_loss=0.04582, over 23290.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2482, pruned_loss=0.04749, over 4711886.61 frames. ], batch size: 93, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:56:43,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:56:49,326 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.03 vs. limit=15.0 2023-10-02 08:56:49,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 08:56:49,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 08:56:52,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:56:52,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:56:52,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 08:56:54,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:56:54,656 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.59 vs. limit=15.0 2023-10-02 08:56:55,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 08:56:56,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:56:56,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:56:59,671 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 08:57:02,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 08:57:06,148 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=14.55 vs. limit=15.0 2023-10-02 08:57:07,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:57:09,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 08:57:11,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:57:13,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:57:13,900 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 08:57:13,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:57:14,629 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.74 vs. limit=15.0 2023-10-02 08:57:17,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=822653.3333333334, ans=0.125 2023-10-02 08:57:19,064 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.901e+02 2.150e+02 2.548e+02 4.088e+02, threshold=4.300e+02, percent-clipped=1.0 2023-10-02 08:57:22,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 08:57:22,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:57:23,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 08:57:23,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:57:27,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 08:57:27,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=822720.0, ans=0.125 2023-10-02 08:57:30,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 08:57:31,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:57:31,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:57:31,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:57:33,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 08:57:33,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 08:57:34,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 08:57:35,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 08:57:36,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 08:57:36,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:57:37,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:57:37,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 08:57:40,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:57:40,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:57:43,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:57:46,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 08:57:49,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 08:57:54,548 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 08:57:55,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:57:57,174 INFO [train.py:1046] (3/4) Epoch 24, batch 1250, loss[loss=0.1467, simple_loss=0.2186, pruned_loss=0.03737, over 24293.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2489, pruned_loss=0.04715, over 4723068.90 frames. ], batch size: 56, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:57:57,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 08:58:00,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:58:00,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:58:03,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 08:58:06,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 08:58:08,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:58:10,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 08:58:11,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 08:58:13,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 08:58:17,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 08:58:17,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:58:19,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 08:58:19,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:58:22,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 08:58:25,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 08:58:27,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:58:27,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:58:28,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:58:28,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:31,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:58:34,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 08:58:37,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 08:58:38,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 08:58:40,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:58:41,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 08:58:43,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 08:58:43,106 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 08:58:43,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:44,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:45,008 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.10 vs. limit=12.0 2023-10-02 08:58:47,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:58:50,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 08:58:50,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 08:58:50,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 08:58:51,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 08:58:51,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 08:58:55,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:58:57,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 08:58:57,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:58:57,645 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.02 vs. limit=12.0 2023-10-02 08:58:59,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 08:59:01,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 08:59:01,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=823120.0, ans=0.125 2023-10-02 08:59:02,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 08:59:03,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 08:59:03,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 08:59:03,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 08:59:05,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:59:06,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 08:59:08,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:59:09,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 08:59:11,138 INFO [train.py:1046] (3/4) Epoch 24, batch 1300, loss[loss=0.1716, simple_loss=0.2417, pruned_loss=0.0507, over 23474.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2494, pruned_loss=0.04726, over 4727518.05 frames. ], batch size: 285, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 08:59:11,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 08:59:13,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 08:59:16,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 08:59:16,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 08:59:20,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:59:22,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 08:59:22,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:59:24,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 08:59:24,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=823253.3333333334, ans=0.125 2023-10-02 08:59:26,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 08:59:27,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 08:59:31,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 08:59:32,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 08:59:32,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 08:59:36,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 08:59:40,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:59:40,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 08:59:41,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 08:59:43,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 08:59:43,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 08:59:44,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 08:59:44,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 08:59:46,394 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.906e+02 2.045e+02 2.364e+02 3.578e+02, threshold=4.090e+02, percent-clipped=0.0 2023-10-02 08:59:50,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 08:59:50,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 08:59:52,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 08:59:52,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 08:59:54,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 08:59:56,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 08:59:58,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 08:59:58,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 08:59:58,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 09:00:00,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:00:05,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:00:05,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:00:09,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 09:00:09,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 09:00:10,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 09:00:15,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:00:16,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 09:00:18,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.99 vs. limit=15.0 2023-10-02 09:00:19,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:00:23,803 INFO [train.py:1046] (3/4) Epoch 24, batch 1350, loss[loss=0.1858, simple_loss=0.2454, pruned_loss=0.06305, over 23739.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2488, pruned_loss=0.04757, over 4733087.63 frames. ], batch size: 164, lr: 4.30e-03, grad_scale: 16.0 2023-10-02 09:00:23,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=823520.0, ans=0.125 2023-10-02 09:00:26,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 09:00:30,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:00:30,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:00:33,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:00:34,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:00:34,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:00:36,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:00:38,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=823586.6666666666, ans=0.07 2023-10-02 09:00:42,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:00:43,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 09:00:45,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:00:45,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:00:47,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 09:00:49,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:00:50,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:00:50,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 09:00:52,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 09:00:52,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 09:00:54,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=823653.3333333334, ans=0.1 2023-10-02 09:00:55,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:00:55,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 09:01:01,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=823653.3333333334, ans=0.0 2023-10-02 09:01:06,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:01:07,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=823720.0, ans=0.125 2023-10-02 09:01:15,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:01:15,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:01:15,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 09:01:16,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:01:17,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 09:01:17,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:01:18,276 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.09 vs. limit=15.0 2023-10-02 09:01:19,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:01:21,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:01:25,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 09:01:27,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:01:28,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=823786.6666666666, ans=0.125 2023-10-02 09:01:30,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 09:01:32,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 09:01:36,351 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.90 vs. limit=15.0 2023-10-02 09:01:37,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 09:01:38,731 INFO [train.py:1046] (3/4) Epoch 24, batch 1400, loss[loss=0.1644, simple_loss=0.2504, pruned_loss=0.03918, over 24495.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.247, pruned_loss=0.04722, over 4710380.34 frames. ], batch size: 63, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 09:01:38,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:01:43,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:01:43,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:01:43,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=823853.3333333334, ans=0.125 2023-10-02 09:01:48,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 09:01:50,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 09:01:50,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=823853.3333333334, ans=0.125 2023-10-02 09:01:58,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:02:00,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:02:02,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:02:02,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:02:05,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:02:06,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 09:02:11,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=823986.6666666666, ans=0.125 2023-10-02 09:02:15,580 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.897e+02 2.132e+02 2.372e+02 2.905e+02, threshold=4.263e+02, percent-clipped=0.0 2023-10-02 09:02:15,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:15,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:18,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 09:02:20,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:02:21,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:02:23,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:02:23,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:02:24,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:02:24,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:02:25,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:02:25,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 09:02:25,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:02:30,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:33,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:02:41,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 09:02:42,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 09:02:44,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:02:45,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 09:02:46,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:02:48,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:02:51,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:02:51,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=824186.6666666666, ans=0.1 2023-10-02 09:02:52,960 INFO [train.py:1046] (3/4) Epoch 24, batch 1450, loss[loss=0.1682, simple_loss=0.2466, pruned_loss=0.04491, over 23896.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2461, pruned_loss=0.04704, over 4693889.74 frames. ], batch size: 86, lr: 4.30e-03, grad_scale: 8.0 2023-10-02 09:02:53,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:02:53,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:02:53,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 09:02:57,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:02:59,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:03:00,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:03:00,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 09:03:02,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:03:03,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 09:03:03,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:05,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:05,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 09:03:06,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:03:06,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:03:07,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 09:03:07,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:09,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:03:09,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=824253.3333333334, ans=0.0 2023-10-02 09:03:12,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:13,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:17,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:03:17,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:03:18,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:03:18,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:19,138 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.60 vs. limit=15.0 2023-10-02 09:03:21,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:03:22,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:03:22,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:03:23,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:03:25,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=824320.0, ans=0.2 2023-10-02 09:03:27,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 09:03:28,603 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.44 vs. limit=15.0 2023-10-02 09:03:31,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:03:34,177 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 09:03:34,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=824320.0, ans=0.125 2023-10-02 09:03:35,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:03:36,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:03:38,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:03:40,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 09:03:44,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:03:46,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 09:03:47,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 09:03:48,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=824386.6666666666, ans=15.0 2023-10-02 09:03:48,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:03:48,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=824386.6666666666, ans=0.0 2023-10-02 09:03:53,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:03:54,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:03:55,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 09:03:57,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 09:03:58,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 09:04:00,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:04:00,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:04:06,032 INFO [train.py:1046] (3/4) Epoch 24, batch 1500, loss[loss=0.1722, simple_loss=0.2449, pruned_loss=0.04973, over 23545.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2473, pruned_loss=0.0472, over 4703853.20 frames. ], batch size: 134, lr: 4.29e-03, grad_scale: 8.0 2023-10-02 09:04:12,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 09:04:12,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:04:12,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:04:13,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:04:13,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:04:15,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=824520.0, ans=0.2 2023-10-02 09:04:16,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:04:16,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 09:04:18,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:04:18,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:04:18,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:04:19,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:04:20,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:04:22,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:04:22,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=824586.6666666666, ans=0.125 2023-10-02 09:04:29,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:04:29,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 09:04:30,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:04:30,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:04:32,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:04:33,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=824653.3333333334, ans=0.0 2023-10-02 09:04:34,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 09:04:38,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 09:04:39,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:04:40,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 09:04:42,143 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.830e+02 2.016e+02 2.339e+02 3.423e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 09:04:42,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:04:42,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=824653.3333333334, ans=0.125 2023-10-02 09:04:44,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:04:45,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:04:45,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:04:48,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 09:04:49,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:04:49,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:04:49,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 09:04:49,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:04:54,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:04:54,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 09:04:58,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:05:00,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:05:04,284 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 09:05:04,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:04,347 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 09:05:06,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:05:07,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:05:07,594 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 09:05:08,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:05:11,104 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.53 vs. limit=22.5 2023-10-02 09:05:11,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 09:05:11,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=824786.6666666666, ans=0.0 2023-10-02 09:05:12,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:13,138 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:05:16,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:05:16,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:18,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:05:18,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:05:18,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:05:19,577 INFO [train.py:1046] (3/4) Epoch 24, batch 1550, loss[loss=0.1741, simple_loss=0.2593, pruned_loss=0.04439, over 24630.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2479, pruned_loss=0.04716, over 4705640.58 frames. ], batch size: 68, lr: 4.29e-03, grad_scale: 8.0 2023-10-02 09:05:19,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 09:05:20,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 09:05:21,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:05:21,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=824853.3333333334, ans=0.2 2023-10-02 09:05:22,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 09:05:22,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 09:05:26,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:05:26,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:05:28,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:05:28,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:05:29,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:05:29,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:05:32,636 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 09:05:32,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:05:33,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:05:33,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:05:37,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:05:37,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 09:05:39,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:05:39,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 09:05:41,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 09:05:41,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 09:05:41,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:05:42,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:05:47,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:05:50,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 09:05:50,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 09:05:56,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:05:56,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=824986.6666666666, ans=0.125 2023-10-02 09:05:59,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:06:01,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:06:01,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:06:01,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 09:06:07,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:06:08,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:06:10,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:06:12,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:06:13,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:06:13,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 09:06:13,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:06:14,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:06:15,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:06:17,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 09:06:17,153 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 09:06:20,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:06:23,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 09:06:26,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=825120.0, ans=0.2 2023-10-02 09:06:28,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:06:29,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:06:30,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 09:06:32,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:06:32,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:06:32,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:06:32,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:06:33,670 INFO [train.py:1046] (3/4) Epoch 24, batch 1600, loss[loss=0.1792, simple_loss=0.2642, pruned_loss=0.04708, over 24572.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2486, pruned_loss=0.04764, over 4694691.13 frames. ], batch size: 71, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:06:35,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:06:38,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:06:39,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 09:06:40,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 09:06:42,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 09:06:43,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:06:45,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 09:06:45,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:06:46,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:06:53,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:06:56,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 09:06:59,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:07:00,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 09:07:00,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:01,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 09:07:06,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 09:07:11,393 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.850e+02 2.023e+02 2.271e+02 3.157e+02, threshold=4.047e+02, percent-clipped=0.0 2023-10-02 09:07:12,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:07:12,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 09:07:14,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:07:14,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:07:14,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:07:17,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 09:07:20,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=825386.6666666666, ans=0.125 2023-10-02 09:07:21,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:07:25,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:07:26,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:26,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:28,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:07:29,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:07:31,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:07:32,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:07:39,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:39,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:07:40,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 09:07:40,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:07:42,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 09:07:48,252 INFO [train.py:1046] (3/4) Epoch 24, batch 1650, loss[loss=0.1609, simple_loss=0.2525, pruned_loss=0.03464, over 24659.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2496, pruned_loss=0.04826, over 4683574.37 frames. ], batch size: 73, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:07:48,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:07:49,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:07:51,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:07:51,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 09:07:51,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 09:07:51,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 09:07:51,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 09:07:57,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:07:57,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:07:57,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:07:59,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:07:59,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:08:02,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 09:08:02,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_na.min_abs, batch_count=825586.6666666666, ans=0.02 2023-10-02 09:08:02,906 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.08 vs. limit=10.0 2023-10-02 09:08:05,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:08:05,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:08:05,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:08:05,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:08:05,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 09:08:05,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 09:08:11,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:08:15,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:08:15,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=825586.6666666666, ans=0.1 2023-10-02 09:08:22,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 09:08:24,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:25,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 09:08:28,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:08:32,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:08:32,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:08:33,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:08:33,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:08:34,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:37,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:08:37,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:37,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:08:38,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:08:40,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:08:40,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:08:42,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.51 vs. limit=15.0 2023-10-02 09:08:43,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:08:44,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 09:08:46,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:08:46,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 09:08:47,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 09:08:47,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 09:08:47,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:08:49,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:08:49,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:08:49,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:08:49,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 09:08:54,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:08:56,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:08:56,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:08:56,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=825786.6666666666, ans=0.0 2023-10-02 09:08:57,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 09:09:02,451 INFO [train.py:1046] (3/4) Epoch 24, batch 1700, loss[loss=0.1461, simple_loss=0.2238, pruned_loss=0.03416, over 24615.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2488, pruned_loss=0.04811, over 4692741.26 frames. ], batch size: 60, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:09:03,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:09:03,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:09:03,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 09:09:04,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=825853.3333333334, ans=0.125 2023-10-02 09:09:05,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:09:05,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:09:05,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:09:06,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:09:06,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:09:08,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 09:09:09,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:09:18,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:09:18,986 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=15.0 2023-10-02 09:09:19,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:09:25,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:09:25,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:09:25,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:09:26,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:09:28,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=825920.0, ans=0.125 2023-10-02 09:09:30,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 09:09:31,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:09:31,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:34,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:09:35,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:09:38,222 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.847e+02 2.054e+02 2.399e+02 3.587e+02, threshold=4.108e+02, percent-clipped=0.0 2023-10-02 09:09:38,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 09:09:38,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 09:09:39,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:41,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 09:09:41,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:09:44,495 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:09:48,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:09:50,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:09:50,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:09:51,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=826053.3333333334, ans=0.2 2023-10-02 09:09:52,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:09:52,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 09:09:52,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:09:56,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:56,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 09:09:58,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:09:58,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:09:58,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:09:58,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:10:00,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:10:00,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:10:01,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:01,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:10:03,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:05,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:10:07,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 09:10:08,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:09,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:10:11,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 09:10:11,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=826120.0, ans=0.0 2023-10-02 09:10:11,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=826120.0, ans=0.0 2023-10-02 09:10:15,716 INFO [train.py:1046] (3/4) Epoch 24, batch 1750, loss[loss=0.1634, simple_loss=0.2499, pruned_loss=0.03847, over 24650.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2469, pruned_loss=0.04762, over 4688623.24 frames. ], batch size: 73, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:10:17,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:19,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:10:19,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:10:21,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 09:10:21,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:10:22,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:10:24,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:24,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=826186.6666666666, ans=0.04949747468305833 2023-10-02 09:10:25,207 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.12 vs. limit=12.0 2023-10-02 09:10:27,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 09:10:30,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:10:33,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 09:10:33,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:10:34,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:10:36,673 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.13 vs. limit=15.0 2023-10-02 09:10:37,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 09:10:38,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 09:10:41,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:10:41,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 09:10:41,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=826253.3333333334, ans=0.125 2023-10-02 09:10:48,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=826320.0, ans=0.125 2023-10-02 09:10:49,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:10:49,595 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:10:49,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=826320.0, ans=0.125 2023-10-02 09:10:52,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:10:52,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:55,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:10:55,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:10:57,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:10:58,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:11:01,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:11:03,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:11:03,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=826386.6666666666, ans=0.125 2023-10-02 09:11:04,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 09:11:05,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:11:07,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 09:11:08,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:11:08,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=826386.6666666666, ans=0.125 2023-10-02 09:11:09,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:11:11,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:11:12,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=826453.3333333334, ans=0.0 2023-10-02 09:11:14,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:11:15,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 09:11:15,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:11:18,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:11:21,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:11:23,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:11:23,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=826453.3333333334, ans=0.125 2023-10-02 09:11:24,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:11:24,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=826453.3333333334, ans=0.0 2023-10-02 09:11:26,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 09:11:26,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:11:26,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:11:27,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:11:27,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:11:27,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:11:29,229 INFO [train.py:1046] (3/4) Epoch 24, batch 1800, loss[loss=0.1942, simple_loss=0.2747, pruned_loss=0.05681, over 23356.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2458, pruned_loss=0.0474, over 4680858.33 frames. ], batch size: 93, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:11:29,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:11:30,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:11:32,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:11:33,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:11:36,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:11:40,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:11:40,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:11:42,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:11:43,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:11:43,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:11:43,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=826586.6666666666, ans=0.125 2023-10-02 09:11:45,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:11:46,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:11:46,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 09:11:48,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:11:51,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=826586.6666666666, ans=0.1 2023-10-02 09:11:52,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:11:55,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 09:11:59,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 09:11:59,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 09:11:59,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:12:00,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:12:00,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:12:04,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:12:07,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=826653.3333333334, ans=0.125 2023-10-02 09:12:08,723 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.863e+02 2.128e+02 2.344e+02 3.480e+02, threshold=4.257e+02, percent-clipped=0.0 2023-10-02 09:12:11,509 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 09:12:11,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:12:12,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:12:14,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 09:12:14,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 09:12:14,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:12:16,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:12:17,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:12:19,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=826720.0, ans=0.95 2023-10-02 09:12:20,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 09:12:23,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.76 vs. limit=22.5 2023-10-02 09:12:27,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:12:29,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 09:12:29,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:12:29,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=826786.6666666666, ans=0.0 2023-10-02 09:12:31,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:12:31,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:12:32,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 09:12:32,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=826786.6666666666, ans=0.125 2023-10-02 09:12:34,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:12:34,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:12:36,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 09:12:36,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:12:39,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:12:39,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:12:39,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:12:42,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:12:42,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:12:44,920 INFO [train.py:1046] (3/4) Epoch 24, batch 1850, loss[loss=0.1689, simple_loss=0.2376, pruned_loss=0.05013, over 23807.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2462, pruned_loss=0.04706, over 4694733.12 frames. ], batch size: 164, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:12:45,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:12:45,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:12:47,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:12:47,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:12:52,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:12:54,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 09:12:56,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 09:12:57,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=826853.3333333334, ans=0.0 2023-10-02 09:12:59,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 09:13:02,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:13:04,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 09:13:04,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 09:13:09,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:13:10,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=826920.0, ans=0.2 2023-10-02 09:13:12,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 09:13:15,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:13:15,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:13:18,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 09:13:19,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:19,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:13:19,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=826986.6666666666, ans=0.2 2023-10-02 09:13:21,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:13:21,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=826986.6666666666, ans=0.1 2023-10-02 09:13:24,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:13:26,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:13:29,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:13:30,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:32,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 09:13:32,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:13:32,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:13:33,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:13:36,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 09:13:37,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:13:40,018 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.83 vs. limit=15.0 2023-10-02 09:13:40,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:13:42,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:13:42,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 09:13:42,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 09:13:44,788 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 09:13:44,864 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 09:13:47,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:13:47,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:13:47,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:13:47,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:48,876 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 09:13:48,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:13:50,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:52,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:13:52,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:13:53,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:13:53,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 09:13:55,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:13:55,093 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 09:13:56,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:13:56,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:13:59,747 INFO [train.py:1046] (3/4) Epoch 24, batch 1900, loss[loss=0.1692, simple_loss=0.2427, pruned_loss=0.04787, over 23567.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.247, pruned_loss=0.04724, over 4704153.67 frames. ], batch size: 134, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:14:03,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:14:04,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:14:05,929 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 09:14:07,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 09:14:08,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:14:09,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=827186.6666666666, ans=0.125 2023-10-02 09:14:09,403 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.20 vs. limit=15.0 2023-10-02 09:14:10,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:14:10,148 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 09:14:10,186 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 09:14:10,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=827186.6666666666, ans=0.1 2023-10-02 09:14:13,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=827253.3333333334, ans=0.09899494936611666 2023-10-02 09:14:14,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 09:14:15,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:14:20,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 09:14:22,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 09:14:24,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=827253.3333333334, ans=0.1 2023-10-02 09:14:30,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 09:14:33,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 09:14:35,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:14:35,193 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 09:14:35,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 09:14:35,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 09:14:35,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 09:14:35,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:14:36,440 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.881e+02 2.012e+02 2.248e+02 2.968e+02, threshold=4.023e+02, percent-clipped=0.0 2023-10-02 09:14:38,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=827320.0, ans=0.0 2023-10-02 09:14:39,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=827320.0, ans=0.125 2023-10-02 09:14:40,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 09:14:41,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:14:44,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:14:44,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 09:14:46,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=827386.6666666666, ans=10.0 2023-10-02 09:14:47,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:14:49,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 09:14:50,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:14:56,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:14:56,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:14:56,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:14:57,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:14:58,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=827453.3333333334, ans=0.2 2023-10-02 09:14:59,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:14:59,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:15:01,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:15:03,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:15:03,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:15:05,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:15:05,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:15:07,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:15:09,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:15:12,555 INFO [train.py:1046] (3/4) Epoch 24, batch 1950, loss[loss=0.1595, simple_loss=0.2327, pruned_loss=0.04312, over 23582.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2483, pruned_loss=0.04778, over 4711501.86 frames. ], batch size: 135, lr: 4.29e-03, grad_scale: 16.0 2023-10-02 09:15:12,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:15:14,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:15:14,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:14,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:15:18,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 09:15:18,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 09:15:18,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:20,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:22,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=827520.0, ans=0.125 2023-10-02 09:15:23,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:15:23,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:15:23,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:26,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:15:29,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:15:29,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:15:29,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:15:31,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:34,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:37,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:15:37,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:15:37,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:15:37,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 09:15:39,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:15:39,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:15:39,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:15:44,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:15:46,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:15:49,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:15:54,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:15:54,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:15:55,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 09:15:55,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:15:58,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:16:00,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:16:00,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:16:03,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=827720.0, ans=0.125 2023-10-02 09:16:05,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=827720.0, ans=0.125 2023-10-02 09:16:09,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:09,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:11,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:14,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:16:15,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:16:17,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:16:17,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 09:16:17,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:16:18,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:16:19,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 09:16:22,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:16:24,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=827786.6666666666, ans=0.125 2023-10-02 09:16:25,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:16:26,781 INFO [train.py:1046] (3/4) Epoch 24, batch 2000, loss[loss=0.1718, simple_loss=0.2682, pruned_loss=0.03768, over 24584.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2487, pruned_loss=0.04767, over 4722189.09 frames. ], batch size: 71, lr: 4.29e-03, grad_scale: 32.0 2023-10-02 09:16:26,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:16:26,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:16:30,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:16:30,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:16:33,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 09:16:34,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:16:38,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:16:40,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 09:16:42,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:16:44,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:16:45,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:16:47,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 09:16:47,657 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.89 vs. limit=22.5 2023-10-02 09:16:49,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:51,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:51,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:53,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 09:16:53,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:16:55,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 09:16:55,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:16:58,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:16:58,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 09:16:58,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:16:59,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:17:01,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:17:01,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 09:17:04,255 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.913e+02 2.238e+02 2.677e+02 4.135e+02, threshold=4.476e+02, percent-clipped=1.0 2023-10-02 09:17:05,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 09:17:05,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:17:05,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:12,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:13,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:17:13,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:17:13,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:17:15,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:17:17,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:17,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:17:17,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:18,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:18,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=828053.3333333334, ans=0.1 2023-10-02 09:17:20,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:17:21,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 09:17:22,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=828053.3333333334, ans=0.2 2023-10-02 09:17:24,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:17:24,947 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.88 vs. limit=22.5 2023-10-02 09:17:26,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:28,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:28,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:17:33,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:34,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=828120.0, ans=0.125 2023-10-02 09:17:35,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:17:35,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:37,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:17:37,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:17:38,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:17:39,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=828120.0, ans=0.0 2023-10-02 09:17:40,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:42,160 INFO [train.py:1046] (3/4) Epoch 24, batch 2050, loss[loss=0.1647, simple_loss=0.2408, pruned_loss=0.04433, over 23680.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2481, pruned_loss=0.0472, over 4720928.85 frames. ], batch size: 135, lr: 4.28e-03, grad_scale: 32.0 2023-10-02 09:17:45,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:17:46,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:52,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:17:53,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:17:53,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:17:54,811 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.98 vs. limit=22.5 2023-10-02 09:17:55,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:17:57,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 09:17:57,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:17:58,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=828253.3333333334, ans=0.125 2023-10-02 09:17:59,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:17:59,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:18:08,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:18:08,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:18:10,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 09:18:12,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:18:14,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 09:18:15,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:18:17,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=828320.0, ans=0.1 2023-10-02 09:18:18,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:18:20,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:18:21,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:18:21,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:18:24,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:18:25,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:18:25,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:18:28,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:18:29,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:18:31,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:18:31,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=828386.6666666666, ans=0.125 2023-10-02 09:18:32,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:18:33,313 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.69 vs. limit=6.0 2023-10-02 09:18:35,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:18:35,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=828386.6666666666, ans=0.125 2023-10-02 09:18:39,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:18:40,918 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.72 vs. limit=15.0 2023-10-02 09:18:41,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 09:18:46,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:18:47,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:18:50,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:18:51,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 09:18:54,436 INFO [train.py:1046] (3/4) Epoch 24, batch 2100, loss[loss=0.1723, simple_loss=0.2526, pruned_loss=0.04598, over 23329.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2475, pruned_loss=0.04685, over 4722955.50 frames. ], batch size: 105, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:18:55,877 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 09:18:55,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:18:57,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:18:57,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:18:57,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:18:58,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 09:18:58,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 09:19:01,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:19:02,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:19:04,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:19:04,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:06,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:19:06,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 09:19:08,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:19:08,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 09:19:08,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 09:19:10,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:19:10,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:19:10,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 09:19:12,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 09:19:17,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 09:19:17,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:19:17,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=828586.6666666666, ans=0.125 2023-10-02 09:19:21,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:19:21,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:19:25,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:19:25,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 09:19:25,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:19:25,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 09:19:27,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 09:19:28,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:28,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 09:19:28,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 09:19:29,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 09:19:32,605 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.934e+02 2.200e+02 2.587e+02 4.169e+02, threshold=4.400e+02, percent-clipped=0.0 2023-10-02 09:19:32,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:19:34,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:19:35,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:19:36,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:19:36,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=828720.0, ans=0.125 2023-10-02 09:19:38,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:19:40,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:19:40,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 09:19:40,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:41,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:19:42,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:19:42,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 09:19:42,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=828720.0, ans=0.125 2023-10-02 09:19:44,908 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.63 vs. limit=15.0 2023-10-02 09:19:45,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 09:19:45,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 09:19:49,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:19:52,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:19:52,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 09:19:52,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=828786.6666666666, ans=0.0 2023-10-02 09:19:56,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:19:59,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:19:59,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:19:59,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:20:00,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 09:20:00,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:20:02,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:20:02,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:20:03,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:20:03,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:06,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 09:20:08,323 INFO [train.py:1046] (3/4) Epoch 24, batch 2150, loss[loss=0.1761, simple_loss=0.2514, pruned_loss=0.05038, over 23394.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.246, pruned_loss=0.04609, over 4723415.86 frames. ], batch size: 119, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:20:08,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 09:20:08,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:20:11,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:20:11,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:20:11,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:20:11,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:20:17,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 09:20:18,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:20:20,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:23,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:20:23,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:23,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:20:24,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:24,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:20:26,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:20:28,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:30,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 09:20:34,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:20:36,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:20:37,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:37,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:20:37,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:37,653 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:20:39,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:20:40,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:20:40,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:20:40,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:20:41,070 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.98 vs. limit=15.0 2023-10-02 09:20:42,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 09:20:42,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:20:44,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:44,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:20:45,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:20:47,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:20:50,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:20:52,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:20:52,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:20:52,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 09:20:52,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:20:54,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:20:56,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:20:57,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:20:58,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:21:00,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:01,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:01,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 09:21:03,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 09:21:03,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:21:04,538 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 09:21:04,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:05,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:21:07,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 09:21:07,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:21:07,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 09:21:07,266 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 09:21:07,266 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 09:21:07,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 09:21:08,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:10,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:21:10,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:21:10,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=829120.0, ans=0.125 2023-10-02 09:21:11,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:13,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:21:14,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:14,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:21,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:21:23,010 INFO [train.py:1046] (3/4) Epoch 24, batch 2200, loss[loss=0.1625, simple_loss=0.2555, pruned_loss=0.03475, over 24332.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2465, pruned_loss=0.04601, over 4735453.86 frames. ], batch size: 74, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:21:23,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 09:21:27,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:21:28,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:29,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:21:29,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:21:31,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:21:32,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:21:34,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:21:34,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 09:21:38,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 09:21:38,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.17 vs. limit=6.0 2023-10-02 09:21:40,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:21:42,175 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=15.0 2023-10-02 09:21:44,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 09:21:45,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=829253.3333333334, ans=0.125 2023-10-02 09:21:45,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=829253.3333333334, ans=0.125 2023-10-02 09:21:47,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:48,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:21:48,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:21:52,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:21:52,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 09:21:56,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:21:57,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:21:59,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 09:22:01,835 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.810e+02 2.099e+02 2.472e+02 3.900e+02, threshold=4.197e+02, percent-clipped=0.0 2023-10-02 09:22:04,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:22:06,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:22:07,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:22:10,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 09:22:13,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:22:15,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 09:22:16,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:16,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:22:16,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:18,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:22:19,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:22:19,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:22:19,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:22:20,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:22:21,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:22:22,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:22:25,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 09:22:26,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:22:28,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:22:30,019 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 09:22:31,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:22:32,755 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 09:22:34,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:22:34,132 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 09:22:35,410 INFO [train.py:1046] (3/4) Epoch 24, batch 2250, loss[loss=0.1851, simple_loss=0.2535, pruned_loss=0.05834, over 23536.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2471, pruned_loss=0.04639, over 4723914.74 frames. ], batch size: 256, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:22:35,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:22:35,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:22:36,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:22:38,370 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 09:22:40,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:22:42,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:22:45,788 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.56 vs. limit=15.0 2023-10-02 09:22:47,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:22:49,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:22:49,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=829586.6666666666, ans=0.1 2023-10-02 09:22:51,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:22:51,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:22:52,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:22:55,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 09:22:55,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:22:55,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:22:56,522 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.95 vs. limit=15.0 2023-10-02 09:22:58,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 09:22:59,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:23:01,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:23:02,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:23:08,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:23:08,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:23:08,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:23:09,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 09:23:11,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:23:14,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:23:17,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:23:18,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:23:19,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:23:19,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:23:23,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:23:23,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:23:24,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=829720.0, ans=0.1 2023-10-02 09:23:24,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=829720.0, ans=0.0 2023-10-02 09:23:27,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:23:29,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:23:31,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:23:32,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:23:33,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:23:38,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:23:40,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:23:40,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 09:23:40,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:23:40,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:23:42,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 09:23:47,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:23:47,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:23:49,571 INFO [train.py:1046] (3/4) Epoch 24, batch 2300, loss[loss=0.1834, simple_loss=0.2497, pruned_loss=0.05854, over 23747.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.248, pruned_loss=0.04742, over 4712559.22 frames. ], batch size: 212, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:23:51,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:23:52,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:23:52,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=829853.3333333334, ans=0.125 2023-10-02 09:23:54,021 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 09:23:54,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:24:00,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=829853.3333333334, ans=0.125 2023-10-02 09:24:02,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:24:02,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:24:03,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:24:03,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:24:03,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 09:24:03,615 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.97 vs. limit=15.0 2023-10-02 09:24:04,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:24:07,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:24:07,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:24:11,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:24:13,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:24:16,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:24:21,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:24:22,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:24:25,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:24:28,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:24:29,936 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.846e+02 2.063e+02 2.350e+02 3.320e+02, threshold=4.127e+02, percent-clipped=0.0 2023-10-02 09:24:31,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:24:32,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:24:32,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:24:32,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 09:24:34,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=830053.3333333334, ans=0.125 2023-10-02 09:24:35,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:24:35,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:24:35,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:24:35,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:24:37,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:24:37,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 09:24:37,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:24:38,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 09:24:38,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:24:38,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:24:40,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 09:24:42,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.28 vs. limit=12.0 2023-10-02 09:24:46,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:24:50,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:24:53,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:24:53,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:24:55,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:24:56,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:24:56,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:24:56,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:24:58,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 09:25:03,537 INFO [train.py:1046] (3/4) Epoch 24, batch 2350, loss[loss=0.1531, simple_loss=0.234, pruned_loss=0.03606, over 24327.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2485, pruned_loss=0.04737, over 4718302.08 frames. ], batch size: 56, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:25:03,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:25:03,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 09:25:09,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 09:25:12,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:25:12,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=830186.6666666666, ans=0.1 2023-10-02 09:25:15,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:25:15,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:25:15,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:25:16,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:25:18,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 09:25:22,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:25:24,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=830253.3333333334, ans=0.125 2023-10-02 09:25:29,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 09:25:29,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:25:33,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:25:34,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:25:35,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:25:37,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 09:25:38,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:25:40,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:25:40,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:25:40,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:25:40,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=830320.0, ans=0.2 2023-10-02 09:25:43,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:25:47,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 09:25:47,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:25:50,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:25:50,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:25:52,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 09:25:53,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:25:55,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 09:25:55,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:26:01,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 09:26:06,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 09:26:08,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:26:08,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 09:26:08,329 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 09:26:08,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 09:26:10,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=830453.3333333334, ans=0.0 2023-10-02 09:26:11,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 09:26:11,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=830453.3333333334, ans=0.0 2023-10-02 09:26:14,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:26:16,629 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.13 vs. limit=15.0 2023-10-02 09:26:17,299 INFO [train.py:1046] (3/4) Epoch 24, batch 2400, loss[loss=0.1538, simple_loss=0.2196, pruned_loss=0.044, over 23672.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2473, pruned_loss=0.04678, over 4728145.74 frames. ], batch size: 212, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:26:17,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:26:20,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:26:22,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:26:23,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 09:26:23,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 09:26:28,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:26:28,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:26:30,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 09:26:30,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:26:31,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=830586.6666666666, ans=0.1 2023-10-02 09:26:32,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:26:32,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 09:26:38,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:26:41,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 09:26:45,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=830653.3333333334, ans=0.125 2023-10-02 09:26:46,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:26:50,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 09:26:51,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:26:53,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:26:56,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:26:57,839 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.801e+02 2.069e+02 2.462e+02 3.779e+02, threshold=4.137e+02, percent-clipped=0.0 2023-10-02 09:26:57,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 09:26:57,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:26:59,822 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.36 vs. limit=22.5 2023-10-02 09:27:05,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:27:06,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:27:08,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:08,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:27:10,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 09:27:10,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:27:10,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:27:11,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:27:11,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:27:15,302 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.09 vs. limit=6.0 2023-10-02 09:27:15,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:27:17,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:27:17,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 09:27:18,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 09:27:20,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:27:20,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:27:21,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 09:27:21,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 09:27:21,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 09:27:21,913 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 09:27:21,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 09:27:24,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:27:25,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:27:25,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:27:26,926 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 09:27:28,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:27:28,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 09:27:31,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:27:31,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:27:32,531 INFO [train.py:1046] (3/4) Epoch 24, batch 2450, loss[loss=0.1752, simple_loss=0.2648, pruned_loss=0.04278, over 24422.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2454, pruned_loss=0.04665, over 4688660.03 frames. ], batch size: 69, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:27:34,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.40 vs. limit=22.5 2023-10-02 09:27:35,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:35,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:27:37,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 09:27:40,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:27:40,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:43,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:27:45,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:27:45,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:27:45,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 09:27:49,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:27:50,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:27:50,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:27:55,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:27:55,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:27:56,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:27:56,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:27:59,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 09:27:59,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:28:08,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:28:10,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:28:11,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:28:12,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:28:12,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:28:14,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:28:15,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 09:28:19,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:28:19,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:28:21,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:28:21,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:28:23,374 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.12 vs. limit=15.0 2023-10-02 09:28:25,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:28:26,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 09:28:26,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:28:28,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:28:28,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 09:28:29,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:28:29,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:28:32,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:28:35,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:28:35,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:28:41,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 09:28:41,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:28:46,890 INFO [train.py:1046] (3/4) Epoch 24, batch 2500, loss[loss=0.1693, simple_loss=0.2557, pruned_loss=0.04151, over 24418.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2454, pruned_loss=0.04664, over 4697729.80 frames. ], batch size: 69, lr: 4.28e-03, grad_scale: 16.0 2023-10-02 09:28:48,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:28:56,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:28:57,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:28:59,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:28:59,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 09:29:04,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:29:04,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:29:06,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 09:29:06,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:29:06,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 09:29:07,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:08,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:29:10,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 09:29:10,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:12,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 09:29:12,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:29:16,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:29:18,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:29:20,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:29:21,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 09:29:22,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:29:25,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:27,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=831320.0, ans=0.125 2023-10-02 09:29:27,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=831320.0, ans=0.1 2023-10-02 09:29:28,278 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.796e+02 1.940e+02 2.141e+02 3.270e+02, threshold=3.880e+02, percent-clipped=0.0 2023-10-02 09:29:30,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:29:33,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:29:36,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:29:40,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:29:41,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 09:29:42,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:29:42,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:29:44,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:29:44,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:29:47,487 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 09:29:47,488 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 09:29:47,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 09:29:50,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:29:53,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 09:29:53,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 09:29:53,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:29:53,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 09:29:57,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 09:30:00,296 INFO [train.py:1046] (3/4) Epoch 24, batch 2550, loss[loss=0.1738, simple_loss=0.2453, pruned_loss=0.05117, over 22868.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2457, pruned_loss=0.04618, over 4718971.43 frames. ], batch size: 322, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:30:00,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:30:00,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=831520.0, ans=0.125 2023-10-02 09:30:01,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:30:02,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:30:05,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:30:05,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=831520.0, ans=0.1 2023-10-02 09:30:06,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 09:30:06,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:30:10,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 09:30:10,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:30:13,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:16,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:30:16,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 09:30:18,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:30:18,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:30:20,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:30:21,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:30:21,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 09:30:22,677 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.52 vs. limit=15.0 2023-10-02 09:30:23,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 09:30:23,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:23,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 09:30:33,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=831653.3333333334, ans=0.125 2023-10-02 09:30:34,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:30:37,951 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.93 vs. limit=10.0 2023-10-02 09:30:40,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:30:40,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:40,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:30:41,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:30:46,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:30:49,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:30:49,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:30:49,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:30:51,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:30:51,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:30:55,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:30:55,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:30:57,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=831720.0, ans=0.0 2023-10-02 09:30:58,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:30:58,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 09:30:58,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:30:58,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:31:00,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:31:02,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:31:04,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:06,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=831786.6666666666, ans=0.125 2023-10-02 09:31:10,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:31:10,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=831786.6666666666, ans=0.125 2023-10-02 09:31:11,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:13,105 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 09:31:14,776 INFO [train.py:1046] (3/4) Epoch 24, batch 2600, loss[loss=0.1532, simple_loss=0.2282, pruned_loss=0.03905, over 24602.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.246, pruned_loss=0.04592, over 4727049.33 frames. ], batch size: 60, lr: 4.28e-03, grad_scale: 8.0 2023-10-02 09:31:16,218 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 09:31:16,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:31:16,284 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 09:31:18,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 09:31:19,335 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 09:31:20,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:31:20,902 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 09:31:22,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 09:31:24,069 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 09:31:24,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=831853.3333333334, ans=0.04949747468305833 2023-10-02 09:31:24,887 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.66 vs. limit=22.5 2023-10-02 09:31:26,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:31:28,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 09:31:29,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 09:31:30,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:31:32,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 09:31:33,792 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 09:31:33,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 09:31:33,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=831920.0, ans=0.125 2023-10-02 09:31:42,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:31:42,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:42,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:31:42,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 09:31:44,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=831986.6666666666, ans=0.0 2023-10-02 09:31:45,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:31:46,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=831986.6666666666, ans=0.1 2023-10-02 09:31:51,629 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 09:31:55,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:31:55,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:31:56,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 09:31:58,175 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.850e+02 2.063e+02 2.376e+02 4.529e+02, threshold=4.127e+02, percent-clipped=3.0 2023-10-02 09:31:58,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:31:58,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:31:59,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 09:32:01,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:32:02,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:32:03,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:32:06,613 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 09:32:06,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:32:06,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:32:12,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:32:13,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:32:13,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 09:32:15,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:32:17,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:32:18,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:32:19,252 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-10-02 09:32:21,972 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.34 vs. limit=22.5 2023-10-02 09:32:24,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 09:32:24,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:32:27,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:32:29,176 INFO [train.py:1046] (3/4) Epoch 24, batch 2650, loss[loss=0.1627, simple_loss=0.237, pruned_loss=0.0442, over 20727.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2469, pruned_loss=0.04622, over 4726368.52 frames. ], batch size: 45, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:32:30,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 09:32:30,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:32:33,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:32:34,652 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 09:32:34,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:32:34,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=832186.6666666666, ans=0.125 2023-10-02 09:32:38,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:32:40,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:32:41,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:32:42,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:32:45,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 09:32:45,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:32:45,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:32:47,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 09:32:49,773 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 09:32:51,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:32:54,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 09:32:55,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.07 vs. limit=15.0 2023-10-02 09:32:55,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:32:55,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 09:32:58,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=832320.0, ans=0.125 2023-10-02 09:33:01,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:01,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:33:01,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:01,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:03,785 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.44 vs. limit=22.5 2023-10-02 09:33:07,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 09:33:07,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 09:33:08,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:33:09,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.73 vs. limit=12.0 2023-10-02 09:33:11,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 09:33:11,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:14,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:14,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:33:14,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:33:15,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:33:17,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:33:19,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:33:20,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:33:20,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:33:20,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=832386.6666666666, ans=0.0 2023-10-02 09:33:22,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:33:23,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:25,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:33:25,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:26,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:33:28,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 09:33:30,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:30,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:33:30,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:32,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 09:33:35,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:33:36,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:36,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:38,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:38,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:33:39,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:42,196 INFO [train.py:1046] (3/4) Epoch 24, batch 2700, loss[loss=0.192, simple_loss=0.2541, pruned_loss=0.06495, over 23674.00 frames. ], tot_loss[loss=0.1708, simple_loss=0.2484, pruned_loss=0.04656, over 4730886.58 frames. ], batch size: 232, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:33:42,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:33:42,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 09:33:45,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:33:48,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 09:33:50,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:33:50,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:50,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:33:51,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:33:51,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:33:51,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:33:52,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 09:33:52,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 09:33:52,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:33:54,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:33:56,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:33:57,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:33:58,239 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.72 vs. limit=15.0 2023-10-02 09:34:00,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:34:02,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 09:34:02,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:34:06,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:34:06,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:34:07,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=832586.6666666666, ans=0.1 2023-10-02 09:34:11,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:34:11,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:34:11,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:34:13,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:34:13,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=832653.3333333334, ans=0.125 2023-10-02 09:34:14,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=832653.3333333334, ans=0.125 2023-10-02 09:34:15,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=832653.3333333334, ans=0.125 2023-10-02 09:34:17,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:34:19,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:34:19,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:34:19,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:34:19,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=832653.3333333334, ans=15.0 2023-10-02 09:34:21,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.69 vs. limit=15.0 2023-10-02 09:34:23,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:34:23,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:34:24,654 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.849e+02 2.017e+02 2.176e+02 3.250e+02, threshold=4.034e+02, percent-clipped=0.0 2023-10-02 09:34:31,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:34:31,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:34:35,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:34:35,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:34:36,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:34:38,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:34:39,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:34:41,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:34:41,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:34:41,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:34:43,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:34:43,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:34:45,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:34:49,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 09:34:49,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:34:51,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:34:51,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 09:34:52,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=832786.6666666666, ans=0.125 2023-10-02 09:34:54,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 09:34:54,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:34:56,395 INFO [train.py:1046] (3/4) Epoch 24, batch 2750, loss[loss=0.1549, simple_loss=0.2352, pruned_loss=0.03733, over 24313.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2479, pruned_loss=0.0467, over 4723156.45 frames. ], batch size: 61, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:34:56,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:34:56,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:34:56,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=832853.3333333334, ans=0.125 2023-10-02 09:34:59,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:34:59,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:35:01,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:02,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:35:02,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=832853.3333333334, ans=0.125 2023-10-02 09:35:03,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:35:03,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:35:03,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:03,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 09:35:03,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:35:03,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:35:10,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 09:35:12,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:35:13,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:13,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:35:13,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 09:35:15,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:35:15,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:35:15,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=832920.0, ans=0.125 2023-10-02 09:35:16,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:35:16,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:35:20,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:35:20,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:35:20,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:35:22,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:24,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:35:28,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:35:31,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:35:31,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:35:35,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:35:35,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:35:35,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:35:40,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=833053.3333333334, ans=0.125 2023-10-02 09:35:42,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:35:44,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:35:44,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 09:35:46,898 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.91 vs. limit=6.0 2023-10-02 09:35:48,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:35:50,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 09:35:54,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 09:35:58,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:35:58,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 09:35:59,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:36:02,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:36:02,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 09:36:03,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:36:07,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 09:36:07,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:36:09,153 INFO [train.py:1046] (3/4) Epoch 24, batch 2800, loss[loss=0.1702, simple_loss=0.2392, pruned_loss=0.05058, over 19570.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2474, pruned_loss=0.04687, over 4725299.50 frames. ], batch size: 42, lr: 4.27e-03, grad_scale: 16.0 2023-10-02 09:36:09,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:36:09,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 09:36:09,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:36:09,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:36:12,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:36:12,073 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 09:36:12,073 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 09:36:16,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:36:17,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:36:17,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:36:22,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:36:23,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 09:36:25,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 09:36:26,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 09:36:28,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:36:28,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:36:30,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:36:34,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:36:34,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:36:34,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:36:34,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:36:40,870 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.26 vs. limit=15.0 2023-10-02 09:36:43,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:36:44,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:36:47,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:36:47,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:36:49,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:36:52,041 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.978e+02 2.120e+02 2.446e+02 3.436e+02, threshold=4.241e+02, percent-clipped=0.0 2023-10-02 09:36:52,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:36:52,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 09:36:53,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:36:54,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:36:54,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:36:59,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:36:59,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:37:04,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:37:04,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=833386.6666666666, ans=0.0 2023-10-02 09:37:06,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:37:06,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:37:06,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:37:06,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:37:08,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:37:08,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:37:08,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 09:37:08,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:09,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:37:09,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:11,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 09:37:12,223 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.99 vs. limit=15.0 2023-10-02 09:37:12,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:37:12,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:37:12,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:37:14,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=833453.3333333334, ans=0.5 2023-10-02 09:37:16,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 09:37:21,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:37:21,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:37:22,931 INFO [train.py:1046] (3/4) Epoch 24, batch 2850, loss[loss=0.1688, simple_loss=0.2393, pruned_loss=0.04914, over 23449.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2466, pruned_loss=0.04654, over 4729335.84 frames. ], batch size: 134, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:37:22,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:37:23,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:37:27,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:37:27,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:37:27,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:37:27,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=833520.0, ans=0.125 2023-10-02 09:37:29,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:37:31,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:37:32,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:37:34,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 09:37:36,414 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.97 vs. limit=15.0 2023-10-02 09:37:39,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 09:37:39,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:37:41,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 09:37:41,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:43,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 09:37:45,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 09:37:46,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:37:56,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:37:58,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:37:58,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:38:01,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 09:38:01,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:38:01,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:38:04,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:38:04,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 09:38:06,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:38:07,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:38:08,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:38:08,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:09,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=833720.0, ans=0.1 2023-10-02 09:38:11,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:38:11,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:38:12,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:38:14,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:38:15,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:38:15,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:17,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:38:19,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=833720.0, ans=0.0 2023-10-02 09:38:20,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:38:23,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:38:25,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 09:38:25,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 09:38:28,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:38:28,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:38:28,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 09:38:30,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:38:30,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:38:32,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:38:32,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:38:32,477 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 09:38:32,521 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 09:38:32,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:38:32,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=833786.6666666666, ans=0.2 2023-10-02 09:38:33,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:38:37,056 INFO [train.py:1046] (3/4) Epoch 24, batch 2900, loss[loss=0.1624, simple_loss=0.2434, pruned_loss=0.04075, over 24309.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.247, pruned_loss=0.04669, over 4718666.30 frames. ], batch size: 61, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:38:37,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:38:38,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:38:38,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:38:39,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 09:38:44,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:44,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 09:38:45,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 09:38:46,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:38:46,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:38:50,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:38:50,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:38:54,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:38:54,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:38:56,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=833920.0, ans=0.125 2023-10-02 09:38:57,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:38:58,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 09:38:58,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:39:01,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:39:01,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=833920.0, ans=0.1 2023-10-02 09:39:03,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 09:39:04,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 09:39:07,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:39:07,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 09:39:07,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:39:09,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:39:09,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 09:39:12,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:39:13,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:39:16,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:39:19,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:39:20,635 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.897e+02 2.159e+02 2.476e+02 3.741e+02, threshold=4.318e+02, percent-clipped=0.0 2023-10-02 09:39:22,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 09:39:22,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 09:39:22,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:39:26,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:39:28,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 09:39:29,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:39:30,233 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.16 vs. limit=15.0 2023-10-02 09:39:34,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:39:37,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=834120.0, ans=0.125 2023-10-02 09:39:43,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:39:44,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:39:46,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 09:39:47,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:39:47,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 09:39:47,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:39:49,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:39:50,575 INFO [train.py:1046] (3/4) Epoch 24, batch 2950, loss[loss=0.1687, simple_loss=0.2604, pruned_loss=0.03845, over 24562.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2479, pruned_loss=0.04703, over 4726505.82 frames. ], batch size: 71, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:39:53,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:39:55,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=834186.6666666666, ans=0.09899494936611666 2023-10-02 09:39:56,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 09:39:56,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:39:56,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:39:59,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:40:00,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:40:00,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 09:40:02,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 09:40:02,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:40:02,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:40:10,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:40:11,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=834253.3333333334, ans=0.125 2023-10-02 09:40:12,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:40:12,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:40:14,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:40:16,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:40:16,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:40:17,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:40:18,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:40:18,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:40:20,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=834320.0, ans=0.5 2023-10-02 09:40:22,841 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.96 vs. limit=15.0 2023-10-02 09:40:23,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 09:40:26,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=834320.0, ans=0.125 2023-10-02 09:40:27,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 09:40:27,825 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 09:40:29,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:40:30,603 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 09:40:30,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 09:40:30,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:40:32,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:40:32,549 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 09:40:32,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 09:40:36,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 09:40:36,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:40:38,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:40:39,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:40:41,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:40:42,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:40:42,588 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 09:40:42,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:40:42,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 09:40:49,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:40:51,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:40:52,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 09:40:52,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:40:52,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 09:40:55,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:40:57,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:40:58,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:40:58,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:40:58,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:41:00,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:41:01,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:41:01,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:41:01,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:41:02,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:41:04,750 INFO [train.py:1046] (3/4) Epoch 24, batch 3000, loss[loss=0.2397, simple_loss=0.2962, pruned_loss=0.0916, over 18874.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.249, pruned_loss=0.04729, over 4724282.97 frames. ], batch size: 388, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:41:04,751 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 09:41:17,335 INFO [train.py:1078] (3/4) Epoch 24, validation: loss=0.349, simple_loss=0.2892, pruned_loss=0.2044, over 1125622.00 frames. 2023-10-02 09:41:17,336 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 09:41:17,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:41:19,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:41:19,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 09:41:20,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:41:22,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:41:24,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:41:25,775 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 09:41:25,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 09:41:26,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=834520.0, ans=0.1 2023-10-02 09:41:29,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:41:29,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:41:29,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 09:41:31,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:41:35,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=834586.6666666666, ans=0.125 2023-10-02 09:41:37,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:41:44,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:41:53,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 09:41:53,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:41:55,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:41:55,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:41:55,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:41:57,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:41:57,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 09:42:00,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 09:42:01,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:42:02,707 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.63 vs. limit=12.0 2023-10-02 09:42:03,110 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.840e+02 2.041e+02 2.384e+02 3.232e+02, threshold=4.082e+02, percent-clipped=0.0 2023-10-02 09:42:03,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:42:05,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:42:07,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:42:07,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:07,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:42:10,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:42:11,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:42:11,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:42:13,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:42:16,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 09:42:16,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:42:16,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:42:17,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:42:21,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:23,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:23,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 09:42:23,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 09:42:24,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:42:25,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 09:42:25,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:42:27,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 09:42:28,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=834786.6666666666, ans=0.0 2023-10-02 09:42:29,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:42:31,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 09:42:31,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 09:42:31,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 09:42:31,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:42:32,478 INFO [train.py:1046] (3/4) Epoch 24, batch 3050, loss[loss=0.1623, simple_loss=0.2514, pruned_loss=0.03663, over 24564.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2492, pruned_loss=0.0475, over 4724631.65 frames. ], batch size: 71, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:42:32,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:42:34,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:42:34,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:42:34,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:42:35,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:42:37,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=834853.3333333334, ans=0.1 2023-10-02 09:42:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 09:42:41,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:42:42,297 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.52 vs. limit=15.0 2023-10-02 09:42:43,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:42:43,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:42:44,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=834853.3333333334, ans=0.025 2023-10-02 09:42:47,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:42:50,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 09:42:56,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 09:42:56,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 09:42:56,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:42:58,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=834920.0, ans=0.125 2023-10-02 09:42:59,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:43:01,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=834986.6666666666, ans=0.125 2023-10-02 09:43:03,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:43:03,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:43:03,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:43:06,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:43:08,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:43:08,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:43:08,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:43:08,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:43:09,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:43:11,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:43:14,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:43:15,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 09:43:15,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:43:15,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:43:18,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:43:19,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 09:43:20,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:43:21,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:24,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=835053.3333333334, ans=0.125 2023-10-02 09:43:26,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:43:26,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:34,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:43:34,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:43:34,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:43:35,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:43:37,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:43:37,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:43:38,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 09:43:39,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:43:39,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:43:41,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 09:43:42,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:44,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=835120.0, ans=0.0 2023-10-02 09:43:47,137 INFO [train.py:1046] (3/4) Epoch 24, batch 3100, loss[loss=0.1575, simple_loss=0.2172, pruned_loss=0.04891, over 22554.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2488, pruned_loss=0.04702, over 4719208.86 frames. ], batch size: 322, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:43:47,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:43:48,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:43:51,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 09:43:52,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 09:43:56,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 09:43:56,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 09:43:58,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:44:00,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:44:02,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:04,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 09:44:08,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:12,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 09:44:16,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 09:44:16,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:16,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=835320.0, ans=0.2 2023-10-02 09:44:18,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:44:18,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:44:19,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 09:44:21,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:44:21,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 09:44:21,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:44:22,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:24,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 09:44:27,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:44:30,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:44:30,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 09:44:31,466 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.869e+02 2.084e+02 2.364e+02 3.517e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-02 09:44:31,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 09:44:32,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:33,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:44:36,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:44:36,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:36,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:44:36,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=835386.6666666666, ans=0.0 2023-10-02 09:44:37,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:44:37,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:44:37,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=835386.6666666666, ans=0.125 2023-10-02 09:44:39,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:44:39,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:44:39,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:39,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 09:44:44,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:44:45,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 09:44:48,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:44:48,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 09:44:50,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:44:50,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:44:50,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 09:45:00,579 INFO [train.py:1046] (3/4) Epoch 24, batch 3150, loss[loss=0.1828, simple_loss=0.25, pruned_loss=0.05781, over 23757.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2466, pruned_loss=0.04664, over 4712674.58 frames. ], batch size: 179, lr: 4.27e-03, grad_scale: 8.0 2023-10-02 09:45:00,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 09:45:04,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:05,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:45:05,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=835520.0, ans=0.0 2023-10-02 09:45:06,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:45:06,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:45:06,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 09:45:08,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:08,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 09:45:08,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 09:45:10,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:45:11,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=835520.0, ans=0.1 2023-10-02 09:45:13,664 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 09:45:15,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 09:45:15,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:45:16,494 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 09:45:17,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.95 vs. limit=15.0 2023-10-02 09:45:17,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 09:45:18,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=835586.6666666666, ans=0.125 2023-10-02 09:45:19,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 09:45:19,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 09:45:19,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 09:45:19,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:45:19,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:45:21,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:45:24,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 09:45:25,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:25,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:45:27,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:45:28,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 09:45:28,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=835653.3333333334, ans=0.1 2023-10-02 09:45:31,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 09:45:33,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:45:37,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:45:37,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:45:37,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 09:45:40,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 09:45:41,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:45:41,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 09:45:42,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 09:45:42,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:45:44,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:45:45,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:45:45,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 09:45:45,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 09:45:45,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 09:45:46,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:45:48,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:45:48,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:45:48,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 09:45:48,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:45:49,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 09:45:49,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:45:51,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 09:45:53,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 09:45:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:45:54,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:45:56,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 09:45:57,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 09:45:58,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:46:00,639 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.88 vs. limit=22.5 2023-10-02 09:46:01,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:46:03,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:03,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:46:07,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:46:07,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:10,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 09:46:14,236 INFO [train.py:1046] (3/4) Epoch 24, batch 3200, loss[loss=0.167, simple_loss=0.2406, pruned_loss=0.04671, over 23365.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2454, pruned_loss=0.04662, over 4691276.47 frames. ], batch size: 134, lr: 4.27e-03, grad_scale: 16.0 2023-10-02 09:46:14,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:46:14,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 09:46:18,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:18,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:46:18,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 09:46:23,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:46:25,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=835853.3333333334, ans=0.2 2023-10-02 09:46:26,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:46:29,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:46:38,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:46:40,633 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.82 vs. limit=15.0 2023-10-02 09:46:46,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 09:46:49,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:46:52,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 09:46:54,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:46:57,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:46:57,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:46:57,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:46:58,938 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.959e+02 2.375e+02 3.040e+02 4.854e+02, threshold=4.749e+02, percent-clipped=3.0 2023-10-02 09:47:01,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 09:47:03,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 09:47:05,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 09:47:08,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 09:47:09,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:47:13,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=836120.0, ans=0.125 2023-10-02 09:47:15,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=836120.0, ans=0.1 2023-10-02 09:47:17,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:47:17,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 09:47:17,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:47:17,717 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 09:47:17,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 09:47:21,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:47:23,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 09:47:23,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 09:47:24,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 09:47:25,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=836120.0, ans=0.125 2023-10-02 09:47:26,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 09:47:28,245 INFO [train.py:1046] (3/4) Epoch 24, batch 3250, loss[loss=0.1519, simple_loss=0.2261, pruned_loss=0.0389, over 24290.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2458, pruned_loss=0.0465, over 4710026.40 frames. ], batch size: 56, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:47:29,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:47:30,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:47:30,919 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 09:47:30,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:47:30,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:32,816 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 09:47:37,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:47:40,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:47:47,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:47:47,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 09:47:48,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:47:48,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:47:48,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:47:51,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:47:51,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:47:52,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:52,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:47:52,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=836253.3333333334, ans=0.2 2023-10-02 09:47:54,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:47:54,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:54,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:47:54,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:47:57,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:47:59,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:48:01,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:48:01,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:48:03,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:48:03,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:48:03,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:48:08,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=836320.0, ans=0.1 2023-10-02 09:48:09,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 09:48:09,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:48:09,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:48:10,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:48:10,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:48:15,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:48:20,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:48:20,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:48:20,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 09:48:20,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:48:20,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:48:20,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:48:25,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 09:48:25,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 09:48:26,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:48:26,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:48:28,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:48:29,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 09:48:29,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:48:31,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=836453.3333333334, ans=0.125 2023-10-02 09:48:33,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:48:33,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:48:34,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 09:48:34,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:48:38,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=836453.3333333334, ans=0.1 2023-10-02 09:48:39,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:48:39,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 09:48:41,739 INFO [train.py:1046] (3/4) Epoch 24, batch 3300, loss[loss=0.1719, simple_loss=0.2619, pruned_loss=0.04097, over 24628.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2466, pruned_loss=0.04688, over 4715543.02 frames. ], batch size: 73, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:48:41,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:48:41,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 09:48:43,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 09:48:44,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 09:48:44,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:48:48,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:48:48,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:48:48,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:48:50,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 09:48:52,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 09:48:54,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:48:57,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:49:01,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 09:49:02,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:49:02,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:49:04,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:04,422 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 09:49:05,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:49:05,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:49:07,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 09:49:07,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:49:07,141 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 09:49:11,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:49:11,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 09:49:13,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:13,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 09:49:13,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 09:49:13,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:14,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:49:17,540 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 09:49:18,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 09:49:18,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:49:22,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 09:49:26,018 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.884e+02 2.119e+02 2.530e+02 4.061e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-02 09:49:26,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:49:27,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 09:49:27,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:49:30,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:49:32,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:49:32,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:49:32,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:49:34,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:49:34,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:35,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:49:35,766 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 09:49:35,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 09:49:36,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=836720.0, ans=0.125 2023-10-02 09:49:38,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 09:49:38,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:49:38,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:49:40,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:49:40,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:49:42,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:49:42,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:49:42,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:49:42,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:49:44,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:49:47,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 09:49:47,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:49:48,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:49:50,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 09:49:50,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:49:51,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:49:54,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:49:54,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:49:55,965 INFO [train.py:1046] (3/4) Epoch 24, batch 3350, loss[loss=0.1791, simple_loss=0.2579, pruned_loss=0.05018, over 23254.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2471, pruned_loss=0.04703, over 4718351.39 frames. ], batch size: 105, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:49:58,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:50:01,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:02,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:50:06,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:08,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:50:09,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:50:09,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:50:10,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 09:50:12,265 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 09:50:12,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=836920.0, ans=0.125 2023-10-02 09:50:14,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:50:16,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 09:50:16,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 09:50:18,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 09:50:18,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:50:18,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:19,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 09:50:19,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:19,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:50:21,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:22,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:22,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:23,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:50:27,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:50:29,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:30,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:50:34,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:50:34,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=836986.6666666666, ans=0.125 2023-10-02 09:50:36,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:50:37,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:37,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=836986.6666666666, ans=0.05 2023-10-02 09:50:39,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:42,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:43,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 09:50:43,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 09:50:43,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 09:50:43,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:50:45,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 09:50:46,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:50:49,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:50:53,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:50:54,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 09:50:54,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:50:55,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:50:57,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:51:04,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:51:06,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 09:51:06,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:51:08,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:51:09,645 INFO [train.py:1046] (3/4) Epoch 24, batch 3400, loss[loss=0.1691, simple_loss=0.2597, pruned_loss=0.0393, over 24615.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2487, pruned_loss=0.04725, over 4718474.52 frames. ], batch size: 68, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:51:09,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:51:11,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 09:51:11,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:51:12,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 09:51:12,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:51:13,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:51:13,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 09:51:16,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:51:16,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 09:51:19,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 09:51:19,797 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 09:51:19,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:51:22,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:51:22,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 09:51:23,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:51:25,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 09:51:25,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=837253.3333333334, ans=0.2 2023-10-02 09:51:29,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:51:31,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 09:51:31,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=837253.3333333334, ans=0.0 2023-10-02 09:51:35,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:51:36,266 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.63 vs. limit=15.0 2023-10-02 09:51:37,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:51:37,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:51:40,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 09:51:45,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:51:47,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 09:51:53,138 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.845e+02 2.030e+02 2.228e+02 3.234e+02, threshold=4.061e+02, percent-clipped=0.0 2023-10-02 09:51:53,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:51:53,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:51:54,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 09:51:54,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:51:56,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:51:56,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:51:56,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:51:59,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:52:03,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 09:52:03,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:52:07,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:52:11,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 09:52:15,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:52:21,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 09:52:22,561 INFO [train.py:1046] (3/4) Epoch 24, batch 3450, loss[loss=0.1621, simple_loss=0.2447, pruned_loss=0.03974, over 24313.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2479, pruned_loss=0.04691, over 4725878.61 frames. ], batch size: 61, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:52:25,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 09:52:25,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:52:28,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:52:28,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 09:52:29,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:52:32,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:52:33,464 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.51 vs. limit=15.0 2023-10-02 09:52:36,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:52:36,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:52:38,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:52:38,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:52:41,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:52:42,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=837586.6666666666, ans=0.125 2023-10-02 09:52:46,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 09:52:52,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 09:52:52,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 09:52:52,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:52:55,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:52:59,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 09:53:01,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:53:05,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:53:07,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:53:08,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 09:53:08,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:53:11,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 09:53:11,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:53:11,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:53:14,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:53:17,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 09:53:22,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:53:26,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:53:27,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:30,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:53:30,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=837786.6666666666, ans=0.125 2023-10-02 09:53:33,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:33,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:53:35,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:53:35,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:53:36,706 INFO [train.py:1046] (3/4) Epoch 24, batch 3500, loss[loss=0.1621, simple_loss=0.2198, pruned_loss=0.05223, over 22633.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2472, pruned_loss=0.04641, over 4711997.23 frames. ], batch size: 322, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:53:39,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:53:42,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:53:42,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 09:53:44,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 09:53:46,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 09:53:49,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:53:49,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 09:53:55,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:53:55,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:53:56,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:53:56,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:53:58,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 09:53:58,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:53:58,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:53:59,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 09:54:03,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:03,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:54:03,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:54:06,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=837986.6666666666, ans=0.125 2023-10-02 09:54:07,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=837986.6666666666, ans=0.125 2023-10-02 09:54:08,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:08,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=837986.6666666666, ans=0.0 2023-10-02 09:54:10,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 09:54:10,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:54:13,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:54:14,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 09:54:16,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:17,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:54:18,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:54:20,198 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.879e+02 2.062e+02 2.374e+02 3.315e+02, threshold=4.124e+02, percent-clipped=0.0 2023-10-02 09:54:20,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 09:54:21,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 09:54:21,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 09:54:21,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:54:23,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:23,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:54:23,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 09:54:27,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 09:54:29,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:54:33,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:54:33,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 09:54:35,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 09:54:35,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:54:37,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:54:38,993 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.08 vs. limit=15.0 2023-10-02 09:54:39,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:54:40,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:42,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 09:54:44,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:54:45,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:54:46,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 09:54:48,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 09:54:49,503 INFO [train.py:1046] (3/4) Epoch 24, batch 3550, loss[loss=0.1793, simple_loss=0.2422, pruned_loss=0.05817, over 23441.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2457, pruned_loss=0.04592, over 4717241.79 frames. ], batch size: 285, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:54:50,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:54:50,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:54:50,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:54:52,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:54:53,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 09:55:04,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:55:06,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 09:55:09,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:55:10,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 09:55:12,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:12,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:55:13,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 09:55:14,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:55:15,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:55:16,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:55:16,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 09:55:16,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 09:55:21,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 09:55:22,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 09:55:22,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:55:22,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:55:23,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:55:24,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 09:55:24,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:26,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:26,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 09:55:26,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=838320.0, ans=0.125 2023-10-02 09:55:28,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=838320.0, ans=0.125 2023-10-02 09:55:28,655 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.23 vs. limit=22.5 2023-10-02 09:55:30,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:55:32,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:55:32,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:55:35,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 09:55:37,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:55:37,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 09:55:39,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 09:55:41,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 09:55:42,431 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.55 vs. limit=15.0 2023-10-02 09:55:43,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:55:44,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 09:55:46,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:55:51,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:55:53,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 09:55:54,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:55:57,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:55:58,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=838453.3333333334, ans=0.125 2023-10-02 09:55:59,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 09:56:02,503 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:56:04,247 INFO [train.py:1046] (3/4) Epoch 24, batch 3600, loss[loss=0.195, simple_loss=0.2445, pruned_loss=0.07281, over 19013.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2452, pruned_loss=0.04603, over 4700444.10 frames. ], batch size: 388, lr: 4.26e-03, grad_scale: 32.0 2023-10-02 09:56:04,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 09:56:04,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:56:05,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 09:56:07,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:56:07,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:56:09,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:56:13,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:56:16,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:16,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 09:56:17,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 09:56:17,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:17,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 09:56:20,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:56:22,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:24,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:56:26,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:56:27,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 09:56:29,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:56:29,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 09:56:29,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 09:56:32,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:56:34,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 09:56:36,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:56:39,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:56:39,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:56:39,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=838653.3333333334, ans=0.0 2023-10-02 09:56:40,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 09:56:46,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:56:47,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:56:48,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=838720.0, ans=0.125 2023-10-02 09:56:49,066 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.434e+02 1.867e+02 2.105e+02 2.448e+02 3.419e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-02 09:56:49,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 09:56:53,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:56:56,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=838720.0, ans=0.0 2023-10-02 09:56:57,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:57:00,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:57:05,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 09:57:05,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 09:57:05,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 09:57:07,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 09:57:08,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 09:57:09,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:57:10,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:57:11,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 09:57:13,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:57:13,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:57:13,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:57:14,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 09:57:15,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 09:57:17,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:57:18,585 INFO [train.py:1046] (3/4) Epoch 24, batch 3650, loss[loss=0.1736, simple_loss=0.2359, pruned_loss=0.05566, over 23823.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2457, pruned_loss=0.04609, over 4705949.10 frames. ], batch size: 195, lr: 4.26e-03, grad_scale: 32.0 2023-10-02 09:57:18,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 09:57:22,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 09:57:24,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 09:57:27,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 09:57:30,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 09:57:33,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:57:33,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 09:57:34,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 09:57:36,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 09:57:36,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:57:36,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 09:57:37,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 09:57:38,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:57:39,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 09:57:40,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:57:40,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:57:40,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:57:42,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:57:45,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 09:57:45,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 09:57:45,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:57:48,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 09:57:50,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:57:50,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:57:54,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 09:57:55,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:57:56,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 09:57:56,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 09:57:58,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 09:58:00,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:58:04,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:58:04,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:04,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:58:06,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=839053.3333333334, ans=0.125 2023-10-02 09:58:08,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 09:58:09,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:58:09,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:58:16,837 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 09:58:17,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=839120.0, ans=0.125 2023-10-02 09:58:18,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=839120.0, ans=0.0 2023-10-02 09:58:19,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:58:19,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:58:20,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 09:58:22,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:58:23,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 09:58:25,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:25,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 09:58:25,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:58:29,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 09:58:30,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 09:58:31,841 INFO [train.py:1046] (3/4) Epoch 24, batch 3700, loss[loss=0.1528, simple_loss=0.2336, pruned_loss=0.03598, over 24312.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.247, pruned_loss=0.04703, over 4698924.92 frames. ], batch size: 61, lr: 4.26e-03, grad_scale: 32.0 2023-10-02 09:58:33,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 09:58:35,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:35,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 09:58:35,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:58:36,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 09:58:36,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 09:58:41,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 09:58:44,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:58:45,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:58:45,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 09:58:45,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 09:58:46,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 09:58:48,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:58:49,761 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 09:58:56,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 09:58:56,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 09:58:58,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 09:58:58,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 09:58:58,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:59:02,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:03,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 09:59:05,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:06,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 09:59:11,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:11,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 09:59:13,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 09:59:16,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 09:59:16,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 09:59:17,446 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.801e+02 2.013e+02 2.164e+02 3.512e+02, threshold=4.027e+02, percent-clipped=0.0 2023-10-02 09:59:17,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 09:59:18,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 09:59:19,651 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.15 vs. limit=12.0 2023-10-02 09:59:23,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 09:59:23,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 09:59:25,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:59:25,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 09:59:27,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 09:59:28,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 09:59:28,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:59:28,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 09:59:30,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=839453.3333333334, ans=0.125 2023-10-02 09:59:31,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 09:59:32,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.31 vs. limit=12.0 2023-10-02 09:59:32,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 09:59:32,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 09:59:34,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 09:59:34,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:59:35,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 09:59:36,168 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 09:59:36,437 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.45 vs. limit=15.0 2023-10-02 09:59:37,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 09:59:40,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 09:59:41,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 09:59:42,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 09:59:45,148 INFO [train.py:1046] (3/4) Epoch 24, batch 3750, loss[loss=0.1519, simple_loss=0.2272, pruned_loss=0.03833, over 24471.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2483, pruned_loss=0.04755, over 4704447.91 frames. ], batch size: 58, lr: 4.26e-03, grad_scale: 16.0 2023-10-02 09:59:45,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 09:59:46,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 09:59:47,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 09:59:49,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 09:59:50,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 09:59:50,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:59:52,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 09:59:52,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 09:59:56,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:00:00,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:00:00,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:00:01,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:00:05,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:00:06,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 10:00:07,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:00:09,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:00:09,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:00:14,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 10:00:17,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 10:00:19,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:00:19,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:00:22,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:00:25,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:00:25,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 10:00:25,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=839653.3333333334, ans=0.125 2023-10-02 10:00:27,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=839720.0, ans=0.0 2023-10-02 10:00:29,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 10:00:33,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:00:37,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:00:37,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:00:41,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:00:45,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 10:00:47,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:00:49,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:00:50,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:00:53,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:00:57,804 INFO [train.py:1046] (3/4) Epoch 24, batch 3800, loss[loss=0.1573, simple_loss=0.2325, pruned_loss=0.04108, over 19393.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2482, pruned_loss=0.0473, over 4711445.39 frames. ], batch size: 42, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:00:59,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=839853.3333333334, ans=0.0 2023-10-02 10:00:59,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=839853.3333333334, ans=0.125 2023-10-02 10:01:01,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:01:02,508 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.59 vs. limit=15.0 2023-10-02 10:01:04,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:06,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 10:01:07,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 10:01:08,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:01:10,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:01:12,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 10:01:13,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.52 vs. limit=22.5 2023-10-02 10:01:15,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 10:01:15,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:16,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:01:17,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:01:18,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:01:18,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:01:20,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 10:01:21,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 10:01:21,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:01:23,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=839920.0, ans=6.0 2023-10-02 10:01:24,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:01:26,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:01:27,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:01:28,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 10:01:30,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:01:31,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:33,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:01:36,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 10:01:36,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 10:01:37,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:01:42,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:01:44,019 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.979e+02 2.395e+02 2.879e+02 4.810e+02, threshold=4.790e+02, percent-clipped=5.0 2023-10-02 10:01:48,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:01:50,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 10:01:52,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 10:01:54,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:01:54,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:01:55,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:01:56,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 10:01:59,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 10:01:59,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 10:01:59,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:02:01,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:02:06,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:02:08,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:02:11,995 INFO [train.py:1046] (3/4) Epoch 24, batch 3850, loss[loss=0.1744, simple_loss=0.2539, pruned_loss=0.04742, over 24096.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.247, pruned_loss=0.04699, over 4693224.84 frames. ], batch size: 80, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:02:13,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:02:13,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 10:02:14,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:02:16,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:02:18,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:02:18,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=840186.6666666666, ans=0.125 2023-10-02 10:02:21,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:02:23,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 10:02:23,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 10:02:29,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:30,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:02:32,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:02:32,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=840253.3333333334, ans=0.0 2023-10-02 10:02:33,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:02:36,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:38,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:02:38,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:02:38,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:02:40,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:02:43,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:02:44,485 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.64 vs. limit=12.0 2023-10-02 10:02:44,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:44,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:02:45,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 10:02:45,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 10:02:46,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:02:46,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:47,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:02:47,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:02:49,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 10:02:51,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 10:02:53,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:02:54,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 10:02:56,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 10:03:00,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:03:01,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:03:03,942 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.99 vs. limit=15.0 2023-10-02 10:03:06,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:03:06,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 10:03:09,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 10:03:13,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:13,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:16,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:03:16,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:03:17,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:18,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:18,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:03:18,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 10:03:20,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:03:21,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 10:03:21,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:21,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:24,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:03:24,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:25,750 INFO [train.py:1046] (3/4) Epoch 24, batch 3900, loss[loss=0.1845, simple_loss=0.2655, pruned_loss=0.0517, over 24617.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2457, pruned_loss=0.04698, over 4699524.95 frames. ], batch size: 65, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:03:25,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:03:27,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:03:27,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:03:27,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:03:27,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 10:03:28,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:32,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:03:32,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 10:03:32,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:03:35,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:03:36,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 10:03:38,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:39,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:03:41,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 10:03:41,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:03:42,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 10:03:44,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:03:44,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 10:03:46,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 10:03:46,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=840586.6666666666, ans=0.0 2023-10-02 10:03:52,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:03:52,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=840586.6666666666, ans=0.0 2023-10-02 10:03:53,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:03:53,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:03:53,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:03:56,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:03:57,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:04:00,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:04:00,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:04:00,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:04:04,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=840653.3333333334, ans=0.125 2023-10-02 10:04:05,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:04:05,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:04:09,706 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.982e+02 2.222e+02 2.624e+02 4.261e+02, threshold=4.444e+02, percent-clipped=0.0 2023-10-02 10:04:15,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:04:16,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:04:25,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:04:28,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:04:29,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 10:04:29,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 10:04:29,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:04:30,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 10:04:32,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:04:32,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 10:04:38,034 INFO [train.py:1046] (3/4) Epoch 24, batch 3950, loss[loss=0.1589, simple_loss=0.2334, pruned_loss=0.04223, over 23362.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2458, pruned_loss=0.04683, over 4698990.95 frames. ], batch size: 105, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:04:40,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:04:42,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 10:04:44,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:04:46,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:04:46,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:04:53,162 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 10:04:53,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:04:53,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 10:04:55,012 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 10:04:55,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:04:57,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:04:57,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:04:57,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:04:58,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=840920.0, ans=0.0 2023-10-02 10:05:00,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 10:05:03,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:05:04,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:05:04,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:05:04,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:05:06,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:05:10,660 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.33 vs. limit=12.0 2023-10-02 10:05:16,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:05:16,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:05:17,361 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.78 vs. limit=15.0 2023-10-02 10:05:21,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 10:05:27,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 10:05:27,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 10:05:27,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:05:29,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:05:36,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:05:36,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:05:36,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:05:36,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:05:38,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 10:05:41,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:05:42,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:05:46,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 10:05:50,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=841186.6666666666, ans=0.1 2023-10-02 10:05:51,371 INFO [train.py:1046] (3/4) Epoch 24, batch 4000, loss[loss=0.1813, simple_loss=0.265, pruned_loss=0.04878, over 24011.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2465, pruned_loss=0.04666, over 4717695.83 frames. ], batch size: 80, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:05:54,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=841186.6666666666, ans=0.0 2023-10-02 10:05:57,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:06:02,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:06:03,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=841186.6666666666, ans=0.0 2023-10-02 10:06:08,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:06:08,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:06:09,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:06:09,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 10:06:09,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:06:11,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 10:06:11,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:06:11,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 10:06:13,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:06:13,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=841253.3333333334, ans=0.04949747468305833 2023-10-02 10:06:17,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:06:17,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:06:17,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:06:17,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:06:17,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:06:19,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:06:21,209 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 10:06:21,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.44 vs. limit=15.0 2023-10-02 10:06:22,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:06:22,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:06:25,526 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 10:06:27,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:06:27,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:06:32,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 10:06:32,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:06:35,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:06:35,409 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 10:06:36,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:06:38,065 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.830e+02 2.096e+02 2.397e+02 3.466e+02, threshold=4.191e+02, percent-clipped=0.0 2023-10-02 10:06:38,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 10:06:38,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:06:39,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:06:39,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:06:41,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:06:42,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:06:42,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:06:43,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 10:06:45,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:06:46,906 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 10:06:52,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:06:54,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 10:06:54,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:06:56,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:06:57,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:06:57,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:06:58,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=841453.3333333334, ans=0.125 2023-10-02 10:06:59,808 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.64 vs. limit=12.0 2023-10-02 10:07:03,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:07:04,745 INFO [train.py:1046] (3/4) Epoch 24, batch 4050, loss[loss=0.1615, simple_loss=0.2397, pruned_loss=0.04162, over 23577.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2471, pruned_loss=0.04709, over 4717407.38 frames. ], batch size: 134, lr: 4.25e-03, grad_scale: 16.0 2023-10-02 10:07:04,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 10:07:06,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 10:07:07,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:07:07,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:07:08,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:07:10,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:07:11,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:07:14,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:07:18,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:07:19,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 10:07:21,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:07:21,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:07:24,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:07:25,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:07:29,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 10:07:29,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=841586.6666666666, ans=0.125 2023-10-02 10:07:31,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 10:07:31,632 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 10:07:34,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:07:39,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 10:07:40,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:07:44,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:07:46,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:07:46,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:07:47,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:07:51,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:07:55,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 10:07:55,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:07:57,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:07:58,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 10:08:02,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:08:08,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 10:08:09,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:08:09,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:08:09,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=841786.6666666666, ans=0.0 2023-10-02 10:08:12,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 10:08:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 10:08:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:08:15,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:08:16,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:16,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:08:18,182 INFO [train.py:1046] (3/4) Epoch 24, batch 4100, loss[loss=0.1804, simple_loss=0.2514, pruned_loss=0.05468, over 23437.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.248, pruned_loss=0.04756, over 4716627.48 frames. ], batch size: 285, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:08:23,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 10:08:24,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 10:08:24,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=841853.3333333334, ans=0.07 2023-10-02 10:08:25,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 10:08:27,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 10:08:27,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:08:28,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:28,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:29,120 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.24 vs. limit=15.0 2023-10-02 10:08:29,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:08:29,953 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 10:08:31,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:08:32,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:08:32,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:08:34,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:08:34,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=841920.0, ans=0.125 2023-10-02 10:08:39,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:08:39,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:08:41,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:08:41,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 10:08:42,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:08:42,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:08:42,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:08:42,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:08:44,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 10:08:47,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:08:49,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 10:08:50,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:08:52,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:08:52,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 10:08:52,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2.whitening_limit, batch_count=841986.6666666666, ans=15.0 2023-10-02 10:08:53,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:08:53,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:08:53,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:08:55,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 10:08:57,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:08:57,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:08:59,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 10:09:00,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:09:02,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:09:03,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:09:03,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=842053.3333333334, ans=0.95 2023-10-02 10:09:06,741 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.453e+02 1.905e+02 2.091e+02 2.337e+02 3.438e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 10:09:09,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:09:10,069 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.10 vs. limit=10.0 2023-10-02 10:09:10,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:09:12,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:09:22,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:09:22,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:09:24,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:09:24,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:09:24,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=842120.0, ans=0.1 2023-10-02 10:09:26,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=842120.0, ans=0.1 2023-10-02 10:09:27,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=842120.0, ans=0.1 2023-10-02 10:09:27,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=842120.0, ans=0.125 2023-10-02 10:09:30,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:09:30,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:09:31,401 INFO [train.py:1046] (3/4) Epoch 24, batch 4150, loss[loss=0.1886, simple_loss=0.2608, pruned_loss=0.05819, over 23197.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.249, pruned_loss=0.04835, over 4704703.13 frames. ], batch size: 105, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:09:31,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:09:31,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:09:32,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 10:09:34,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:09:34,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 10:09:36,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 10:09:36,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 10:09:36,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=842186.6666666666, ans=0.125 2023-10-02 10:09:37,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:09:41,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:09:41,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:09:44,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:09:46,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:09:48,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:09:50,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:09:50,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:09:50,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=842253.3333333334, ans=0.125 2023-10-02 10:09:51,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:09:55,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:10:00,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:10:02,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 10:10:05,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 10:10:05,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:10:06,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 10:10:06,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:10:06,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:10:09,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:10,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:10:14,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 10:10:17,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:10:21,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:10:21,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 10:10:23,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:10:24,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 10:10:26,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:10:26,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:10:27,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:29,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 10:10:29,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:10:29,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 10:10:30,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:10:33,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 10:10:33,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:34,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:10:34,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:10:34,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 10:10:34,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:10:36,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 10:10:36,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:10:39,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:10:39,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 10:10:40,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:10:44,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.91 vs. limit=15.0 2023-10-02 10:10:45,062 INFO [train.py:1046] (3/4) Epoch 24, batch 4200, loss[loss=0.1795, simple_loss=0.2652, pruned_loss=0.04687, over 24294.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2479, pruned_loss=0.04798, over 4695742.07 frames. ], batch size: 74, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:10:45,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:10:47,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 10:10:48,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=842520.0, ans=0.0 2023-10-02 10:10:49,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:10:51,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:10:52,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:10:54,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:10:54,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:10:56,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 10:10:57,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 10:10:57,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=842520.0, ans=0.1 2023-10-02 10:10:58,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:01,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:11:04,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:11:07,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 10:11:08,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:11:09,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:09,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 10:11:09,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:11:11,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:12,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:11:12,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:11:13,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:11:15,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 10:11:17,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:11:21,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 10:11:21,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:11:23,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=842653.3333333334, ans=0.1 2023-10-02 10:11:24,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:11:24,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:11:27,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:11:27,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 10:11:27,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:11:29,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:11:34,515 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.468e+02 1.872e+02 2.085e+02 2.351e+02 3.667e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-02 10:11:34,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:11:36,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:11:40,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:11:43,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 10:11:46,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:11:51,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:11:52,469 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.25 vs. limit=15.0 2023-10-02 10:11:52,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:11:54,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 10:11:55,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.91 vs. limit=10.0 2023-10-02 10:12:00,368 INFO [train.py:1046] (3/4) Epoch 24, batch 4250, loss[loss=0.1584, simple_loss=0.2266, pruned_loss=0.04511, over 23458.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.246, pruned_loss=0.04762, over 4676643.04 frames. ], batch size: 285, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:12:00,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:12:04,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:12:04,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:12:07,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:09,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=842853.3333333334, ans=0.025 2023-10-02 10:12:10,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:12:11,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 10:12:11,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:12:14,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:17,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:12:19,545 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.94 vs. limit=15.0 2023-10-02 10:12:22,725 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.17 vs. limit=15.0 2023-10-02 10:12:23,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:23,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:24,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:12:24,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:12:27,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:29,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:30,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:33,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:12:33,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:12:33,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=842986.6666666666, ans=0.125 2023-10-02 10:12:34,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 10:12:37,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 10:12:37,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:38,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.61 vs. limit=15.0 2023-10-02 10:12:39,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:12:39,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:12:40,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:12:40,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:41,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:12:43,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:12:44,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:12:48,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:12:48,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=843053.3333333334, ans=0.125 2023-10-02 10:12:52,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:12:53,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 10:12:53,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:12:53,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=843053.3333333334, ans=0.2 2023-10-02 10:12:54,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 10:12:55,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:12:55,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:12:57,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:12:58,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:12:58,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=843120.0, ans=0.125 2023-10-02 10:12:59,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 10:13:03,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:13:03,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:13:06,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:13:10,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:13:10,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:13:10,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=843120.0, ans=0.0 2023-10-02 10:13:11,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:13:12,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:13:14,173 INFO [train.py:1046] (3/4) Epoch 24, batch 4300, loss[loss=0.1443, simple_loss=0.2221, pruned_loss=0.03324, over 20434.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2457, pruned_loss=0.04698, over 4685760.23 frames. ], batch size: 44, lr: 4.25e-03, grad_scale: 8.0 2023-10-02 10:13:14,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:13:14,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:13:14,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 10:13:15,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:13:20,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:13:20,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:13:26,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:13:34,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:13:34,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 10:13:34,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:13:37,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:13:37,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:13:38,442 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 10:13:41,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:13:42,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:13:44,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 10:13:44,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:13:45,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 10:13:46,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 10:13:49,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:13:53,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:13:53,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:13:54,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:13:56,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:13:56,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:13:56,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 10:13:58,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 10:14:00,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:14:02,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:02,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 10:14:02,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:02,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:14:03,759 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.778e+02 1.987e+02 2.231e+02 3.215e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 10:14:03,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 10:14:03,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 10:14:03,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 10:14:05,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:14:05,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 10:14:05,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=843386.6666666666, ans=0.125 2023-10-02 10:14:06,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 10:14:09,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:14:10,828 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 10:14:10,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:14:11,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=843386.6666666666, ans=0.125 2023-10-02 10:14:12,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:12,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:14:14,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 10:14:14,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:14:16,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:16,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:14:16,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:14:16,900 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.12 vs. limit=15.0 2023-10-02 10:14:17,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:14:22,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:14:23,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:25,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:14:25,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:14:28,400 INFO [train.py:1046] (3/4) Epoch 24, batch 4350, loss[loss=0.1878, simple_loss=0.257, pruned_loss=0.05927, over 23584.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2467, pruned_loss=0.04669, over 4713821.73 frames. ], batch size: 256, lr: 4.25e-03, grad_scale: 4.0 2023-10-02 10:14:31,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 10:14:32,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:14:37,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:14:38,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:40,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:14:40,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:14:45,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:14:48,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:14:49,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:14:49,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:14:51,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:14:53,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:14:55,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:14:55,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=843586.6666666666, ans=0.0 2023-10-02 10:15:01,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 10:15:01,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:15:03,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:07,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:08,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 10:15:11,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:15:12,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:15:17,145 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 10:15:18,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:15:18,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:15:20,356 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 10:15:20,417 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 10:15:20,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:15:21,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:15:21,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:15:23,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:15:23,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:15:23,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:15:26,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 10:15:26,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:26,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:15:27,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:27,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 10:15:27,952 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 10:15:27,956 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 10:15:29,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 10:15:32,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:15:32,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:15:33,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:15:34,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:15:36,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 10:15:38,125 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 10:15:39,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:41,928 INFO [train.py:1046] (3/4) Epoch 24, batch 4400, loss[loss=0.1579, simple_loss=0.2436, pruned_loss=0.03609, over 24403.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.248, pruned_loss=0.04708, over 4720324.36 frames. ], batch size: 74, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:15:41,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:15:42,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:43,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:15:44,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 10:15:44,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 10:15:46,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 10:15:46,304 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 10:15:48,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:15:48,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:15:48,914 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.49 vs. limit=6.0 2023-10-02 10:15:51,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 10:15:54,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:15:55,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:15:55,564 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 10:15:55,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=843920.0, ans=0.0 2023-10-02 10:15:58,949 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.97 vs. limit=15.0 2023-10-02 10:15:59,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:15:59,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 10:16:01,556 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 10:16:03,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 10:16:04,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 10:16:04,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 10:16:05,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:16:05,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=843920.0, ans=0.0 2023-10-02 10:16:07,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:16:07,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:16:07,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:16:08,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 10:16:08,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 10:16:09,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:16:12,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:16:12,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:16:14,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:16:15,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:16:15,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 10:16:16,933 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 10:16:19,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:16:26,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:16:29,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 10:16:31,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:16:32,287 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.852e+02 2.093e+02 2.430e+02 3.908e+02, threshold=4.185e+02, percent-clipped=0.0 2023-10-02 10:16:33,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:16:37,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:16:38,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 10:16:38,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:16:38,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:16:38,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:16:38,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:16:42,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 10:16:44,450 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.15 vs. limit=15.0 2023-10-02 10:16:46,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 10:16:46,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 10:16:46,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:16:47,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 10:16:47,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:16:52,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:16:55,150 INFO [train.py:1046] (3/4) Epoch 24, batch 4450, loss[loss=0.1544, simple_loss=0.2335, pruned_loss=0.03762, over 24583.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2485, pruned_loss=0.04733, over 4715604.28 frames. ], batch size: 60, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:16:55,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 10:16:55,524 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:17:00,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:17:01,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:02,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:17:10,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:17:10,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:17:14,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:16,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:17:17,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:17:17,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:17:19,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 10:17:19,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:17:20,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:20,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:17:20,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:17:23,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 10:17:27,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:29,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:31,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:17:31,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:17:31,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:17:35,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 10:17:37,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 10:17:38,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 10:17:38,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:17:40,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:17:41,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 10:17:43,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=844386.6666666666, ans=0.125 2023-10-02 10:17:44,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:17:47,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:48,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 10:17:48,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:17:48,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:17:48,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:17:48,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:17:51,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:17:54,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:17:54,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 10:17:57,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:17:59,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:18:01,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:18:02,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:18:02,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 10:18:05,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:18:09,745 INFO [train.py:1046] (3/4) Epoch 24, batch 4500, loss[loss=0.185, simple_loss=0.2541, pruned_loss=0.05799, over 23768.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2488, pruned_loss=0.04758, over 4711023.63 frames. ], batch size: 212, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:18:09,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 10:18:11,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:18:14,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=844520.0, ans=0.0 2023-10-02 10:18:15,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:18:17,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 10:18:17,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 10:18:18,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:18:23,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:18:23,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:18:25,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:18:25,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:18:25,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:18:26,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:18:36,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:18:38,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:18:40,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:18:42,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:18:43,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:18:48,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:18:51,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:18:55,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:18:55,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=844720.0, ans=10.0 2023-10-02 10:18:56,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:18:58,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 10:18:59,243 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.429e+02 1.850e+02 2.033e+02 2.343e+02 3.798e+02, threshold=4.065e+02, percent-clipped=0.0 2023-10-02 10:18:59,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:18:59,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:19:02,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:19:03,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:19:04,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:19:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 10:19:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:19:04,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:09,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:19:09,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:19:12,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:15,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:19:15,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:19:18,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 10:19:18,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 10:19:18,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 10:19:21,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 10:19:22,500 INFO [train.py:1046] (3/4) Epoch 24, batch 4550, loss[loss=0.1564, simple_loss=0.2369, pruned_loss=0.03791, over 24454.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2481, pruned_loss=0.04747, over 4708782.64 frames. ], batch size: 63, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:19:23,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 10:19:25,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:19:28,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:19:28,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=844853.3333333334, ans=0.0 2023-10-02 10:19:29,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:19:33,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:19:36,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:19:37,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:19:40,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:19:40,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:19:40,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:19:42,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:19:44,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:19:46,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:19:47,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 10:19:49,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 10:19:50,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:19:51,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 10:19:52,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=844986.6666666666, ans=0.0 2023-10-02 10:19:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 10:19:54,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:19:57,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 10:19:59,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:20:02,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:02,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:02,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:20:03,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 10:20:06,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:20:08,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:08,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:20:09,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:20:10,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 10:20:11,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 10:20:11,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:20:11,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=845053.3333333334, ans=0.125 2023-10-02 10:20:12,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 10:20:15,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 10:20:15,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:20:18,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:20:18,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:20:18,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:19,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:20:19,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:20:21,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 10:20:23,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:20:23,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 10:20:23,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 10:20:23,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:20:23,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 10:20:27,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:20:27,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:20:30,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:20:30,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:20:30,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:20:32,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:20:34,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:20:36,062 INFO [train.py:1046] (3/4) Epoch 24, batch 4600, loss[loss=0.1512, simple_loss=0.218, pruned_loss=0.04217, over 22742.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2466, pruned_loss=0.04696, over 4703402.81 frames. ], batch size: 322, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:20:36,912 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.40 vs. limit=22.5 2023-10-02 10:20:37,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:20:38,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:20:41,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:20:41,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:20:41,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=845186.6666666666, ans=0.1 2023-10-02 10:20:42,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:20:44,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 10:20:44,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:20:47,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:20:49,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:20:53,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:20:57,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=845253.3333333334, ans=0.125 2023-10-02 10:20:59,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 10:20:59,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:00,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=845253.3333333334, ans=0.04949747468305833 2023-10-02 10:21:02,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:02,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=845253.3333333334, ans=0.1 2023-10-02 10:21:03,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:21:03,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:21:09,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=845320.0, ans=0.125 2023-10-02 10:21:11,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 10:21:11,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:21:11,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:21:16,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:18,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:21:18,412 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:21:19,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:21:22,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 10:21:23,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:21:25,025 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.808e+02 1.985e+02 2.322e+02 3.064e+02, threshold=3.969e+02, percent-clipped=0.0 2023-10-02 10:21:29,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:29,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:21:30,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=845386.6666666666, ans=0.1 2023-10-02 10:21:33,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:33,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 10:21:33,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:35,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 10:21:35,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:36,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:21:36,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:21:36,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=845453.3333333334, ans=0.1 2023-10-02 10:21:37,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:21:38,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:21:39,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 10:21:39,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 10:21:40,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 10:21:40,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:21:42,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:21:43,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:21:45,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:21:49,429 INFO [train.py:1046] (3/4) Epoch 24, batch 4650, loss[loss=0.1652, simple_loss=0.2534, pruned_loss=0.03848, over 24642.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2474, pruned_loss=0.04693, over 4718666.91 frames. ], batch size: 65, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:21:52,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:21:55,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:21:55,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:21:55,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:21:56,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:21:56,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:21:56,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=845520.0, ans=0.125 2023-10-02 10:21:58,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:22:01,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 10:22:06,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:22:08,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 10:22:09,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:22:09,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=845586.6666666666, ans=0.125 2023-10-02 10:22:10,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 10:22:10,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:22:11,394 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.43 vs. limit=12.0 2023-10-02 10:22:11,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 10:22:11,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 10:22:11,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:11,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:22:15,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:22:16,148 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.51 vs. limit=8.0 2023-10-02 10:22:17,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:22:17,849 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 10:22:21,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:22:21,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 10:22:26,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:26,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:22:26,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 10:22:27,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:22:30,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:22:31,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:22:38,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:41,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:22:41,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:22:42,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:22:43,159 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.46 vs. limit=15.0 2023-10-02 10:22:44,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 10:22:45,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 10:22:45,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 10:22:45,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 10:22:48,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:22:55,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:22:55,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:22:55,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 10:22:55,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:22:56,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:22:56,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:22:59,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:23:01,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:23:01,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:23:02,339 INFO [train.py:1046] (3/4) Epoch 24, batch 4700, loss[loss=0.1667, simple_loss=0.2398, pruned_loss=0.04677, over 23789.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2474, pruned_loss=0.04678, over 4711845.07 frames. ], batch size: 150, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:23:02,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:23:09,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:23:09,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:23:09,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:23:09,700 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.63 vs. limit=15.0 2023-10-02 10:23:10,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 10:23:10,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:23:12,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 10:23:19,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=845920.0, ans=0.125 2023-10-02 10:23:20,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:23:22,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:23:22,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:23:23,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:23:25,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:23:29,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 10:23:29,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 10:23:30,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:23:32,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:23:33,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:23:35,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=845986.6666666666, ans=0.125 2023-10-02 10:23:36,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:23:36,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=845986.6666666666, ans=0.0 2023-10-02 10:23:42,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:23:42,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 10:23:44,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:23:50,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 10:23:50,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=846053.3333333334, ans=0.0 2023-10-02 10:23:51,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:23:53,103 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.791e+02 2.008e+02 2.250e+02 3.556e+02, threshold=4.016e+02, percent-clipped=0.0 2023-10-02 10:23:53,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:23:57,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 10:23:58,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:24:02,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:24:02,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 10:24:02,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=846120.0, ans=0.0 2023-10-02 10:24:04,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:04,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:24:07,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:24:08,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:24:08,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 10:24:09,016 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 10:24:12,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:24:13,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:13,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:13,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 10:24:15,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=846186.6666666666, ans=0.2 2023-10-02 10:24:16,543 INFO [train.py:1046] (3/4) Epoch 24, batch 4750, loss[loss=0.1747, simple_loss=0.2614, pruned_loss=0.04397, over 24318.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.248, pruned_loss=0.04719, over 4709405.20 frames. ], batch size: 74, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:24:16,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:24:16,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=846186.6666666666, ans=0.1 2023-10-02 10:24:20,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 10:24:22,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:24:24,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:24:28,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:24:29,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:24:29,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 10:24:30,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:24:35,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 10:24:36,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:24:37,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:24:37,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:24:42,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 10:24:44,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=846320.0, ans=0.125 2023-10-02 10:24:47,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:24:50,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 10:24:50,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:24:52,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:24:52,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:24:52,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:24:53,820 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 10:24:53,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 10:24:57,243 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.92 vs. limit=22.5 2023-10-02 10:24:59,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 10:25:00,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:25:03,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:03,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:25:03,649 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 10:25:03,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=846386.6666666666, ans=0.125 2023-10-02 10:25:04,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:25:06,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=846386.6666666666, ans=0.125 2023-10-02 10:25:08,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:25:09,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:25:10,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 10:25:12,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 10:25:12,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:25:14,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:25:14,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:25:14,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:25:15,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 10:25:17,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 10:25:17,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=846453.3333333334, ans=0.2 2023-10-02 10:25:20,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:25:22,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:25:22,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 10:25:23,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:25:25,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:25:26,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:25:27,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:28,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:25:30,623 INFO [train.py:1046] (3/4) Epoch 24, batch 4800, loss[loss=0.1716, simple_loss=0.2638, pruned_loss=0.03965, over 24651.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.249, pruned_loss=0.04759, over 4712346.45 frames. ], batch size: 73, lr: 4.24e-03, grad_scale: 16.0 2023-10-02 10:25:32,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:25:32,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 10:25:32,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 10:25:32,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=846520.0, ans=0.07 2023-10-02 10:25:34,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 10:25:36,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:25:36,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:25:37,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 10:25:38,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=846520.0, ans=0.1 2023-10-02 10:25:42,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:42,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:25:44,798 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.79 vs. limit=10.0 2023-10-02 10:25:48,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:25:50,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:25:50,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:25:50,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 10:25:52,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:25:52,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:25:53,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:25:57,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:25:58,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:25:59,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:26:00,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:00,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 10:26:00,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:26:01,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:26:04,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:05,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:26:08,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:26:08,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:26:10,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 10:26:10,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:12,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 10:26:12,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 10:26:14,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:14,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:26:14,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:26:14,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:26:14,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:26:17,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:26:17,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:26:21,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:26:23,283 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.885e+02 2.115e+02 2.514e+02 3.907e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-02 10:26:24,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:27,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:26:30,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 10:26:30,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:26:31,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:31,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:26:33,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:35,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:26:37,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:26:37,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:38,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:26:39,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:26:40,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:26:44,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.98 vs. limit=15.0 2023-10-02 10:26:44,818 INFO [train.py:1046] (3/4) Epoch 24, batch 4850, loss[loss=0.1783, simple_loss=0.2557, pruned_loss=0.05049, over 23676.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2491, pruned_loss=0.04779, over 4703733.96 frames. ], batch size: 85, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:26:44,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:26:44,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:44,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:26:46,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 10:26:48,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 10:26:48,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:48,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:26:50,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:26:50,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:26:53,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:26:58,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 10:26:59,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:27:04,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:27:04,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:27:05,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:27:08,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:27:10,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:27:11,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:27:11,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 10:27:13,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:27:17,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:27:17,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:27:17,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:27:17,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 10:27:20,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:27:20,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:25,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:25,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 10:27:25,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 10:27:28,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:27:36,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:27:36,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 10:27:36,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:27:36,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:27:37,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:27:38,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=847053.3333333334, ans=0.035 2023-10-02 10:27:41,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 10:27:41,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:41,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 10:27:41,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:27:42,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:27:42,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=847120.0, ans=0.125 2023-10-02 10:27:44,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 10:27:51,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:27:56,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:27:56,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:27:57,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=847120.0, ans=15.0 2023-10-02 10:27:59,049 INFO [train.py:1046] (3/4) Epoch 24, batch 4900, loss[loss=0.1551, simple_loss=0.2186, pruned_loss=0.04577, over 23545.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2474, pruned_loss=0.04752, over 4689351.61 frames. ], batch size: 256, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:28:00,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 10:28:00,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:28:05,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:28:06,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:28:06,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:28:07,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=847186.6666666666, ans=0.0 2023-10-02 10:28:08,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 10:28:11,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=847186.6666666666, ans=0.125 2023-10-02 10:28:11,688 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.01 vs. limit=6.0 2023-10-02 10:28:12,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 10:28:17,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 10:28:17,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=847253.3333333334, ans=0.1 2023-10-02 10:28:18,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 10:28:18,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:28:18,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:28:18,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:28:18,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:28:18,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:28:20,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 10:28:21,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=847253.3333333334, ans=0.1 2023-10-02 10:28:24,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 10:28:24,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:28:25,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:28:27,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:28:27,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=847320.0, ans=0.125 2023-10-02 10:28:29,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:28:30,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:28:32,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:28:32,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 10:28:33,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:28:33,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:28:33,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 10:28:35,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 10:28:36,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 10:28:39,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:28:40,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:28:40,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:28:42,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:28:42,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 10:28:42,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:28:42,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=847386.6666666666, ans=0.125 2023-10-02 10:28:44,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 10:28:47,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:28:48,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:28:49,088 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.00 vs. limit=15.0 2023-10-02 10:28:51,078 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.854e+02 2.024e+02 2.236e+02 3.450e+02, threshold=4.049e+02, percent-clipped=0.0 2023-10-02 10:28:51,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:28:55,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 10:28:56,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:28:56,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 10:28:56,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=847453.3333333334, ans=0.025 2023-10-02 10:28:57,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 10:29:03,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=847453.3333333334, ans=0.09899494936611666 2023-10-02 10:29:05,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:29:05,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:29:07,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 10:29:07,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 10:29:07,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:29:07,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=847453.3333333334, ans=0.125 2023-10-02 10:29:08,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:29:11,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:29:11,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:29:12,841 INFO [train.py:1046] (3/4) Epoch 24, batch 4950, loss[loss=0.1549, simple_loss=0.2236, pruned_loss=0.04311, over 24477.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2457, pruned_loss=0.04692, over 4695226.28 frames. ], batch size: 58, lr: 4.24e-03, grad_scale: 8.0 2023-10-02 10:29:12,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:29:12,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 10:29:13,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:29:16,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:29:16,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 10:29:19,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 10:29:20,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 10:29:20,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:29:22,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 10:29:22,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:22,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=847520.0, ans=0.0 2023-10-02 10:29:23,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:29:23,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:29:23,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:29:26,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:29:26,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:29:27,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:29:27,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:29:31,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:31,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:29:33,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:29:38,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:39,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:29:40,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=847586.6666666666, ans=0.125 2023-10-02 10:29:41,410 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:29:42,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:29:42,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:29:45,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:29:47,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 10:29:47,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 10:29:50,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:29:53,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:29:53,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:29:53,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:29:53,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:29:54,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:29:57,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:29:57,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=847720.0, ans=0.125 2023-10-02 10:29:59,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:30:02,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:30:04,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:30:04,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:30:06,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 10:30:06,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:30:06,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=847720.0, ans=0.2 2023-10-02 10:30:06,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=847720.0, ans=0.1 2023-10-02 10:30:07,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:30:11,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:30:11,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:30:12,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:30:12,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:30:12,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=847786.6666666666, ans=0.125 2023-10-02 10:30:13,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:30:13,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:30:16,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:30:16,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:30:17,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:30:18,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 10:30:24,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:30:27,215 INFO [train.py:1046] (3/4) Epoch 24, batch 5000, loss[loss=0.1622, simple_loss=0.2321, pruned_loss=0.04618, over 23806.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2454, pruned_loss=0.0466, over 4711787.12 frames. ], batch size: 212, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:30:27,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 10:30:27,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 10:30:35,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:30:35,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:30:35,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 10:30:37,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 10:30:40,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:30:40,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 10:30:40,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:30:40,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:30:42,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 10:30:43,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:30:43,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:30:44,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 10:30:44,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:30:44,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:30:46,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 10:30:46,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 10:30:48,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:30:48,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 10:30:48,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:30:48,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:30:49,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:30:49,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 10:30:49,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 10:30:51,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 10:30:51,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:30:51,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:30:53,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 10:30:53,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:30:53,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.00 vs. limit=12.0 2023-10-02 10:30:55,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:30:55,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:30:56,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=847986.6666666666, ans=0.125 2023-10-02 10:30:57,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 10:31:00,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 10:31:02,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:31:03,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:31:06,221 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 10:31:09,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:31:10,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:31:10,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:13,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 10:31:13,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:31:14,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:31:15,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:31:16,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 10:31:17,238 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.56 vs. limit=22.5 2023-10-02 10:31:18,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:31:19,762 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.837e+02 2.022e+02 2.310e+02 4.101e+02, threshold=4.045e+02, percent-clipped=1.0 2023-10-02 10:31:19,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:31:19,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:31:24,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 10:31:28,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:37,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:31:40,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:40,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:31:40,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:31:41,903 INFO [train.py:1046] (3/4) Epoch 24, batch 5050, loss[loss=0.1731, simple_loss=0.2496, pruned_loss=0.0483, over 23411.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2458, pruned_loss=0.04664, over 4704393.99 frames. ], batch size: 93, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:31:41,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:31:41,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:31:42,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:45,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=848186.6666666666, ans=0.1 2023-10-02 10:31:46,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:31:46,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 10:31:48,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:31:49,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:31:51,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:31:51,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 10:31:53,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:31:53,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:31:55,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:31:56,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:31:58,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:31:59,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=848253.3333333334, ans=0.0 2023-10-02 10:32:07,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 10:32:07,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:32:08,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:32:08,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 10:32:09,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:32:10,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:32:10,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:32:11,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:32:11,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 10:32:13,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 10:32:14,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:32:17,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:32:20,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:32:20,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 10:32:21,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:32:23,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 10:32:24,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:32:26,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:32:26,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:32:28,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:32:29,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:32:31,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:32:32,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:32,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:32:32,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:32:32,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 10:32:34,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:32:35,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:32:37,833 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.18 vs. limit=6.0 2023-10-02 10:32:38,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:32:38,168 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 10:32:38,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:32:40,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:32:40,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:41,450 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 10:32:45,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:32:45,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 10:32:45,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:49,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:32:49,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:32:49,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 10:32:51,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 10:32:53,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:32:53,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:32:53,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=848453.3333333334, ans=0.125 2023-10-02 10:32:54,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:32:56,189 INFO [train.py:1046] (3/4) Epoch 24, batch 5100, loss[loss=0.1695, simple_loss=0.2413, pruned_loss=0.04881, over 23849.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2473, pruned_loss=0.0473, over 4703990.55 frames. ], batch size: 212, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:32:56,331 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 10:32:56,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=848520.0, ans=0.125 2023-10-02 10:32:58,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:33:03,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 10:33:03,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 10:33:04,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:33:06,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:33:08,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:33:09,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 10:33:09,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 10:33:14,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:33:14,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=848586.6666666666, ans=0.125 2023-10-02 10:33:15,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:33:16,188 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.81 vs. limit=15.0 2023-10-02 10:33:18,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.11 vs. limit=22.5 2023-10-02 10:33:20,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:33:22,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 10:33:22,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:33:24,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=848653.3333333334, ans=0.0 2023-10-02 10:33:25,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:33:25,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 10:33:28,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:28,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:28,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 10:33:31,541 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 10:33:31,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=848653.3333333334, ans=0.1 2023-10-02 10:33:32,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:32,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 10:33:32,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 10:33:36,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:33:45,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:33:46,939 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.871e+02 2.071e+02 2.316e+02 3.219e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-02 10:33:48,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 10:33:49,020 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 10:33:49,029 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 10:33:51,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 10:33:51,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:33:54,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 10:33:57,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 10:33:59,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 10:34:00,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:34:03,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 10:34:05,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:34:05,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 10:34:05,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=848786.6666666666, ans=0.1 2023-10-02 10:34:09,210 INFO [train.py:1046] (3/4) Epoch 24, batch 5150, loss[loss=0.2076, simple_loss=0.2726, pruned_loss=0.07131, over 19475.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2481, pruned_loss=0.04699, over 4707207.50 frames. ], batch size: 388, lr: 4.23e-03, grad_scale: 4.0 2023-10-02 10:34:11,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:34:11,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:34:11,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:34:11,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:34:12,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:34:12,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:34:14,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 10:34:14,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 10:34:14,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 10:34:14,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=848853.3333333334, ans=0.0 2023-10-02 10:34:15,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:34:15,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 10:34:18,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:34:18,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 10:34:20,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:34:21,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:34:21,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=848853.3333333334, ans=0.125 2023-10-02 10:34:25,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:34:25,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 10:34:27,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:34:27,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:34:29,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:34:29,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:34:29,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:34:30,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:34:30,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:34:30,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 10:34:33,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:34:34,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:34:36,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:34:38,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 10:34:40,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:34:44,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:34:47,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 10:34:49,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=848986.6666666666, ans=0.2 2023-10-02 10:34:51,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:34:57,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:34:58,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:35:01,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:02,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:35:04,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 10:35:07,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:35:07,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:35:07,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:35:10,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:11,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:35:12,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 10:35:17,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:35:19,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:35:20,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:35:20,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:35:22,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:35:22,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:35:23,938 INFO [train.py:1046] (3/4) Epoch 24, batch 5200, loss[loss=0.1749, simple_loss=0.2442, pruned_loss=0.0528, over 23626.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2488, pruned_loss=0.04739, over 4715101.03 frames. ], batch size: 232, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:35:23,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:35:24,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:35:28,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:35:29,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:35:30,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:35:34,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 10:35:34,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:35:37,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:35:38,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:35:39,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:35:39,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:35:42,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 10:35:45,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:35:45,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:47,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 10:35:50,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:35:50,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:35:50,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=849253.3333333334, ans=0.0 2023-10-02 10:35:51,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 10:35:53,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 10:35:56,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 10:35:58,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:35:58,079 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 10:35:58,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:35:59,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:35:59,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:35:59,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 10:35:59,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:36:02,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:36:02,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=849320.0, ans=0.035 2023-10-02 10:36:03,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=849320.0, ans=0.1 2023-10-02 10:36:05,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 10:36:05,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 10:36:05,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 10:36:06,268 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.20 vs. limit=15.0 2023-10-02 10:36:09,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 10:36:10,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:36:16,807 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.871e+02 2.065e+02 2.333e+02 3.353e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-02 10:36:16,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:36:16,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:36:18,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 10:36:18,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=849386.6666666666, ans=0.0 2023-10-02 10:36:19,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:36:19,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 10:36:19,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:36:20,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:36:21,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=849453.3333333334, ans=0.0 2023-10-02 10:36:22,363 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.46 vs. limit=15.0 2023-10-02 10:36:24,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:36:24,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:36:27,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:36:28,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:36:28,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:36:33,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:36:35,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 10:36:35,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:36:36,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:36:36,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:36:37,892 INFO [train.py:1046] (3/4) Epoch 24, batch 5250, loss[loss=0.164, simple_loss=0.2519, pruned_loss=0.03803, over 24316.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2485, pruned_loss=0.04688, over 4719550.92 frames. ], batch size: 74, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:36:37,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:36:39,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:36:40,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:36:43,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:36:44,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:36:46,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:36:53,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:36:53,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=849586.6666666666, ans=0.04949747468305833 2023-10-02 10:36:54,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:36:54,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=849586.6666666666, ans=0.0 2023-10-02 10:36:54,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=849586.6666666666, ans=0.0 2023-10-02 10:36:55,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:36:57,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:36:59,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 10:37:00,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:37:02,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:37:18,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=849653.3333333334, ans=0.1 2023-10-02 10:37:46,415 INFO [train.py:1046] (3/4) Epoch 24, batch 5300, loss[loss=0.1669, simple_loss=0.2485, pruned_loss=0.04266, over 24535.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2469, pruned_loss=0.04656, over 4713558.98 frames. ], batch size: 63, lr: 4.23e-03, grad_scale: 8.0 2023-10-02 10:37:47,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=849853.3333333334, ans=0.125 2023-10-02 10:37:58,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=849920.0, ans=0.125 2023-10-02 10:38:01,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:38:01,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 10:38:01,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 10:38:01,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:38:01,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:01,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:01,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:01,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:38:01,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:01,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:38:01,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:38:02,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:38:02,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 10:38:02,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 10:38:02,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 10:38:02,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 10:38:02,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 10:38:02,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 10:38:02,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:03,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:38:03,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:38:03,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:38:03,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:38:03,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:38:03,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:38:03,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:03,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:38:03,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:38:03,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:38:03,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:03,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:38:04,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 10:38:04,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:38:04,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:38:04,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 10:38:04,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 10:38:04,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:38:04,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:38:04,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 10:38:05,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 10:38:05,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:38:05,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:38:06,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:38:06,212 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 10:38:06,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 10:38:06,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:38:06,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:38:06,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 10:38:06,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 10:38:06,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 10:38:06,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:38:13,257 INFO [train.py:1046] (3/4) Epoch 25, batch 0, loss[loss=0.1744, simple_loss=0.2621, pruned_loss=0.04337, over 24632.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2621, pruned_loss=0.04337, over 24632.00 frames. ], batch size: 73, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:38:13,258 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 10:38:25,662 INFO [train.py:1078] (3/4) Epoch 25, validation: loss=0.3293, simple_loss=0.2723, pruned_loss=0.1931, over 1125622.00 frames. 2023-10-02 10:38:25,663 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 10:38:28,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 10:38:29,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:38:30,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:38:35,006 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:38:36,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:38:36,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:38:36,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:37,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 10:38:38,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 10:38:40,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:41,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:44,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:38:45,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:38:45,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:38:47,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:38:48,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 10:38:49,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:38:59,699 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.885e+02 2.204e+02 2.603e+02 4.904e+02, threshold=4.408e+02, percent-clipped=3.0 2023-10-02 10:38:59,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:38:59,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:39:01,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 10:39:05,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:39:05,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:39:08,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:39:09,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.14 vs. limit=12.0 2023-10-02 10:39:11,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:39:15,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:39:21,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 10:39:22,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 10:39:22,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:39:22,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:39:23,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:39:23,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:39:26,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 10:39:27,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:39:29,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:39:34,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:39:37,607 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 10:39:38,899 INFO [train.py:1046] (3/4) Epoch 25, batch 50, loss[loss=0.165, simple_loss=0.2403, pruned_loss=0.04489, over 23680.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2499, pruned_loss=0.04802, over 1064254.96 frames. ], batch size: 149, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:39:39,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:39:43,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:39:44,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:39:44,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 10:39:45,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:39:47,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:39:48,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:39:49,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:39:50,313 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:39:51,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:39:53,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=850340.0, ans=0.125 2023-10-02 10:39:55,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 10:39:55,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:40:00,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=850340.0, ans=0.0 2023-10-02 10:40:02,697 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.27 vs. limit=15.0 2023-10-02 10:40:03,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:40:05,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 10:40:07,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 10:40:08,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:40:10,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:40:10,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:40:11,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:40:12,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:40:14,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 10:40:14,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:40:19,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:40:21,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:40:21,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:40:22,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 10:40:25,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:40:26,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:40:26,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 10:40:26,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:40:30,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 10:40:37,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:40:37,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:40:39,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:40:40,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:40:40,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:40:42,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 10:40:43,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 10:40:44,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:40:45,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:40:46,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:40:47,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:40:47,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 10:40:49,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 10:40:50,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 10:40:50,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:40:50,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:40:51,998 INFO [train.py:1046] (3/4) Epoch 25, batch 100, loss[loss=0.1605, simple_loss=0.2355, pruned_loss=0.04275, over 23640.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2511, pruned_loss=0.04801, over 1867071.89 frames. ], batch size: 149, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:40:52,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 10:40:52,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 10:40:53,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:40:53,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:40:54,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:40:54,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:40:57,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:41:00,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:41:04,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:41:05,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 10:41:05,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:41:10,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:41:10,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:41:11,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:41:11,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:41:11,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:41:11,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 10:41:14,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:41:14,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:41:14,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:41:14,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:41:18,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 10:41:20,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:41:21,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:41:23,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:41:24,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:41:25,649 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.842e+02 2.089e+02 2.326e+02 3.490e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 10:41:28,590 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 10:41:28,604 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 10:41:29,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:41:29,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:41:34,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:41:36,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:41:37,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:41:43,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:41:43,869 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 10:41:47,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 10:41:49,429 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=13.04 vs. limit=15.0 2023-10-02 10:41:50,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:41:51,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:41:54,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:41:55,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:41:58,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:41:59,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:42:01,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:42:02,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:02,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:02,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:42:03,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:42:04,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 10:42:04,643 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 10:42:04,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:06,370 INFO [train.py:1046] (3/4) Epoch 25, batch 150, loss[loss=0.198, simple_loss=0.2668, pruned_loss=0.06458, over 23284.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2513, pruned_loss=0.04853, over 2512152.25 frames. ], batch size: 93, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:42:07,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:42:07,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:07,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:07,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 10:42:08,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:42:09,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:42:09,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:09,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:10,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:11,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:42:11,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:42:15,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:18,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:42:18,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:42:18,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:21,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:21,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:23,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:42:23,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:26,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 10:42:26,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 10:42:26,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 10:42:28,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:42:28,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:42:29,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:42:31,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:42:31,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:31,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:31,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=851006.6666666666, ans=0.125 2023-10-02 10:42:33,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:42:33,127 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 10:42:34,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:42:37,779 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 10:42:37,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=851073.3333333334, ans=0.1 2023-10-02 10:42:40,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:42:42,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=851073.3333333334, ans=0.1 2023-10-02 10:42:46,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:42:47,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 10:42:50,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:42:50,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:42:52,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:42:52,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=851140.0, ans=0.125 2023-10-02 10:42:53,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:42:54,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:42:56,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:42:57,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:42:58,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 10:43:02,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:43:03,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:03,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:43:03,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:43:05,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:43:08,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 10:43:09,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:43:12,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:43:13,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:43:15,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:43:16,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 10:43:16,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:43:16,877 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 10:43:20,114 INFO [train.py:1046] (3/4) Epoch 25, batch 200, loss[loss=0.1641, simple_loss=0.2405, pruned_loss=0.04385, over 20380.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2516, pruned_loss=0.04892, over 2993063.71 frames. ], batch size: 44, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:43:20,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:43:22,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.64 vs. limit=22.5 2023-10-02 10:43:22,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:43:22,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:43:25,137 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.14 vs. limit=15.0 2023-10-02 10:43:25,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 10:43:25,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:43:27,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:29,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 10:43:31,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 10:43:33,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:34,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:43:39,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:43:39,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:43:39,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:43:50,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=851406.6666666666, ans=0.1 2023-10-02 10:43:54,484 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.978e+02 2.260e+02 2.565e+02 3.626e+02, threshold=4.520e+02, percent-clipped=0.0 2023-10-02 10:43:56,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:43:57,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:43:57,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 10:43:58,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:43:58,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 10:43:58,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:44:02,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:02,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:44:03,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:44:04,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:44:04,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 10:44:04,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 10:44:04,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:44:06,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=851473.3333333334, ans=0.125 2023-10-02 10:44:10,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:44:15,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:44:24,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:25,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:44:31,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:32,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=851540.0, ans=0.0 2023-10-02 10:44:33,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 10:44:33,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:44:33,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:44:33,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:44:34,608 INFO [train.py:1046] (3/4) Epoch 25, batch 250, loss[loss=0.1733, simple_loss=0.2606, pruned_loss=0.04298, over 24604.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2512, pruned_loss=0.04912, over 3373748.83 frames. ], batch size: 71, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:44:34,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:44:36,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 10:44:37,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:44:37,515 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 10:44:38,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:40,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:44:40,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:42,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:44:43,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:44:44,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:44:46,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:44:48,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:44:49,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=851673.3333333334, ans=0.0 2023-10-02 10:44:49,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=851673.3333333334, ans=0.2 2023-10-02 10:45:01,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:45:04,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:45:04,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:45:09,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:45:09,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:45:11,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:45:11,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:45:13,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:45:14,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:45:15,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:45:17,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:45:19,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 10:45:19,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:45:21,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:45:21,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:45:21,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:45:22,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:45:22,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:45:22,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:45:26,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:45:27,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:45:29,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:45:32,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:45:36,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:45:36,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=851873.3333333334, ans=0.1 2023-10-02 10:45:38,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:45:40,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:45:42,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:45:45,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 10:45:47,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:45:47,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 10:45:48,260 INFO [train.py:1046] (3/4) Epoch 25, batch 300, loss[loss=0.1543, simple_loss=0.2335, pruned_loss=0.03758, over 24623.00 frames. ], tot_loss[loss=0.1719, simple_loss=0.2486, pruned_loss=0.04761, over 3675973.99 frames. ], batch size: 60, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:45:49,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 10:45:49,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 10:45:51,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:45:51,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 10:45:56,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:45:56,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:45:59,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=851940.0, ans=0.0 2023-10-02 10:46:00,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:46:00,940 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.53 vs. limit=15.0 2023-10-02 10:46:02,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 10:46:03,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:46:04,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 10:46:04,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 10:46:04,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:46:08,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:46:12,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:46:13,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 10:46:15,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 10:46:17,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:19,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:46:20,150 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.84 vs. limit=22.5 2023-10-02 10:46:22,344 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.821e+02 1.986e+02 2.165e+02 3.006e+02, threshold=3.972e+02, percent-clipped=0.0 2023-10-02 10:46:22,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:22,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 10:46:22,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:46:25,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:46:27,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:46:29,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:46:32,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 10:46:32,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 10:46:33,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:46:33,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=852140.0, ans=0.025 2023-10-02 10:46:34,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:36,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 10:46:38,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:46:41,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=852140.0, ans=0.05 2023-10-02 10:46:42,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:46:45,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:46:45,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 10:46:49,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:49,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 10:46:50,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:52,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:46:54,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 10:46:54,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:46:54,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:46:56,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 10:46:56,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:46:58,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:46:59,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:46:59,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:46:59,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:03,003 INFO [train.py:1046] (3/4) Epoch 25, batch 350, loss[loss=0.1665, simple_loss=0.2292, pruned_loss=0.05188, over 22834.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2469, pruned_loss=0.04718, over 3914442.05 frames. ], batch size: 322, lr: 4.14e-03, grad_scale: 16.0 2023-10-02 10:47:05,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:47:05,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 10:47:07,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:10,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=852273.3333333334, ans=0.125 2023-10-02 10:47:12,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=852273.3333333334, ans=0.0 2023-10-02 10:47:13,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:47:15,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:16,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:19,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 10:47:20,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:47:20,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 10:47:23,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:25,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 10:47:26,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:47:28,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 10:47:29,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:47:29,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=852340.0, ans=0.125 2023-10-02 10:47:32,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:47:32,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:47:34,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:47:34,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:47:34,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:47:34,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:35,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:47:36,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:47:36,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:40,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=852406.6666666666, ans=0.125 2023-10-02 10:47:44,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:47:44,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:47:46,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:47:46,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:50,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 10:47:50,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:47:53,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:47:53,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:47:54,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:47:56,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 10:47:57,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:47:58,835 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 10:48:00,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 10:48:00,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:03,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:48:03,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 10:48:06,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:09,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 10:48:10,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:12,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:12,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:48:13,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:48:16,573 INFO [train.py:1046] (3/4) Epoch 25, batch 400, loss[loss=0.1784, simple_loss=0.2596, pruned_loss=0.04857, over 24663.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2465, pruned_loss=0.04638, over 4101046.83 frames. ], batch size: 68, lr: 4.14e-03, grad_scale: 32.0 2023-10-02 10:48:16,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:48:18,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 10:48:19,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 10:48:19,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:20,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:48:22,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:48:22,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:25,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:28,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:28,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 10:48:31,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 10:48:31,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:48:32,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 10:48:33,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:36,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:48:36,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:48:36,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 10:48:36,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:48:36,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:48:38,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:48:38,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:48:41,386 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 10:48:41,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 10:48:44,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=852740.0, ans=0.125 2023-10-02 10:48:47,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:48:47,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:48:48,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 10:48:49,992 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.835e+02 2.039e+02 2.384e+02 3.954e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-02 10:48:50,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 10:48:53,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:48:53,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=852740.0, ans=0.125 2023-10-02 10:48:54,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=852740.0, ans=0.0 2023-10-02 10:48:56,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:48:57,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=852740.0, ans=10.0 2023-10-02 10:49:01,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 10:49:04,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 10:49:04,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 10:49:06,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=852806.6666666666, ans=0.09899494936611666 2023-10-02 10:49:08,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:49:09,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=852806.6666666666, ans=0.1 2023-10-02 10:49:10,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:49:10,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 10:49:13,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:49:13,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=852873.3333333334, ans=0.2 2023-10-02 10:49:16,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 10:49:16,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=852873.3333333334, ans=0.125 2023-10-02 10:49:17,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:49:19,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:49:19,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 10:49:21,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 10:49:22,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 10:49:26,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:49:26,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:49:28,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 10:49:29,378 INFO [train.py:1046] (3/4) Epoch 25, batch 450, loss[loss=0.1552, simple_loss=0.2293, pruned_loss=0.04053, over 23339.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2468, pruned_loss=0.04624, over 4236499.43 frames. ], batch size: 119, lr: 4.14e-03, grad_scale: 32.0 2023-10-02 10:49:30,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:49:30,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:49:30,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 10:49:32,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 10:49:33,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:49:35,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:49:35,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=852940.0, ans=0.0 2023-10-02 10:49:35,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=852940.0, ans=0.09899494936611666 2023-10-02 10:49:36,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:49:36,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 10:49:36,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:49:38,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 10:49:40,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:49:46,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=853006.6666666666, ans=0.125 2023-10-02 10:49:47,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:49:47,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=853006.6666666666, ans=0.0 2023-10-02 10:49:48,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:49:49,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 10:49:50,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 10:49:54,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:49:57,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:49:59,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:50:02,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:50:04,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:50:06,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 10:50:08,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 10:50:10,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 10:50:10,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:50:11,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:50:11,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:50:12,933 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 10:50:14,604 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 10:50:14,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:50:15,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:50:16,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 10:50:20,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:50:21,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 10:50:22,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 10:50:23,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 10:50:23,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:50:25,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 10:50:26,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:50:28,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 10:50:30,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=853206.6666666666, ans=0.0 2023-10-02 10:50:31,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 10:50:31,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 10:50:33,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 10:50:33,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 10:50:37,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:50:39,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:50:40,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:50:40,660 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 10:50:41,956 INFO [train.py:1046] (3/4) Epoch 25, batch 500, loss[loss=0.1597, simple_loss=0.2377, pruned_loss=0.04085, over 14591.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2475, pruned_loss=0.04619, over 4336869.40 frames. ], batch size: 31, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:50:44,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=853273.3333333334, ans=0.0 2023-10-02 10:50:45,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:50:47,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:50:47,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:50:47,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 10:50:49,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 10:50:49,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:50:50,845 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.87 vs. limit=15.0 2023-10-02 10:50:51,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 10:50:58,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 10:51:00,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 10:51:01,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:51:01,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:51:03,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:04,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=853340.0, ans=0.1 2023-10-02 10:51:13,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:51:13,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 10:51:14,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 10:51:14,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:51:14,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 10:51:14,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 10:51:18,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=853406.6666666666, ans=0.1 2023-10-02 10:51:19,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:51:19,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:51:19,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 10:51:20,408 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.879e+02 2.028e+02 2.264e+02 3.460e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-02 10:51:21,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:51:21,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 10:51:25,026 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 10:51:26,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:51:28,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:29,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:29,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:30,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 10:51:32,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 10:51:33,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=853473.3333333334, ans=0.125 2023-10-02 10:51:36,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 10:51:36,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:51:36,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=853473.3333333334, ans=0.1 2023-10-02 10:51:40,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:51:43,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:51:48,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:51:53,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 10:51:54,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:51:54,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:51:56,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 10:51:57,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 10:51:57,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:51:58,891 INFO [train.py:1046] (3/4) Epoch 25, batch 550, loss[loss=0.1705, simple_loss=0.2579, pruned_loss=0.04152, over 24446.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2479, pruned_loss=0.04633, over 4414987.24 frames. ], batch size: 69, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:52:03,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 10:52:06,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 10:52:06,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:52:07,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 10:52:07,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:52:07,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:52:08,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:08,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:09,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:52:10,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:52:13,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:52:13,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 10:52:13,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:52:19,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:19,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:20,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:52:22,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:25,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 10:52:26,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 10:52:28,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:52:34,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:52:34,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:52:34,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:52:37,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:37,477 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 10:52:38,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:52:40,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 10:52:42,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 10:52:42,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 10:52:42,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:52:44,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:45,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 10:52:47,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 10:52:48,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:52:48,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:52:50,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:52:50,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:52:53,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:52:53,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 10:52:56,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:52:57,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:52:59,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 10:52:59,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:53:01,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:01,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:53:03,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:53:04,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 10:53:04,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 10:53:09,321 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.32 vs. limit=15.0 2023-10-02 10:53:09,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 10:53:12,552 INFO [train.py:1046] (3/4) Epoch 25, batch 600, loss[loss=0.168, simple_loss=0.2548, pruned_loss=0.04066, over 24430.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2487, pruned_loss=0.04674, over 4466722.02 frames. ], batch size: 69, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:53:12,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=853940.0, ans=0.125 2023-10-02 10:53:13,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 10:53:15,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:53:15,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:53:15,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:21,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:53:23,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 10:53:25,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 10:53:28,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 10:53:29,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:53:31,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:53:33,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 10:53:33,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:53:39,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 10:53:42,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:53:42,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:53:42,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:53:42,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=854073.3333333334, ans=0.125 2023-10-02 10:53:48,071 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.854e+02 2.038e+02 2.245e+02 2.978e+02, threshold=4.076e+02, percent-clipped=0.0 2023-10-02 10:53:48,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:53:48,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:53:49,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:52,005 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.07 vs. limit=15.0 2023-10-02 10:53:56,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:53:56,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=854140.0, ans=0.125 2023-10-02 10:53:59,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:53:59,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:53:59,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:54:06,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 10:54:12,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 10:54:12,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:54:15,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 10:54:15,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 10:54:17,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 10:54:19,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:54:19,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:54:25,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 10:54:27,167 INFO [train.py:1046] (3/4) Epoch 25, batch 650, loss[loss=0.1675, simple_loss=0.2184, pruned_loss=0.05833, over 19384.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2479, pruned_loss=0.04655, over 4529962.87 frames. ], batch size: 388, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:54:27,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 10:54:29,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:54:31,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:54:32,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:54:35,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 10:54:36,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:54:42,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 10:54:42,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:54:45,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:54:46,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=854340.0, ans=0.2 2023-10-02 10:54:47,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 10:54:51,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:54:51,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:54:55,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=854406.6666666666, ans=0.1 2023-10-02 10:54:56,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:54:56,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 10:54:59,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:54:59,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:00,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:55:01,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:01,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 10:55:02,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=854406.6666666666, ans=0.07 2023-10-02 10:55:04,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 10:55:04,679 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 10:55:05,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:55:05,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:55:07,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:08,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:55:10,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:55:10,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 10:55:11,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 10:55:11,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 10:55:11,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 10:55:13,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=854473.3333333334, ans=0.125 2023-10-02 10:55:14,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 10:55:14,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:55:15,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 10:55:15,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 10:55:19,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 10:55:19,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:19,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:55:19,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:55:20,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:55:20,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:55:20,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=854473.3333333334, ans=0.07 2023-10-02 10:55:27,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:27,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:55:30,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:55:32,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:55:32,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 10:55:34,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:55:38,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 10:55:38,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:55:39,747 INFO [train.py:1046] (3/4) Epoch 25, batch 700, loss[loss=0.1632, simple_loss=0.2507, pruned_loss=0.03788, over 24625.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2468, pruned_loss=0.04602, over 4579150.22 frames. ], batch size: 68, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:55:39,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:55:39,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:55:45,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 10:55:46,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 10:55:47,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 10:55:47,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:55:49,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:55:52,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 10:55:53,534 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.44 vs. limit=15.0 2023-10-02 10:55:56,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:56:01,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:56:01,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:56:02,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 10:56:02,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=854673.3333333334, ans=0.1 2023-10-02 10:56:03,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:56:05,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:56:07,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 10:56:07,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 10:56:09,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 10:56:12,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 10:56:14,702 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.821e+02 2.032e+02 2.235e+02 2.949e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-02 10:56:16,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 10:56:16,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:56:19,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 10:56:22,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:56:23,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 10:56:27,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:56:29,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:56:29,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 10:56:32,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:56:33,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:56:35,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:56:40,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 10:56:40,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 10:56:43,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 10:56:43,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=854873.3333333334, ans=0.125 2023-10-02 10:56:44,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.59 vs. limit=6.0 2023-10-02 10:56:44,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 10:56:46,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:56:48,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:56:48,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:56:50,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:56:50,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 10:56:53,754 INFO [train.py:1046] (3/4) Epoch 25, batch 750, loss[loss=0.1657, simple_loss=0.2446, pruned_loss=0.04343, over 23333.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2459, pruned_loss=0.04602, over 4604783.82 frames. ], batch size: 119, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:56:55,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 10:56:55,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 10:56:56,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=854940.0, ans=0.1 2023-10-02 10:56:56,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 10:56:58,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 10:56:58,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 10:56:59,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:56:59,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 10:57:01,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:57:01,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:57:03,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:57:05,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:57:06,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 10:57:06,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:57:09,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:57:09,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 10:57:10,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:57:13,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:57:13,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:57:13,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 10:57:14,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 10:57:15,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=855006.6666666666, ans=0.125 2023-10-02 10:57:16,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:57:17,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 10:57:20,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 10:57:22,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 10:57:22,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:57:24,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=855073.3333333334, ans=0.04949747468305833 2023-10-02 10:57:25,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 10:57:25,869 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 10:57:27,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 10:57:27,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 10:57:27,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 10:57:28,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=855073.3333333334, ans=0.125 2023-10-02 10:57:29,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 10:57:35,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 10:57:35,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:57:35,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:57:38,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:57:38,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:57:39,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 10:57:39,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 10:57:42,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 10:57:43,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 10:57:45,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:57:45,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 10:57:46,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:57:52,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:57:53,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 10:57:53,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:57:55,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:58:00,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 10:58:00,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:58:00,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:58:03,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:58:04,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:58:06,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:58:07,660 INFO [train.py:1046] (3/4) Epoch 25, batch 800, loss[loss=0.1602, simple_loss=0.2409, pruned_loss=0.03975, over 24488.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2465, pruned_loss=0.04637, over 4631873.77 frames. ], batch size: 63, lr: 4.13e-03, grad_scale: 32.0 2023-10-02 10:58:07,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 10:58:14,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:58:14,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:14,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=855273.3333333334, ans=0.1 2023-10-02 10:58:17,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:58:17,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:58:18,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:18,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:18,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:23,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:58:25,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:58:26,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 10:58:28,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:29,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:58:29,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:58:29,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:58:30,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 10:58:30,727 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.01 vs. limit=10.0 2023-10-02 10:58:31,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:58:31,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 10:58:34,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:36,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:58:37,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 10:58:37,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:58:40,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:41,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:58:44,118 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.429e+02 1.817e+02 2.016e+02 2.246e+02 3.441e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 10:58:44,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:58:44,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 10:58:44,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 10:58:47,021 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 10:58:47,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 10:58:47,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 10:58:47,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:58:47,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=855406.6666666666, ans=0.0 2023-10-02 10:58:49,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:58:49,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:58:51,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=855473.3333333334, ans=0.1 2023-10-02 10:58:53,278 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 10:58:53,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 10:58:54,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 10:58:56,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 10:59:00,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 10:59:06,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:59:06,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 10:59:06,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 10:59:10,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 10:59:12,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=855540.0, ans=0.125 2023-10-02 10:59:12,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=855540.0, ans=0.1 2023-10-02 10:59:17,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:59:20,379 INFO [train.py:1046] (3/4) Epoch 25, batch 850, loss[loss=0.1784, simple_loss=0.2668, pruned_loss=0.04504, over 24318.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2478, pruned_loss=0.04703, over 4627161.80 frames. ], batch size: 74, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 10:59:21,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 10:59:21,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 10:59:23,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 10:59:23,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:59:24,017 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.77 vs. limit=15.0 2023-10-02 10:59:24,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 10:59:24,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:59:25,206 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.80 vs. limit=6.0 2023-10-02 10:59:26,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 10:59:28,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:29,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 10:59:31,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 10:59:32,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 10:59:32,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 10:59:32,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 10:59:34,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 10:59:34,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 10:59:36,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:36,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 10:59:36,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 10:59:40,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:59:40,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 10:59:41,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 10:59:44,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 10:59:48,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 10:59:48,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 10:59:51,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 10:59:52,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 10:59:54,051 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 10:59:54,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 10:59:54,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 10:59:54,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 10:59:57,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:59,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 10:59:59,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 11:00:01,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:00:01,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:00:02,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:00:02,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:00:05,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:00:06,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 11:00:08,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 11:00:12,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:00:12,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:00:12,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=855806.6666666666, ans=0.125 2023-10-02 11:00:13,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:00:13,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:00:15,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:00:15,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=855806.6666666666, ans=0.2 2023-10-02 11:00:15,772 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.44 vs. limit=15.0 2023-10-02 11:00:17,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:00:19,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:00:20,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:00:20,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:00:20,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:00:27,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=855873.3333333334, ans=0.125 2023-10-02 11:00:28,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:00:28,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:00:30,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 11:00:30,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:00:32,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:00:33,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 11:00:34,960 INFO [train.py:1046] (3/4) Epoch 25, batch 900, loss[loss=0.1896, simple_loss=0.2557, pruned_loss=0.06173, over 23476.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2484, pruned_loss=0.04746, over 4649074.10 frames. ], batch size: 256, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:00:40,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:00:41,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:00:43,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 11:00:44,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=855940.0, ans=0.2 2023-10-02 11:00:46,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:00:46,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 11:00:47,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 11:00:47,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:00:47,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:00:47,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=856006.6666666666, ans=0.125 2023-10-02 11:00:48,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:00:49,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:00:57,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:00:57,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:00:58,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:01:01,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:01:05,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 11:01:07,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:01:10,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:01:11,292 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.867e+02 2.045e+02 2.308e+02 5.112e+02, threshold=4.090e+02, percent-clipped=1.0 2023-10-02 11:01:11,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:01:11,474 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 11:01:12,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 11:01:18,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:01:18,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:01:18,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:01:23,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=856140.0, ans=0.1 2023-10-02 11:01:25,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:01:25,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:01:27,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 11:01:27,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:01:29,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 11:01:32,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=856206.6666666666, ans=0.125 2023-10-02 11:01:33,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:01:33,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:01:34,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:01:34,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:01:39,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 11:01:39,129 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 11:01:40,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 11:01:40,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 11:01:41,321 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.72 vs. limit=6.0 2023-10-02 11:01:42,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:01:44,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 11:01:47,443 INFO [train.py:1046] (3/4) Epoch 25, batch 950, loss[loss=0.1554, simple_loss=0.2434, pruned_loss=0.03368, over 24301.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2483, pruned_loss=0.04721, over 4670868.17 frames. ], batch size: 74, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:01:48,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:01:52,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:01:53,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:01:53,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:01:56,256 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 11:02:01,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:02:03,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:02:03,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:02:04,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:02:04,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 11:02:05,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:02:07,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:08,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 11:02:08,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:02:14,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:14,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:02:15,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:02:16,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 11:02:18,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 11:02:19,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:02:21,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:02:26,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:02:26,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:02:31,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 11:02:31,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 11:02:31,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:02:31,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:02:33,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:33,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:02:37,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 11:02:38,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:02:41,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=856473.3333333334, ans=0.1 2023-10-02 11:02:42,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:02:42,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:02:42,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 11:02:44,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:02:44,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:02:44,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 11:02:47,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:02:50,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:02:54,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:02:55,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 11:02:55,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 11:03:00,018 INFO [train.py:1046] (3/4) Epoch 25, batch 1000, loss[loss=0.1736, simple_loss=0.2618, pruned_loss=0.04269, over 24092.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2476, pruned_loss=0.04679, over 4683343.68 frames. ], batch size: 80, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:03:00,850 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.37 vs. limit=6.0 2023-10-02 11:03:01,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:03:06,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 11:03:06,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:06,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=856606.6666666666, ans=0.0 2023-10-02 11:03:07,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=856606.6666666666, ans=0.2 2023-10-02 11:03:10,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:03:10,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 11:03:10,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 11:03:13,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:03:13,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:03:13,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=856673.3333333334, ans=0.0 2023-10-02 11:03:16,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:03:17,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 11:03:21,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 11:03:22,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=856673.3333333334, ans=0.125 2023-10-02 11:03:23,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 11:03:24,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:03:24,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 11:03:24,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=856673.3333333334, ans=0.0 2023-10-02 11:03:26,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=856673.3333333334, ans=0.125 2023-10-02 11:03:28,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 11:03:28,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 11:03:29,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:03:29,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:37,777 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.816e+02 2.002e+02 2.184e+02 3.024e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-02 11:03:37,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:03:39,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:03:39,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:40,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:03:40,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 11:03:40,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:03:42,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:03:42,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:03:43,441 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 11:03:46,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 11:03:46,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 11:03:47,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 11:03:50,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:03:54,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:54,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:03:56,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:03:57,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:04:00,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 11:04:02,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:04:02,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 11:04:03,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 11:04:05,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:04:05,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:04:05,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=856873.3333333334, ans=0.125 2023-10-02 11:04:06,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:04:10,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:04:11,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:04:11,935 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.19 vs. limit=15.0 2023-10-02 11:04:13,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=856940.0, ans=0.0 2023-10-02 11:04:13,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=856940.0, ans=0.125 2023-10-02 11:04:14,016 INFO [train.py:1046] (3/4) Epoch 25, batch 1050, loss[loss=0.1703, simple_loss=0.2371, pruned_loss=0.05176, over 23504.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2463, pruned_loss=0.0467, over 4687794.88 frames. ], batch size: 256, lr: 4.13e-03, grad_scale: 16.0 2023-10-02 11:04:15,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:04:17,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:04:19,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 11:04:19,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:04:21,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:04:23,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:04:25,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:04:28,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:04:28,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:04:28,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:04:29,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:04:30,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 11:04:31,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:04:32,413 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.68 vs. limit=15.0 2023-10-02 11:04:33,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 11:04:35,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:04:35,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 11:04:35,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:04:40,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:04:42,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:04:42,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:04:45,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 11:04:45,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 11:04:46,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:04:48,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 11:04:51,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 11:04:51,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:04:54,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 11:04:55,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:04:55,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:04:56,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:05:02,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:05:04,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=857140.0, ans=0.1 2023-10-02 11:05:05,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=857140.0, ans=0.0 2023-10-02 11:05:06,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 11:05:07,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 11:05:09,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 11:05:09,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:05:09,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:05:11,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 11:05:15,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:05:16,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:05:16,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:05:16,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:05:18,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:05:20,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:05:20,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 11:05:22,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:05:22,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 11:05:22,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 11:05:23,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:05:26,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:05:28,140 INFO [train.py:1046] (3/4) Epoch 25, batch 1100, loss[loss=0.1564, simple_loss=0.2358, pruned_loss=0.03852, over 17575.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2454, pruned_loss=0.04678, over 4684180.16 frames. ], batch size: 38, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:05:32,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:05:36,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:05:38,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:05:38,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:05:39,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 11:05:41,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:05:44,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 11:05:47,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:05:50,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:05:50,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 11:05:51,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:05:53,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:05:53,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:05:55,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:05:57,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:06:01,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:06:04,995 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.795e+02 1.916e+02 2.111e+02 3.206e+02, threshold=3.831e+02, percent-clipped=0.0 2023-10-02 11:06:05,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 11:06:05,190 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 11:06:07,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:08,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:09,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:06:09,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:06:11,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 11:06:11,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:06:11,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=857473.3333333334, ans=0.125 2023-10-02 11:06:12,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:06:12,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:06:12,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:12,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 11:06:18,899 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:06:20,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:06:21,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 11:06:22,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:06:28,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:06:30,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 11:06:31,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:06:32,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:06:35,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:06:37,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:06:38,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 11:06:38,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:06:38,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:06:40,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 11:06:40,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:06:41,864 INFO [train.py:1046] (3/4) Epoch 25, batch 1150, loss[loss=0.1763, simple_loss=0.2553, pruned_loss=0.04862, over 23677.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2456, pruned_loss=0.04656, over 4694876.60 frames. ], batch size: 85, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:06:41,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 11:06:42,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:06:43,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:06:43,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=857606.6666666666, ans=0.125 2023-10-02 11:06:44,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:06:47,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=857606.6666666666, ans=0.125 2023-10-02 11:06:48,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:06:50,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:06:53,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:06:53,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:06:53,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 11:06:54,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:06:56,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 11:06:57,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:06:57,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:07:03,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 11:07:06,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:07:09,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:07:09,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:07:09,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 11:07:09,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:07:09,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:07:15,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 11:07:16,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:07:17,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:07:27,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:07:28,295 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.28 vs. limit=10.0 2023-10-02 11:07:35,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:07:35,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 11:07:35,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:07:36,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:07:41,109 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 11:07:44,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:07:47,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=857873.3333333334, ans=0.125 2023-10-02 11:07:49,919 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 11:07:54,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:07:55,394 INFO [train.py:1046] (3/4) Epoch 25, batch 1200, loss[loss=0.1681, simple_loss=0.2496, pruned_loss=0.04334, over 24310.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2472, pruned_loss=0.04683, over 4703896.49 frames. ], batch size: 61, lr: 4.12e-03, grad_scale: 32.0 2023-10-02 11:07:55,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:07:55,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:07:56,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:08:00,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:08:04,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:08:04,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:08:06,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:08:06,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:08:06,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:08:07,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:08:10,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:08:12,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:08:12,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:08:15,376 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 11:08:16,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 11:08:19,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:08:21,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:08:22,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:08:26,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:08:26,441 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 11:08:27,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:08:28,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=858073.3333333334, ans=0.0 2023-10-02 11:08:31,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=858073.3333333334, ans=0.125 2023-10-02 11:08:32,518 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.868e+02 2.056e+02 2.360e+02 3.745e+02, threshold=4.112e+02, percent-clipped=0.0 2023-10-02 11:08:36,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:08:36,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:08:36,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 11:08:36,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=858073.3333333334, ans=0.0 2023-10-02 11:08:37,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:08:40,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 11:08:45,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 11:08:45,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:08:46,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:08:46,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:08:48,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:08:48,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:08:49,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:08:49,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:08:50,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 11:08:52,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:08:52,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:08:52,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:08:55,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:08:55,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:08:57,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:08:59,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:09:01,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=858206.6666666666, ans=0.1 2023-10-02 11:09:03,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 11:09:06,848 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 11:09:08,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:09:09,461 INFO [train.py:1046] (3/4) Epoch 25, batch 1250, loss[loss=0.1749, simple_loss=0.2597, pruned_loss=0.04502, over 24689.00 frames. ], tot_loss[loss=0.1705, simple_loss=0.2475, pruned_loss=0.04671, over 4708211.12 frames. ], batch size: 73, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:09:10,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:09:11,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:09:12,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:09:16,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 11:09:20,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:09:21,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:09:21,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 11:09:24,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:09:25,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:09:28,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 11:09:30,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:09:31,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:09:31,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:09:34,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:09:38,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:09:39,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:09:39,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:09:40,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:09:40,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:09:44,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:09:45,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:09:48,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 11:09:50,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:09:53,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:09:54,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 11:09:54,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=858473.3333333334, ans=0.05 2023-10-02 11:09:55,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:09:55,734 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 11:09:55,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:09:55,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:09:58,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:10:01,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:10:02,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:10:04,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 11:10:04,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 11:10:05,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 11:10:08,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:10:10,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 11:10:10,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:10:12,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 11:10:13,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:10:15,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 11:10:15,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:10:16,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:10:16,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:10:16,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:10:17,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 11:10:21,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:10:22,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:10:23,958 INFO [train.py:1046] (3/4) Epoch 25, batch 1300, loss[loss=0.1588, simple_loss=0.2288, pruned_loss=0.04435, over 23312.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2485, pruned_loss=0.0471, over 4711728.29 frames. ], batch size: 285, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:10:24,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:10:25,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:10:28,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:10:29,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 11:10:32,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:10:35,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:10:37,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:10:39,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:10:39,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=858673.3333333334, ans=0.2 2023-10-02 11:10:40,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:10:40,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 11:10:44,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:10:44,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:10:46,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 11:10:51,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:10:54,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:10:55,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:10:56,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:10:58,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:10:58,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:10:59,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:10:59,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 11:11:03,928 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.851e+02 2.097e+02 2.493e+02 3.634e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-02 11:11:04,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=858740.0, ans=0.125 2023-10-02 11:11:05,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:11:05,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:11:07,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 11:11:07,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:11:07,615 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.58 vs. limit=15.0 2023-10-02 11:11:09,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:11:11,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:11:13,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 11:11:14,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:11:14,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 11:11:15,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:11:19,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:11:20,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:11:21,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 11:11:22,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 11:11:22,344 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:11:23,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 11:11:27,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:11:30,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 11:11:31,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:11:37,872 INFO [train.py:1046] (3/4) Epoch 25, batch 1350, loss[loss=0.1568, simple_loss=0.2218, pruned_loss=0.04587, over 23576.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.247, pruned_loss=0.04696, over 4708881.09 frames. ], batch size: 256, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:11:40,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 11:11:42,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:11:43,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:11:48,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:11:48,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:11:51,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:11:51,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:11:55,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:11:56,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 11:11:58,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:11:59,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:12:02,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 11:12:02,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:12:03,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:12:03,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 11:12:03,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=859006.6666666666, ans=0.0 2023-10-02 11:12:04,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 11:12:08,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 11:12:08,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:12:10,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 11:12:17,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=859073.3333333334, ans=0.07 2023-10-02 11:12:19,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:12:26,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:12:26,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:12:26,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 11:12:30,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:12:30,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 11:12:30,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:12:30,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:12:33,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:12:37,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 11:12:38,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:12:43,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 11:12:44,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 11:12:46,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=859206.6666666666, ans=0.1 2023-10-02 11:12:51,749 INFO [train.py:1046] (3/4) Epoch 25, batch 1400, loss[loss=0.1634, simple_loss=0.239, pruned_loss=0.04389, over 24311.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2447, pruned_loss=0.04617, over 4705374.35 frames. ], batch size: 61, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:12:53,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 11:12:54,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:12:57,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:12:57,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:13:00,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 11:13:01,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 11:13:08,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=859340.0, ans=0.125 2023-10-02 11:13:10,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:13:12,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:13:13,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:13:15,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:13:19,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:13:19,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 11:13:29,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:13:30,873 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.837e+02 2.107e+02 2.511e+02 3.281e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-02 11:13:30,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:13:35,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 11:13:37,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:13:37,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=859473.3333333334, ans=0.125 2023-10-02 11:13:38,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:13:38,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:13:40,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:13:40,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:13:41,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:13:41,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:13:43,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 11:13:43,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:13:46,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=859473.3333333334, ans=0.0 2023-10-02 11:13:47,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:13:50,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:13:58,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 11:13:59,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 11:14:00,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:14:01,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=859540.0, ans=0.1 2023-10-02 11:14:02,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 11:14:02,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:14:04,956 INFO [train.py:1046] (3/4) Epoch 25, batch 1450, loss[loss=0.1656, simple_loss=0.2508, pruned_loss=0.04017, over 24535.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2446, pruned_loss=0.04573, over 4708786.10 frames. ], batch size: 66, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:14:05,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:14:08,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:14:11,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:14:11,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:11,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 11:14:14,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:14:14,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:14:17,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:14:17,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 11:14:17,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:14:19,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 11:14:19,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:20,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:20,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 11:14:22,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:14:22,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:14:24,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 11:14:25,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:25,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:14:26,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:30,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:32,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:14:32,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:14:35,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:14:36,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:37,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:14:38,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:14:39,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:14:39,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:14:44,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 11:14:45,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:14:48,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=859806.6666666666, ans=0.1 2023-10-02 11:14:49,369 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 11:14:51,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:14:51,939 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.83 vs. limit=15.0 2023-10-02 11:14:52,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:14:52,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:14:54,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 11:14:58,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:00,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 11:15:01,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 11:15:03,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:15:07,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:15:07,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:15:10,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 11:15:12,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 11:15:12,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 11:15:13,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:14,491 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.04 vs. limit=22.5 2023-10-02 11:15:15,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:15:19,145 INFO [train.py:1046] (3/4) Epoch 25, batch 1500, loss[loss=0.18, simple_loss=0.2618, pruned_loss=0.04911, over 23866.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2454, pruned_loss=0.04555, over 4726903.52 frames. ], batch size: 86, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:15:25,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 11:15:25,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:15:25,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:15:25,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:15:27,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:15:28,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:15:30,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 11:15:32,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:15:32,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:15:32,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:15:33,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:15:35,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:15:35,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:15:41,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:15:41,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 11:15:41,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:15:42,532 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.53 vs. limit=15.0 2023-10-02 11:15:43,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:15:44,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:47,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 11:15:50,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 11:15:51,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:15:53,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 11:15:56,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:15:58,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:15:59,380 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.797e+02 1.991e+02 2.155e+02 3.181e+02, threshold=3.982e+02, percent-clipped=0.0 2023-10-02 11:15:59,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:15:59,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:16:00,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 11:16:00,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:16:02,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:16:02,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 11:16:03,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:16:05,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=860140.0, ans=0.0 2023-10-02 11:16:07,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:16:07,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 11:16:14,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:16:16,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:16:20,875 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 11:16:20,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:20,934 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 11:16:22,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=860206.6666666666, ans=0.0 2023-10-02 11:16:23,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:16:23,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:16:23,762 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 11:16:25,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:16:29,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 11:16:30,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:32,939 INFO [train.py:1046] (3/4) Epoch 25, batch 1550, loss[loss=0.169, simple_loss=0.2438, pruned_loss=0.04708, over 23855.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2458, pruned_loss=0.04611, over 4724141.14 frames. ], batch size: 212, lr: 4.12e-03, grad_scale: 8.0 2023-10-02 11:16:34,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:16:34,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:34,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:16:34,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:16:34,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=860273.3333333334, ans=0.0 2023-10-02 11:16:35,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:16:37,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=860273.3333333334, ans=0.125 2023-10-02 11:16:38,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 11:16:38,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 11:16:38,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:16:40,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 11:16:40,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 11:16:42,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:16:44,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:16:44,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:16:44,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=860273.3333333334, ans=0.125 2023-10-02 11:16:45,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:16:45,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:16:47,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:16:50,776 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 11:16:50,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:16:50,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:16:52,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:16:53,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:16:53,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 11:16:55,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:16:56,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 11:16:57,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 11:16:58,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 11:16:58,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:16:58,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:17:01,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:17:01,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=860406.6666666666, ans=0.1 2023-10-02 11:17:03,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 11:17:03,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 11:17:04,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=860406.6666666666, ans=0.125 2023-10-02 11:17:11,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:17:11,922 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.57 vs. limit=15.0 2023-10-02 11:17:15,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:17:15,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:17:15,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:17:15,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=860473.3333333334, ans=0.125 2023-10-02 11:17:16,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 11:17:22,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:17:22,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:17:26,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:17:27,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=860473.3333333334, ans=0.125 2023-10-02 11:17:27,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=860473.3333333334, ans=0.125 2023-10-02 11:17:29,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:17:29,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:17:29,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 11:17:29,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=860473.3333333334, ans=0.0 2023-10-02 11:17:30,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:17:32,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:17:32,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:17:32,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 11:17:32,310 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 11:17:35,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:17:36,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=860540.0, ans=0.1 2023-10-02 11:17:40,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 11:17:43,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:17:45,900 INFO [train.py:1046] (3/4) Epoch 25, batch 1600, loss[loss=0.1602, simple_loss=0.2494, pruned_loss=0.03552, over 24675.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2467, pruned_loss=0.0466, over 4718704.99 frames. ], batch size: 73, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:17:45,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:17:46,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 11:17:47,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:17:49,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:17:49,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:17:49,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:17:51,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:17:53,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:17:55,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 11:17:55,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 11:17:58,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 11:18:00,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:18:01,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 11:18:03,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:18:05,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:18:08,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:18:11,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 11:18:14,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:18:14,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 11:18:14,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=860740.0, ans=0.1 2023-10-02 11:18:15,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:15,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 11:18:17,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=860740.0, ans=0.125 2023-10-02 11:18:20,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 11:18:26,762 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.917e+02 2.071e+02 2.381e+02 2.970e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-02 11:18:28,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:18:28,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 11:18:28,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:18:28,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:18:28,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:18:33,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 11:18:34,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=860806.6666666666, ans=0.125 2023-10-02 11:18:37,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 11:18:39,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:18:39,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:39,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:40,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=860806.6666666666, ans=0.0 2023-10-02 11:18:41,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:18:42,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:18:44,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:18:44,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:18:50,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:18:50,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=860873.3333333334, ans=0.125 2023-10-02 11:18:51,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:18:53,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=860873.3333333334, ans=0.2 2023-10-02 11:18:54,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 11:18:54,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:18:57,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 11:19:00,264 INFO [train.py:1046] (3/4) Epoch 25, batch 1650, loss[loss=0.1555, simple_loss=0.2385, pruned_loss=0.03627, over 24470.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2474, pruned_loss=0.04725, over 4700136.73 frames. ], batch size: 63, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:19:00,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:19:02,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:19:03,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:19:03,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 11:19:03,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 11:19:03,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 11:19:03,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 11:19:05,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=860940.0, ans=0.0 2023-10-02 11:19:07,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:19:07,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:19:07,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:19:07,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:19:10,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:19:10,925 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:19:13,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 11:19:14,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:19:14,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:19:14,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:19:16,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:19:16,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 11:19:17,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 11:19:25,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:19:27,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:19:33,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 11:19:33,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:19:33,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=861073.3333333334, ans=0.125 2023-10-02 11:19:36,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 11:19:38,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:19:41,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:19:41,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=861073.3333333334, ans=0.0 2023-10-02 11:19:41,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=861073.3333333334, ans=0.125 2023-10-02 11:19:42,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:19:42,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:19:45,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:19:45,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:19:48,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:19:49,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:19:49,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:19:49,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:19:51,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:19:51,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:19:55,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:19:57,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 11:19:58,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:19:59,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 11:20:00,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 11:20:00,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 11:20:01,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:20:01,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:20:01,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:20:01,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:20:01,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 11:20:04,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=861206.6666666666, ans=0.1 2023-10-02 11:20:06,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:20:07,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:20:07,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:20:10,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 11:20:14,111 INFO [train.py:1046] (3/4) Epoch 25, batch 1700, loss[loss=0.1608, simple_loss=0.2449, pruned_loss=0.03834, over 24627.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2474, pruned_loss=0.04718, over 4707046.09 frames. ], batch size: 65, lr: 4.12e-03, grad_scale: 16.0 2023-10-02 11:20:15,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:20:15,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:20:15,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 11:20:16,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:20:16,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:20:16,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:20:18,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:20:18,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:20:18,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 11:20:21,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:20:29,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:20:32,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:20:33,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=861340.0, ans=0.125 2023-10-02 11:20:34,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=861340.0, ans=0.0 2023-10-02 11:20:35,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=861340.0, ans=0.0 2023-10-02 11:20:38,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:20:38,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:20:38,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=861340.0, ans=0.125 2023-10-02 11:20:39,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:20:39,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:20:41,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 11:20:41,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=861340.0, ans=0.125 2023-10-02 11:20:43,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:20:43,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:20:45,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:20:46,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:20:47,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 11:20:47,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 11:20:49,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:20:51,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 11:20:51,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:20:55,051 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.911e+02 2.075e+02 2.352e+02 2.964e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-02 11:20:59,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:20:59,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=861473.3333333334, ans=0.0 2023-10-02 11:21:01,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:01,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:21:02,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:21:02,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 11:21:02,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:21:05,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:21:05,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 11:21:06,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:21:06,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:21:06,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:21:06,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:21:09,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:21:09,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:21:10,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:11,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:21:12,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:21:15,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:21:15,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 11:21:16,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=861540.0, ans=0.2 2023-10-02 11:21:18,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:21:19,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:21:22,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 11:21:26,085 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=15.0 2023-10-02 11:21:26,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.63 vs. limit=15.0 2023-10-02 11:21:28,796 INFO [train.py:1046] (3/4) Epoch 25, batch 1750, loss[loss=0.169, simple_loss=0.2537, pruned_loss=0.0422, over 24050.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.245, pruned_loss=0.04668, over 4668928.51 frames. ], batch size: 80, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:21:28,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:30,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:21:32,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:21:32,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 11:21:32,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:21:35,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:21:35,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:21:39,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 11:21:40,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:21:43,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 11:21:43,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:21:44,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:21:44,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=861673.3333333334, ans=0.125 2023-10-02 11:21:48,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 11:21:49,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 11:21:50,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:21:50,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 11:21:53,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=861673.3333333334, ans=0.125 2023-10-02 11:21:56,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=861673.3333333334, ans=0.0 2023-10-02 11:21:59,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:21:59,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=861740.0, ans=0.0 2023-10-02 11:22:02,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:22:02,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:22:06,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:08,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:22:09,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:22:10,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:13,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:22:13,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:22:14,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 11:22:16,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:22:19,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 11:22:21,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:22:23,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:22:23,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:22:25,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:22:25,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 11:22:27,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:27,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=861873.3333333334, ans=0.1 2023-10-02 11:22:29,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:22:29,773 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.40 vs. limit=15.0 2023-10-02 11:22:33,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:22:35,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:22:36,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:22:39,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 11:22:39,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:22:40,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:22:40,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:22:40,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:22:40,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:22:41,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:22:42,938 INFO [train.py:1046] (3/4) Epoch 25, batch 1800, loss[loss=0.1498, simple_loss=0.2254, pruned_loss=0.0371, over 24590.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2438, pruned_loss=0.04573, over 4684331.29 frames. ], batch size: 60, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:22:43,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=861940.0, ans=0.125 2023-10-02 11:22:44,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:22:44,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:22:46,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:22:50,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:22:53,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:22:53,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:22:56,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=862006.6666666666, ans=0.125 2023-10-02 11:22:58,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:01,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:23:01,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:23:02,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:23:04,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:23:04,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 11:23:04,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:08,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:09,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=862006.6666666666, ans=0.95 2023-10-02 11:23:11,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 11:23:14,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 11:23:14,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 11:23:14,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:16,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:23:16,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:23:17,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=862073.3333333334, ans=0.0 2023-10-02 11:23:18,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:23:22,273 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.934e+02 2.294e+02 2.756e+02 4.950e+02, threshold=4.588e+02, percent-clipped=2.0 2023-10-02 11:23:24,399 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 11:23:25,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:23:27,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:30,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 11:23:30,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 11:23:30,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:23:31,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:23:32,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=862140.0, ans=0.125 2023-10-02 11:23:32,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=862140.0, ans=0.0 2023-10-02 11:23:33,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:23:38,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 11:23:38,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=862140.0, ans=0.1 2023-10-02 11:23:44,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:23:44,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 11:23:45,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:23:45,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:46,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:23:46,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 11:23:49,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:23:49,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:23:51,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 11:23:51,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:23:54,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:23:54,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:23:54,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:56,330 INFO [train.py:1046] (3/4) Epoch 25, batch 1850, loss[loss=0.1591, simple_loss=0.2391, pruned_loss=0.03956, over 24532.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2454, pruned_loss=0.04575, over 4701112.37 frames. ], batch size: 60, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:23:56,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:23:56,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:23:59,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:23:59,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:24:01,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:24:02,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:24:08,946 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:24:10,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:24:10,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 11:24:14,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 11:24:15,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 11:24:18,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:24:18,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 11:24:18,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 11:24:25,425 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.63 vs. limit=15.0 2023-10-02 11:24:30,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:24:31,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 11:24:34,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:24:34,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:24:40,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 11:24:40,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:24:40,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:24:42,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:24:44,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:24:45,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:24:48,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:24:49,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:24:49,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 11:24:51,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:24:52,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:24:53,304 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.98 vs. limit=15.0 2023-10-02 11:24:54,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:24:57,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 11:24:57,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:25:01,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:25:03,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:25:03,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 11:25:03,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 11:25:04,551 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 11:25:05,933 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 11:25:07,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:25:07,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:25:07,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:25:09,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:09,320 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 11:25:09,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:25:09,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:10,615 INFO [train.py:1046] (3/4) Epoch 25, batch 1900, loss[loss=0.1609, simple_loss=0.2439, pruned_loss=0.03896, over 24538.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2463, pruned_loss=0.04625, over 4696740.52 frames. ], batch size: 66, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:25:10,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:25:12,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:25:14,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:25:14,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 11:25:16,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:16,769 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 11:25:16,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:25:18,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:25:23,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:25:23,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:25:25,302 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 11:25:26,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 11:25:28,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:25:29,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=862673.3333333334, ans=0.125 2023-10-02 11:25:30,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:25:30,049 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 11:25:30,073 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 11:25:33,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 11:25:35,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:25:37,558 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:25:39,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 11:25:41,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 11:25:49,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 11:25:50,897 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.019e+02 2.414e+02 2.839e+02 5.766e+02, threshold=4.828e+02, percent-clipped=1.0 2023-10-02 11:25:52,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 11:25:52,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:25:52,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=862740.0, ans=0.125 2023-10-02 11:25:53,850 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 11:25:53,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 11:25:53,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 11:25:53,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 11:25:53,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:25:57,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 11:25:59,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:26:04,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:26:04,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 11:26:04,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=862806.6666666666, ans=0.0 2023-10-02 11:26:05,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:26:08,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 11:26:08,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:26:09,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=862873.3333333334, ans=0.125 2023-10-02 11:26:14,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:26:14,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:26:14,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:26:17,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:26:18,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:26:19,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:26:20,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:26:23,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:26:23,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:26:24,534 INFO [train.py:1046] (3/4) Epoch 25, batch 1950, loss[loss=0.1571, simple_loss=0.2462, pruned_loss=0.03394, over 24503.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2473, pruned_loss=0.04671, over 4701479.02 frames. ], batch size: 66, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:26:25,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:26:25,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:26:26,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:26:28,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:26:29,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=862940.0, ans=0.125 2023-10-02 11:26:30,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:26:32,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:26:33,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:33,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:26:33,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=862940.0, ans=0.125 2023-10-02 11:26:34,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 11:26:36,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 11:26:36,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:38,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:38,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=863006.6666666666, ans=0.0 2023-10-02 11:26:39,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:26:40,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:26:42,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:26:44,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:26:44,280 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:26:44,848 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.12 vs. limit=15.0 2023-10-02 11:26:45,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:26:47,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:26:47,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:26:47,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:26:50,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:26:54,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:26:54,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:26:54,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:26:54,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 11:26:54,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:26:54,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:26:55,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:26:57,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:27:01,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:27:04,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:27:09,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:27:09,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:27:09,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 11:27:10,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:27:13,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:27:15,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:27:15,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:27:22,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:24,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:26,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:28,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:27:31,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:27:31,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:27:32,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 11:27:32,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:27:32,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:27:34,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 11:27:36,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:27:39,092 INFO [train.py:1046] (3/4) Epoch 25, batch 2000, loss[loss=0.1999, simple_loss=0.2653, pruned_loss=0.06726, over 22663.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2493, pruned_loss=0.0479, over 4688780.87 frames. ], batch size: 322, lr: 4.11e-03, grad_scale: 32.0 2023-10-02 11:27:40,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:27:41,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:27:41,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:27:43,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:27:45,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:27:45,742 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.96 vs. limit=15.0 2023-10-02 11:27:47,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 11:27:49,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:27:52,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:27:53,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 11:27:55,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:27:55,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:27:58,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:27:58,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 11:27:59,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:01,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:02,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:05,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 11:28:05,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:28:07,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 11:28:07,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:28:10,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:28:10,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:28:10,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:11,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:28:11,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:28:12,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 11:28:14,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 11:28:14,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:28:15,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:19,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:28:20,914 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.859e+02 2.034e+02 2.311e+02 3.192e+02, threshold=4.068e+02, percent-clipped=0.0 2023-10-02 11:28:21,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:28:21,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:28:21,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:28:25,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:28:25,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:28:25,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:28:25,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:28:25,845 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:28:27,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:31,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:28:32,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 11:28:35,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:28:36,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:38,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:38,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:28:44,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:45,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:28:45,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:47,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:28:47,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:28:50,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:28:50,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:53,313 INFO [train.py:1046] (3/4) Epoch 25, batch 2050, loss[loss=0.1539, simple_loss=0.2201, pruned_loss=0.04382, over 23733.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2479, pruned_loss=0.04775, over 4692118.95 frames. ], batch size: 232, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:28:53,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:28:53,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=863606.6666666666, ans=0.125 2023-10-02 11:28:54,417 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.98 vs. limit=15.0 2023-10-02 11:28:55,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:28:58,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:28:59,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:28:59,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=863606.6666666666, ans=0.05 2023-10-02 11:29:01,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:29:01,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:29:04,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 11:29:04,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:29:05,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:29:05,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:29:13,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=863673.3333333334, ans=0.2 2023-10-02 11:29:16,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:29:16,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:29:16,942 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.31 vs. limit=22.5 2023-10-02 11:29:17,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 11:29:19,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=863673.3333333334, ans=0.125 2023-10-02 11:29:20,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:29:21,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 11:29:21,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:29:24,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:29:25,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=863740.0, ans=0.0 2023-10-02 11:29:29,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:29:29,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:29:30,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:29:32,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:29:33,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:29:33,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:29:38,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:29:38,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:29:40,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:29:41,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:29:45,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:29:51,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:29:51,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 11:29:57,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:29:57,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:29:58,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:30:00,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 11:30:05,771 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 11:30:05,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:05,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:30:07,108 INFO [train.py:1046] (3/4) Epoch 25, batch 2100, loss[loss=0.1864, simple_loss=0.2691, pruned_loss=0.05181, over 23910.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2458, pruned_loss=0.04735, over 4684009.35 frames. ], batch size: 86, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:30:07,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:30:07,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:30:08,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 11:30:08,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 11:30:11,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:30:14,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:30:14,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:30:16,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:17,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:30:17,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 11:30:19,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:30:20,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 11:30:20,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 11:30:22,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:30:22,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:30:22,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 11:30:24,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 11:30:28,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 11:30:28,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:30:31,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:30:32,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:30:33,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=864006.6666666666, ans=0.0 2023-10-02 11:30:35,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:30:36,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 11:30:37,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:30:37,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 11:30:38,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 11:30:40,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:41,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 11:30:41,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 11:30:43,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 11:30:44,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:30:45,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:30:48,952 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.897e+02 2.155e+02 2.514e+02 3.500e+02, threshold=4.310e+02, percent-clipped=0.0 2023-10-02 11:30:49,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:30:51,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:30:53,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:30:53,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:30:53,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 11:30:53,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:30:53,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:30:55,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:30:55,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 11:30:55,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 11:30:56,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 11:30:58,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=864140.0, ans=0.05 2023-10-02 11:30:59,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:30:59,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=864140.0, ans=0.0 2023-10-02 11:31:02,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:31:03,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 11:31:08,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:31:10,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:31:11,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:31:11,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:31:11,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 11:31:11,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:31:14,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:31:14,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:31:14,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:31:14,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:17,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 11:31:18,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 11:31:18,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:31:21,322 INFO [train.py:1046] (3/4) Epoch 25, batch 2150, loss[loss=0.1488, simple_loss=0.2294, pruned_loss=0.03413, over 24336.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2452, pruned_loss=0.04687, over 4691046.95 frames. ], batch size: 61, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:31:21,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:31:21,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:31:23,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:31:23,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:31:26,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=864273.3333333334, ans=0.125 2023-10-02 11:31:28,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 11:31:30,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:31:30,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=864273.3333333334, ans=0.125 2023-10-02 11:31:31,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:31,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:31:31,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:31,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:31:34,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:36,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:31:36,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:31:40,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:40,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 11:31:45,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:31:46,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:31:46,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:48,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:31:48,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:31:48,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:31:49,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:31:49,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:31:49,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=864406.6666666666, ans=0.1 2023-10-02 11:31:50,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:31:51,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=864406.6666666666, ans=0.125 2023-10-02 11:31:53,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 11:31:55,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:31:55,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:31:55,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:31:58,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:31:59,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:32:01,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:32:02,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:32:02,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:32:02,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 11:32:02,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=864406.6666666666, ans=0.125 2023-10-02 11:32:04,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 11:32:05,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:32:06,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:07,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=864473.3333333334, ans=0.125 2023-10-02 11:32:08,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:32:09,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:32:09,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:11,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:11,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 11:32:14,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 11:32:14,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:32:14,691 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 11:32:14,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:16,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:32:16,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 11:32:16,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:32:16,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 11:32:18,094 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 11:32:18,094 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 11:32:18,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 11:32:18,912 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.87 vs. limit=12.0 2023-10-02 11:32:19,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:19,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:32:19,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:32:20,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:22,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:32:23,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:23,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:33,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:32:33,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 11:32:34,875 INFO [train.py:1046] (3/4) Epoch 25, batch 2200, loss[loss=0.1736, simple_loss=0.2492, pruned_loss=0.04901, over 23590.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2451, pruned_loss=0.04623, over 4694225.57 frames. ], batch size: 134, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:32:35,276 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:32:36,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:32:40,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:32:40,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=864606.6666666666, ans=0.07 2023-10-02 11:32:42,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:32:42,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:32:43,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:32:44,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=864606.6666666666, ans=10.0 2023-10-02 11:32:45,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:32:45,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:32:47,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 11:32:52,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 11:32:52,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:32:58,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 11:33:02,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:02,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:33:04,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:33:06,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:33:07,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 11:33:08,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=864740.0, ans=0.2 2023-10-02 11:33:11,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:33:13,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:14,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 11:33:14,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=864740.0, ans=0.125 2023-10-02 11:33:14,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=864740.0, ans=0.0 2023-10-02 11:33:16,290 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.455e+02 1.821e+02 1.981e+02 2.213e+02 3.022e+02, threshold=3.961e+02, percent-clipped=0.0 2023-10-02 11:33:16,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:33:17,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:33:19,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:33:20,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:33:22,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 11:33:23,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:33:25,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 11:33:27,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:33:27,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:33:28,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:33:29,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:33:29,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:33:29,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:33:29,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:33:29,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=864806.6666666666, ans=0.04949747468305833 2023-10-02 11:33:32,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:33:32,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:33:32,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=864873.3333333334, ans=0.1 2023-10-02 11:33:34,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:33:37,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 11:33:37,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:33:41,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:33:41,727 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 11:33:45,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:33:45,621 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 11:33:45,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:33:46,280 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.32 vs. limit=15.0 2023-10-02 11:33:47,120 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 11:33:48,993 INFO [train.py:1046] (3/4) Epoch 25, batch 2250, loss[loss=0.2047, simple_loss=0.2637, pruned_loss=0.07279, over 19602.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2462, pruned_loss=0.04634, over 4706816.13 frames. ], batch size: 388, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:33:49,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:49,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:33:49,389 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:33:50,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:33:52,028 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 11:33:53,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:33:56,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:34:02,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:34:04,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:34:06,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:34:06,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:34:07,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:34:10,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 11:34:10,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:34:10,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:34:13,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 11:34:13,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:34:15,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:34:17,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:34:18,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=865073.3333333334, ans=0.0 2023-10-02 11:34:20,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.39 vs. limit=6.0 2023-10-02 11:34:23,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:34:24,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:34:24,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:34:25,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 11:34:26,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:34:28,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:34:33,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:34:34,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:34:36,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:34:36,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:34:37,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:34:40,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:34:44,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:34:46,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 11:34:51,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:34:51,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:34:51,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:34:57,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 11:35:00,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:35:00,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 11:35:00,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:01,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:35:03,270 INFO [train.py:1046] (3/4) Epoch 25, batch 2300, loss[loss=0.17, simple_loss=0.2531, pruned_loss=0.04348, over 24027.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2473, pruned_loss=0.04679, over 4713432.28 frames. ], batch size: 80, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:35:05,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 11:35:07,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:35:08,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:12,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:13,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:35:16,018 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 11:35:17,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:35:22,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:35:24,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:35:24,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:35:24,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:35:24,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 11:35:26,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:35:27,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:35:27,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:35:30,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:35:34,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:35:36,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:35:40,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:35:40,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:35:43,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:35:44,373 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.801e+02 2.122e+02 2.550e+02 3.360e+02, threshold=4.244e+02, percent-clipped=0.0 2023-10-02 11:35:45,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:35:49,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:35:49,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:35:49,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:35:49,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 11:35:54,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 11:35:54,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:35:55,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:35:55,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:35:55,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:35:57,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 11:35:59,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:35:59,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 11:35:59,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:35:59,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:35:59,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=865473.3333333334, ans=0.09899494936611666 2023-10-02 11:36:00,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 11:36:03,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:36:06,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=865540.0, ans=0.125 2023-10-02 11:36:07,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:36:11,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:36:12,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:36:12,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:36:14,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:36:14,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:36:14,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:36:15,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 11:36:17,545 INFO [train.py:1046] (3/4) Epoch 25, batch 2350, loss[loss=0.1727, simple_loss=0.2628, pruned_loss=0.04125, over 24329.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2484, pruned_loss=0.04718, over 4712197.95 frames. ], batch size: 74, lr: 4.11e-03, grad_scale: 16.0 2023-10-02 11:36:21,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:36:21,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 11:36:28,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 11:36:30,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=865606.6666666666, ans=0.125 2023-10-02 11:36:31,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:36:32,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:36:33,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:36:34,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:36:34,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:36:35,327 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.57 vs. limit=22.5 2023-10-02 11:36:35,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 11:36:36,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=865673.3333333334, ans=0.025 2023-10-02 11:36:39,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:36:40,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=865673.3333333334, ans=0.09899494936611666 2023-10-02 11:36:42,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=865673.3333333334, ans=0.125 2023-10-02 11:36:45,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 11:36:46,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:36:49,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:36:49,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:36:52,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:36:54,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 11:36:55,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:36:57,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:36:57,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:36:57,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:37:02,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:37:04,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 11:37:04,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:37:06,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:37:06,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:37:09,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 11:37:09,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:37:10,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=865806.6666666666, ans=0.0 2023-10-02 11:37:13,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 11:37:13,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:37:16,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=865873.3333333334, ans=0.0 2023-10-02 11:37:17,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 11:37:20,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 11:37:20,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:37:22,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 11:37:22,189 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 11:37:22,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 11:37:24,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 11:37:29,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:37:31,496 INFO [train.py:1046] (3/4) Epoch 25, batch 2400, loss[loss=0.1662, simple_loss=0.2582, pruned_loss=0.03707, over 24465.00 frames. ], tot_loss[loss=0.1711, simple_loss=0.2478, pruned_loss=0.04723, over 4718001.49 frames. ], batch size: 69, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:37:34,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:37:38,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:37:38,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:37:38,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 11:37:39,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 11:37:45,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 11:37:45,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:37:46,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 11:37:46,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:37:47,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=866006.6666666666, ans=0.0 2023-10-02 11:37:48,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:37:49,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 11:37:54,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:37:55,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 11:37:55,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=866006.6666666666, ans=0.0 2023-10-02 11:38:02,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:38:04,874 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.99 vs. limit=15.0 2023-10-02 11:38:05,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 11:38:07,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:38:08,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=866073.3333333334, ans=0.1 2023-10-02 11:38:09,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:38:12,533 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.315e+02 1.806e+02 1.972e+02 2.221e+02 3.865e+02, threshold=3.945e+02, percent-clipped=0.0 2023-10-02 11:38:13,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:38:15,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 11:38:15,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:38:21,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:38:21,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=866140.0, ans=0.125 2023-10-02 11:38:24,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:38:25,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:38:27,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:38:27,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 11:38:27,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:38:27,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:38:29,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:38:29,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:38:33,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:38:35,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:38:35,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 11:38:36,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 11:38:39,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:38:39,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:38:39,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 11:38:40,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 11:38:40,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 11:38:40,815 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 11:38:42,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 11:38:43,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:38:44,985 INFO [train.py:1046] (3/4) Epoch 25, batch 2450, loss[loss=0.1665, simple_loss=0.241, pruned_loss=0.04601, over 22887.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2467, pruned_loss=0.04649, over 4718766.78 frames. ], batch size: 50, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:38:45,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:38:45,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:38:46,510 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 11:38:46,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:38:47,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:38:49,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:38:51,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:38:54,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:38:54,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:38:54,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 11:38:57,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=866273.3333333334, ans=0.125 2023-10-02 11:39:00,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:39:00,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:39:03,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:39:05,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:39:05,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:39:05,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 11:39:09,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:39:11,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.44 vs. limit=15.0 2023-10-02 11:39:12,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:39:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:39:16,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:39:16,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:39:17,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:39:17,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:39:20,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 11:39:20,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:39:27,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:39:29,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:39:30,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:39:30,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:39:30,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:39:32,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:39:32,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 11:39:36,200 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=10.95 vs. limit=12.0 2023-10-02 11:39:36,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:39:36,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:39:39,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:39:39,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:39:43,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:39:44,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 11:39:45,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:39:46,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:39:46,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 11:39:46,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:39:48,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:39:51,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:39:53,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:39:53,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:39:58,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 11:39:58,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 11:40:00,117 INFO [train.py:1046] (3/4) Epoch 25, batch 2500, loss[loss=0.1588, simple_loss=0.2337, pruned_loss=0.04192, over 23525.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2451, pruned_loss=0.04594, over 4705285.71 frames. ], batch size: 134, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:40:05,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:40:14,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:40:15,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:40:16,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:40:16,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 11:40:16,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=866673.3333333334, ans=0.015 2023-10-02 11:40:20,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=866673.3333333334, ans=0.1 2023-10-02 11:40:23,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:40:23,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:40:25,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 11:40:25,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 11:40:25,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 11:40:27,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:40:28,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:40:28,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 11:40:28,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:40:28,758 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.58 vs. limit=15.0 2023-10-02 11:40:29,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 11:40:29,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:40:34,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:40:35,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=866740.0, ans=0.0 2023-10-02 11:40:36,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:40:39,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:40:39,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 11:40:39,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:40:41,815 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.456e+02 1.832e+02 2.049e+02 2.331e+02 3.606e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-02 11:40:41,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:40:44,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:40:47,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:40:50,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:40:55,685 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.88 vs. limit=15.0 2023-10-02 11:40:56,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:40:57,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 11:40:59,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:40:59,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:41:00,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:41:00,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:41:03,034 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 11:41:03,035 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 11:41:03,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 11:41:04,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:41:05,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 11:41:05,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 11:41:07,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:41:08,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 11:41:11,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 11:41:12,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:41:14,053 INFO [train.py:1046] (3/4) Epoch 25, batch 2550, loss[loss=0.151, simple_loss=0.2362, pruned_loss=0.03292, over 24483.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2456, pruned_loss=0.04606, over 4709944.48 frames. ], batch size: 63, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:41:15,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:41:15,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:41:16,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:41:18,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 11:41:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:41:22,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 11:41:24,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:41:24,316 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:41:26,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:41:28,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:41:28,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 11:41:29,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:41:30,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:41:30,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:41:33,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:41:33,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 11:41:33,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 11:41:33,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:41:33,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 11:41:38,400 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.60 vs. limit=15.0 2023-10-02 11:41:48,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:41:52,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=867073.3333333334, ans=0.125 2023-10-02 11:41:54,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:41:54,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:41:54,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:41:55,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 11:41:56,696 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.21 vs. limit=15.0 2023-10-02 11:42:03,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:42:05,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=867140.0, ans=0.0 2023-10-02 11:42:06,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 11:42:06,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:42:06,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:42:06,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:42:08,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:42:09,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=867140.0, ans=0.0 2023-10-02 11:42:11,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:42:11,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:42:16,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:42:16,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 11:42:16,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:42:16,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:42:17,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 11:42:19,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:42:20,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:42:26,622 INFO [train.py:1046] (3/4) Epoch 25, batch 2600, loss[loss=0.1918, simple_loss=0.2584, pruned_loss=0.06262, over 22779.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2469, pruned_loss=0.04639, over 4713008.67 frames. ], batch size: 322, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:42:26,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:42:28,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:42:30,977 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 11:42:32,451 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 11:42:32,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:42:33,770 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 11:42:33,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 11:42:33,865 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 11:42:37,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:42:38,991 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 11:42:40,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 11:42:42,219 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 11:42:43,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:42:45,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 11:42:46,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 11:42:47,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:42:47,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 11:42:50,525 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 11:42:50,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 11:42:56,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:42:56,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:42:57,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:42:57,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 11:43:00,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:43:05,617 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 11:43:10,119 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.853e+02 2.071e+02 2.375e+02 3.577e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-02 11:43:12,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:43:12,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:12,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 11:43:13,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:43:13,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:43:14,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 11:43:17,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:43:17,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:43:20,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:43:23,088 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 11:43:24,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:43:24,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:43:29,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:43:30,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:43:30,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 11:43:31,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:43:31,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:43:33,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:43:34,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=867540.0, ans=15.0 2023-10-02 11:43:38,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 11:43:39,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:39,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 11:43:41,631 INFO [train.py:1046] (3/4) Epoch 25, batch 2650, loss[loss=0.173, simple_loss=0.2469, pruned_loss=0.04955, over 24310.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.2486, pruned_loss=0.04705, over 4708199.35 frames. ], batch size: 61, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:43:44,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 11:43:44,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:46,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:43:46,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=867606.6666666666, ans=0.1 2023-10-02 11:43:47,248 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 11:43:47,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:43:51,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:43:51,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 11:43:51,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=867606.6666666666, ans=0.1 2023-10-02 11:43:53,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:43:55,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:43:55,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 11:43:55,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:43:55,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:43:55,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=867673.3333333334, ans=0.1 2023-10-02 11:43:59,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 11:43:59,202 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 11:44:02,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:44:03,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 11:44:03,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:05,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 11:44:11,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:44:11,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:44:11,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:44:11,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=867740.0, ans=0.2 2023-10-02 11:44:12,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:15,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 11:44:16,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 11:44:19,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:44:23,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 11:44:23,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:44:25,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:25,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:44:25,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:44:25,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=867806.6666666666, ans=0.125 2023-10-02 11:44:26,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:44:26,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:44:26,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=867806.6666666666, ans=0.0 2023-10-02 11:44:28,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:44:30,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:44:30,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:44:31,040 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.52 vs. limit=22.5 2023-10-02 11:44:31,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:44:31,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:33,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:44:35,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:36,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:44:36,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:44:39,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:41,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:44:41,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:44:41,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 11:44:46,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:44:47,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:48,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:44:48,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:44:50,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 11:44:51,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:44:54,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:44:54,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 11:44:55,442 INFO [train.py:1046] (3/4) Epoch 25, batch 2700, loss[loss=0.1481, simple_loss=0.2263, pruned_loss=0.03491, over 24401.00 frames. ], tot_loss[loss=0.1716, simple_loss=0.2491, pruned_loss=0.04702, over 4716962.64 frames. ], batch size: 58, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:44:56,017 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.76 vs. limit=22.5 2023-10-02 11:44:56,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:44:58,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 11:45:00,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:45:01,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:01,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:02,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:45:02,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:45:02,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:45:02,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 11:45:02,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 11:45:04,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:45:05,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:45:07,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:45:07,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:45:12,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:45:14,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 11:45:14,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:45:17,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=868006.6666666666, ans=0.125 2023-10-02 11:45:18,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:45:18,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:45:18,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=868006.6666666666, ans=0.0 2023-10-02 11:45:22,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:45:22,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:45:22,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=868006.6666666666, ans=0.125 2023-10-02 11:45:23,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:45:23,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:45:26,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:45:29,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:45:29,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:45:29,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:45:34,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:34,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:45:39,108 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.430e+02 1.840e+02 2.089e+02 2.422e+02 3.725e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 11:45:44,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:45:44,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:45:47,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:45:47,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:45:51,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:52,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:45:54,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:45:54,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:45:56,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:45:56,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:45:58,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:46:00,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:46:00,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:46:03,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 11:46:04,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:46:04,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=868206.6666666666, ans=0.125 2023-10-02 11:46:06,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:46:06,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 11:46:09,177 INFO [train.py:1046] (3/4) Epoch 25, batch 2750, loss[loss=0.1558, simple_loss=0.2313, pruned_loss=0.04012, over 24425.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2481, pruned_loss=0.04688, over 4710543.54 frames. ], batch size: 58, lr: 4.10e-03, grad_scale: 16.0 2023-10-02 11:46:09,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 11:46:09,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:46:12,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:13,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:46:15,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:15,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 11:46:16,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:19,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:46:21,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 11:46:21,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:46:21,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:21,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 11:46:21,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:46:21,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:46:26,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 11:46:28,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:46:28,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:29,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:46:29,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 11:46:31,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:46:33,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:46:34,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:35,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:38,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 11:46:40,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:46:40,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:46:42,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:43,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:46:50,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:46:52,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 11:46:53,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:46:56,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:46:56,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:46:56,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:47:02,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:47:02,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:47:02,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 11:47:06,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:08,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 11:47:09,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=868540.0, ans=0.1 2023-10-02 11:47:15,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 11:47:16,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=868540.0, ans=0.125 2023-10-02 11:47:17,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:47:17,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 11:47:17,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:47:19,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:47:19,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 11:47:19,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:47:22,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 11:47:22,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:47:22,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:47:24,081 INFO [train.py:1046] (3/4) Epoch 25, batch 2800, loss[loss=0.168, simple_loss=0.2161, pruned_loss=0.0599, over 19236.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2462, pruned_loss=0.04683, over 4686868.36 frames. ], batch size: 388, lr: 4.10e-03, grad_scale: 32.0 2023-10-02 11:47:24,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 11:47:24,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:47:24,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:25,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:47:25,663 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 11:47:25,664 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 11:47:29,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:32,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:47:32,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:47:34,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:47:37,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 11:47:40,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 11:47:41,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 11:47:41,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:47:43,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:47:43,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:47:47,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:47:47,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:47:47,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:47:48,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:47:56,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:47:57,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:47:58,315 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.03 vs. limit=15.0 2023-10-02 11:47:59,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:00,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:48:02,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:48:04,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:48:04,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 11:48:05,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=868740.0, ans=0.0 2023-10-02 11:48:06,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:48:08,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:48:08,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:48:08,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=868806.6666666666, ans=0.0 2023-10-02 11:48:09,359 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.857e+02 2.100e+02 2.533e+02 3.672e+02, threshold=4.201e+02, percent-clipped=0.0 2023-10-02 11:48:12,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:48:12,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:15,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:48:16,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:48:18,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:18,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 11:48:19,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 11:48:19,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 11:48:21,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:48:21,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 11:48:21,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:48:22,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:48:22,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:48:23,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 11:48:25,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:48:25,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:48:25,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:48:27,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 11:48:30,057 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.83 vs. limit=15.0 2023-10-02 11:48:34,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:48:34,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:48:34,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=868873.3333333334, ans=0.125 2023-10-02 11:48:35,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:48:36,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:48:38,047 INFO [train.py:1046] (3/4) Epoch 25, batch 2850, loss[loss=0.1531, simple_loss=0.2263, pruned_loss=0.03989, over 24400.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2451, pruned_loss=0.04657, over 4695381.55 frames. ], batch size: 58, lr: 4.10e-03, grad_scale: 8.0 2023-10-02 11:48:41,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:48:41,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:48:42,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:48:44,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:48:44,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=868940.0, ans=0.125 2023-10-02 11:48:45,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:48:47,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:48:47,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 11:48:51,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=869006.6666666666, ans=0.0 2023-10-02 11:48:52,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 11:48:52,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:48:54,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 11:48:55,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:48:58,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 11:48:58,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 11:49:01,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:07,565 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.27 vs. limit=15.0 2023-10-02 11:49:08,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=869073.3333333334, ans=0.2 2023-10-02 11:49:14,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:49:15,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:49:15,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 11:49:17,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 11:49:17,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 11:49:17,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 11:49:17,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=869073.3333333334, ans=0.0 2023-10-02 11:49:20,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:49:20,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 11:49:21,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 11:49:21,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:49:22,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:49:22,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:25,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:49:25,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:49:25,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:49:27,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:49:29,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:49:30,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:30,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:49:32,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:49:38,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:49:39,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 11:49:39,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 11:49:41,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 11:49:43,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:49:43,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 11:49:44,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:49:44,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:49:44,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:49:44,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 11:49:44,589 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 11:49:44,890 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:49:45,953 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 11:49:45,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:49:46,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:49:51,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:49:51,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:49:52,685 INFO [train.py:1046] (3/4) Epoch 25, batch 2900, loss[loss=0.1726, simple_loss=0.247, pruned_loss=0.04908, over 23290.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.245, pruned_loss=0.04606, over 4698450.05 frames. ], batch size: 119, lr: 4.10e-03, grad_scale: 8.0 2023-10-02 11:49:52,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:49:52,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 11:49:57,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:49:57,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 11:49:58,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 11:49:59,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 11:49:59,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:50:01,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:50:04,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:50:08,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 11:50:08,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:50:11,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:50:12,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 11:50:12,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:50:14,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:50:16,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 11:50:17,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 11:50:18,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:50:18,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 11:50:20,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:50:21,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:50:21,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 11:50:23,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:50:23,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:50:25,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:50:27,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:50:28,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 11:50:28,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=869406.6666666666, ans=0.04949747468305833 2023-10-02 11:50:30,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 11:50:30,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:50:33,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:50:36,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 11:50:37,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 11:50:40,584 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.913e+02 2.084e+02 2.277e+02 3.213e+02, threshold=4.167e+02, percent-clipped=0.0 2023-10-02 11:50:40,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=869473.3333333334, ans=0.125 2023-10-02 11:50:43,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:50:50,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:50:50,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 11:50:53,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 11:50:54,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.09 vs. limit=15.0 2023-10-02 11:50:56,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:50:56,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 11:50:56,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:50:56,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:51:03,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:51:05,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 11:51:07,021 INFO [train.py:1046] (3/4) Epoch 25, batch 2950, loss[loss=0.1975, simple_loss=0.2761, pruned_loss=0.05949, over 23978.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2467, pruned_loss=0.04682, over 4699422.37 frames. ], batch size: 86, lr: 4.10e-03, grad_scale: 4.0 2023-10-02 11:51:07,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:51:07,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:51:07,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=869606.6666666666, ans=0.125 2023-10-02 11:51:08,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:51:08,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:51:11,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 11:51:11,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 11:51:12,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:51:12,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:51:17,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=869606.6666666666, ans=0.0 2023-10-02 11:51:18,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:51:21,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:51:21,855 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-10-02 11:51:22,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:51:22,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:51:25,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:51:25,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:51:26,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:51:28,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:51:28,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:51:31,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 11:51:33,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=869673.3333333334, ans=0.0 2023-10-02 11:51:37,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 11:51:37,290 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 11:51:37,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:51:39,338 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 11:51:40,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 11:51:40,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:51:42,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 11:51:42,099 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 11:51:42,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 11:51:45,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 11:51:45,579 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:51:46,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:51:46,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 11:51:46,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=869740.0, ans=0.0 2023-10-02 11:51:46,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=869740.0, ans=10.0 2023-10-02 11:51:49,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:51:49,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:51:50,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:51:50,839 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 11:51:52,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:51:52,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 11:51:57,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:51:58,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:52:00,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 11:52:00,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:52:00,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=869806.6666666666, ans=0.0 2023-10-02 11:52:01,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 11:52:04,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:52:07,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:52:07,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:52:09,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:52:09,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 11:52:12,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:52:13,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:52:13,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:52:14,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:52:15,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:52:15,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:52:16,788 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.33 vs. limit=6.0 2023-10-02 11:52:17,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:52:17,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 11:52:19,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:52:20,705 INFO [train.py:1046] (3/4) Epoch 25, batch 3000, loss[loss=0.1526, simple_loss=0.2339, pruned_loss=0.03565, over 18096.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2468, pruned_loss=0.04619, over 4716657.38 frames. ], batch size: 39, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:52:20,706 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 11:52:34,337 INFO [train.py:1078] (3/4) Epoch 25, validation: loss=0.328, simple_loss=0.2751, pruned_loss=0.1905, over 1125622.00 frames. 2023-10-02 11:52:34,338 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 11:52:34,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:52:35,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 11:52:36,532 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.69 vs. limit=15.0 2023-10-02 11:52:38,925 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 11:52:38,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 11:52:40,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:52:40,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=869940.0, ans=0.05 2023-10-02 11:52:42,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:52:42,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 11:52:42,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:52:49,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 11:52:57,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:53:02,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 11:53:04,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:53:07,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:53:08,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:53:08,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:53:11,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:53:11,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 11:53:13,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 11:53:14,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:53:14,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:53:15,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=870073.3333333334, ans=0.2 2023-10-02 11:53:16,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:53:18,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:53:18,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:18,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:53:21,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 11:53:21,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:53:21,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 11:53:23,880 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.799e+02 2.052e+02 2.353e+02 3.864e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-02 11:53:24,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:53:26,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=870140.0, ans=0.1 2023-10-02 11:53:27,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 11:53:29,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 11:53:29,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:53:29,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 11:53:32,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=870206.6666666666, ans=0.1 2023-10-02 11:53:33,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:33,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:33,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 11:53:33,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 11:53:35,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:53:35,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 11:53:35,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:53:37,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 11:53:39,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:53:41,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 11:53:41,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 11:53:41,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=870206.6666666666, ans=0.2 2023-10-02 11:53:43,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 11:53:43,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 11:53:44,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:53:46,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:53:46,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 11:53:46,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:53:47,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:53:49,643 INFO [train.py:1046] (3/4) Epoch 25, batch 3050, loss[loss=0.1775, simple_loss=0.2596, pruned_loss=0.04766, over 24351.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2481, pruned_loss=0.04668, over 4714537.24 frames. ], batch size: 77, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:53:49,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 11:53:51,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:53:53,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:53:53,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:53:56,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:53:59,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 11:53:59,863 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:54:04,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 11:54:05,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 11:54:07,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:09,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 11:54:14,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:54:14,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:54:14,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=870340.0, ans=0.1 2023-10-02 11:54:15,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:54:16,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=870340.0, ans=0.1 2023-10-02 11:54:18,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=870406.6666666666, ans=0.125 2023-10-02 11:54:19,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:54:19,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 11:54:19,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:54:19,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:54:19,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:54:19,903 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.72 vs. limit=10.0 2023-10-02 11:54:20,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:54:23,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:26,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:54:26,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 11:54:28,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:54:28,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 11:54:28,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_na.min_abs, batch_count=870406.6666666666, ans=0.02 2023-10-02 11:54:30,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:54:32,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 11:54:32,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:54:33,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:38,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:54:39,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:39,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=870473.3333333334, ans=0.5 2023-10-02 11:54:42,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:43,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:54:43,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:54:45,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:54:46,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 11:54:48,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 11:54:48,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 11:54:50,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 11:54:50,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:54:51,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 11:54:51,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=870540.0, ans=0.125 2023-10-02 11:54:52,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:57,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:54:59,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 11:55:00,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 11:55:01,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 11:55:03,273 INFO [train.py:1046] (3/4) Epoch 25, batch 3100, loss[loss=0.1735, simple_loss=0.2264, pruned_loss=0.06034, over 19390.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2476, pruned_loss=0.04682, over 4705175.28 frames. ], batch size: 389, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:55:06,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 11:55:07,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 11:55:09,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 11:55:12,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:55:12,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:15,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:55:20,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:24,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 11:55:29,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 11:55:29,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:30,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:55:30,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:55:31,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 11:55:33,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:55:34,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 11:55:34,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:55:35,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:37,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 11:55:38,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=870740.0, ans=0.0 2023-10-02 11:55:39,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:55:41,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:55:41,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 11:55:43,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 11:55:44,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:45,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:55:49,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:55:49,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:49,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:55:50,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:55:50,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:55:52,045 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.832e+02 1.979e+02 2.196e+02 3.160e+02, threshold=3.959e+02, percent-clipped=0.0 2023-10-02 11:55:53,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:55:53,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:55:53,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:55:53,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 11:55:56,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=870806.6666666666, ans=0.125 2023-10-02 11:55:58,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:55:59,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 11:56:01,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:56:02,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 11:56:02,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:02,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:56:03,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 11:56:14,822 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.33 vs. limit=10.0 2023-10-02 11:56:16,767 INFO [train.py:1046] (3/4) Epoch 25, batch 3150, loss[loss=0.1599, simple_loss=0.2406, pruned_loss=0.03958, over 24655.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2461, pruned_loss=0.04589, over 4721430.05 frames. ], batch size: 65, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:56:16,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 11:56:18,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:18,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:56:20,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:56:20,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 11:56:21,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 11:56:23,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:23,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 11:56:23,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=870940.0, ans=0.125 2023-10-02 11:56:24,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 11:56:26,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:28,064 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 11:56:29,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 11:56:31,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:56:32,515 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 11:56:32,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 11:56:35,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 11:56:36,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 11:56:36,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 11:56:36,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:36,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:56:38,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:56:39,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 11:56:41,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:41,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:56:42,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:56:42,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=871006.6666666666, ans=0.2 2023-10-02 11:56:44,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 11:56:48,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 11:56:48,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=871073.3333333334, ans=0.125 2023-10-02 11:56:49,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:56:51,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 11:56:52,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:56:52,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 11:56:55,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 11:56:55,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:56:57,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 11:56:57,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 11:56:58,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:56:58,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 11:56:58,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 11:56:58,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 11:57:00,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 11:57:00,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 11:57:00,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:03,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:57:03,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:57:03,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 11:57:04,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.67 vs. limit=15.0 2023-10-02 11:57:04,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:57:06,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 11:57:06,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:07,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 11:57:09,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 11:57:09,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=871140.0, ans=0.125 2023-10-02 11:57:10,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:57:10,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:57:12,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 11:57:13,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 11:57:13,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:57:16,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:57:18,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:18,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:57:19,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=871206.6666666666, ans=0.2 2023-10-02 11:57:24,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 11:57:24,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:25,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=871206.6666666666, ans=0.5 2023-10-02 11:57:26,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 11:57:27,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=871206.6666666666, ans=0.2 2023-10-02 11:57:30,979 INFO [train.py:1046] (3/4) Epoch 25, batch 3200, loss[loss=0.1476, simple_loss=0.2174, pruned_loss=0.03895, over 24440.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2453, pruned_loss=0.04582, over 4708841.72 frames. ], batch size: 58, lr: 4.09e-03, grad_scale: 16.0 2023-10-02 11:57:33,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:57:33,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 11:57:37,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:37,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:57:37,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 11:57:40,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:57:43,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:57:46,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:57:55,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 11:58:03,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 11:58:03,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:58:06,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 11:58:08,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 11:58:12,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 11:58:12,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 11:58:13,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 11:58:17,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 11:58:18,968 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.831e+02 2.000e+02 2.221e+02 3.044e+02, threshold=4.001e+02, percent-clipped=0.0 2023-10-02 11:58:19,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 11:58:21,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 11:58:23,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 11:58:25,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 11:58:30,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:58:30,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 11:58:30,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:58:31,666 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 11:58:31,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 11:58:33,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.81 vs. limit=15.0 2023-10-02 11:58:34,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:58:35,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 11:58:36,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 11:58:36,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 11:58:37,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 11:58:39,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 11:58:42,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 11:58:42,371 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 11:58:42,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:58:42,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:58:42,501 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 11:58:45,078 INFO [train.py:1046] (3/4) Epoch 25, batch 3250, loss[loss=0.1626, simple_loss=0.242, pruned_loss=0.04163, over 24462.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2455, pruned_loss=0.04632, over 4707257.12 frames. ], batch size: 63, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:58:48,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 11:58:49,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:58:58,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:58:58,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 11:59:00,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:59:00,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:59:00,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:59:00,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:59:01,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 11:59:04,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:04,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 11:59:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:59:04,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=871673.3333333334, ans=0.05 2023-10-02 11:59:05,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:05,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:05,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:59:08,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:59:10,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 11:59:10,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=871673.3333333334, ans=0.0 2023-10-02 11:59:11,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:59:13,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 11:59:14,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 11:59:14,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 11:59:14,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:59:18,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 11:59:18,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 11:59:18,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 11:59:21,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:59:22,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=871740.0, ans=0.0 2023-10-02 11:59:23,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 11:59:23,566 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 11:59:27,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 11:59:31,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=871806.6666666666, ans=0.2 2023-10-02 11:59:33,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:59:33,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:59:33,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 11:59:33,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 11:59:33,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 11:59:35,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 11:59:39,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 11:59:39,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 11:59:41,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 11:59:41,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=871806.6666666666, ans=0.125 2023-10-02 11:59:42,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 11:59:42,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:59:42,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 11:59:43,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 11:59:45,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=871873.3333333334, ans=0.2 2023-10-02 11:59:46,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 11:59:46,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 11:59:50,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 11:59:50,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 11:59:51,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 11:59:51,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 11:59:54,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 11:59:55,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 11:59:57,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=871940.0, ans=0.125 2023-10-02 11:59:58,142 INFO [train.py:1046] (3/4) Epoch 25, batch 3300, loss[loss=0.1812, simple_loss=0.2641, pruned_loss=0.04918, over 24454.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2465, pruned_loss=0.04652, over 4698746.05 frames. ], batch size: 69, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 11:59:58,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 11:59:58,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 11:59:58,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:00:02,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:00:03,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:00:03,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:04,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=871940.0, ans=0.0 2023-10-02 12:00:06,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 12:00:06,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:00:07,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:10,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:00:15,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 12:00:15,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:00:15,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:16,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:16,574 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 12:00:19,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:00:19,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:00:19,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:00:19,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:00:20,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=872006.6666666666, ans=0.125 2023-10-02 12:00:21,059 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 12:00:23,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=872006.6666666666, ans=0.0 2023-10-02 12:00:25,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:00:25,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:00:25,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:25,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 12:00:26,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=872073.3333333334, ans=0.125 2023-10-02 12:00:27,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 12:00:27,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:29,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:00:32,468 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 12:00:35,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 12:00:35,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:00:37,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=872073.3333333334, ans=0.125 2023-10-02 12:00:38,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 12:00:40,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:00:42,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=872140.0, ans=0.125 2023-10-02 12:00:43,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:00:43,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:00:46,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:00:46,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:46,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:00:46,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:00:48,873 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.878e+02 2.117e+02 2.413e+02 3.290e+02, threshold=4.234e+02, percent-clipped=0.0 2023-10-02 12:00:49,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:00:49,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:00:50,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:00:50,448 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 12:00:51,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 12:00:55,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 12:00:55,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:00:55,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:00:58,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:00:58,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:00:58,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=872206.6666666666, ans=0.125 2023-10-02 12:00:59,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:00:59,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:00:59,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 12:01:00,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:01:02,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:01:02,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=872206.6666666666, ans=0.125 2023-10-02 12:01:05,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 12:01:06,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:07,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:08,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:01:08,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:01:10,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:01:11,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:01:11,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:13,359 INFO [train.py:1046] (3/4) Epoch 25, batch 3350, loss[loss=0.1761, simple_loss=0.2676, pruned_loss=0.04229, over 24458.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2474, pruned_loss=0.04631, over 4715803.43 frames. ], batch size: 69, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:01:14,062 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.37 vs. limit=12.0 2023-10-02 12:01:16,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:01:17,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:17,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:01:19,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:21,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:01:23,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:01:24,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:01:26,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 12:01:26,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=872340.0, ans=0.125 2023-10-02 12:01:27,694 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 12:01:27,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:01:31,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 12:01:31,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 12:01:33,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:01:33,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:01:33,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:01:34,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 12:01:34,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:34,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:01:36,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:38,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:39,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:39,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:01:42,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:01:45,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:45,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:01:49,208 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.73 vs. limit=10.0 2023-10-02 12:01:49,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:01:50,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:01:52,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:01:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:01:55,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:01:58,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 12:01:58,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 12:01:58,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 12:01:58,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:01:59,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 12:02:02,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:02:02,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:02:09,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:02:09,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 12:02:11,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:02:12,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:02:13,543 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.38 vs. limit=22.5 2023-10-02 12:02:14,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:02:19,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:02:20,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 12:02:20,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:02:21,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:02:24,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:02:24,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 12:02:25,920 INFO [train.py:1046] (3/4) Epoch 25, batch 3400, loss[loss=0.1705, simple_loss=0.2448, pruned_loss=0.04806, over 23805.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2478, pruned_loss=0.04656, over 4714954.34 frames. ], batch size: 212, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:02:25,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:02:26,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 12:02:27,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:02:29,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:02:30,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:02:30,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:02:32,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 12:02:36,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 12:02:36,292 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 12:02:36,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:02:40,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:02:40,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:02:41,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:02:42,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:02:47,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:02:48,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 12:02:52,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:02:55,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:02:55,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:02:55,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=872740.0, ans=0.025 2023-10-02 12:02:57,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 12:03:04,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:03:08,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 12:03:12,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:03:12,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:03:12,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 12:03:14,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:03:15,539 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.863e+02 2.090e+02 2.369e+02 3.462e+02, threshold=4.180e+02, percent-clipped=0.0 2023-10-02 12:03:15,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:03:17,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:03:17,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:03:20,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:03:23,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:03:23,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:03:27,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:03:29,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 12:03:33,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:03:38,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 12:03:39,777 INFO [train.py:1046] (3/4) Epoch 25, batch 3450, loss[loss=0.1629, simple_loss=0.2236, pruned_loss=0.05107, over 23349.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2481, pruned_loss=0.04686, over 4698700.39 frames. ], batch size: 285, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:03:42,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 12:03:43,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:03:45,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:03:45,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 12:03:45,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:03:46,175 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.49 vs. limit=15.0 2023-10-02 12:03:50,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:03:55,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:03:55,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:03:57,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:03:57,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:04:00,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:04:02,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=873006.6666666666, ans=0.0 2023-10-02 12:04:04,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 12:04:09,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 12:04:09,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:04:11,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:04:11,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:04:16,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 12:04:16,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=873073.3333333334, ans=0.1 2023-10-02 12:04:18,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:04:21,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:04:21,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:04:23,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:04:25,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:04:26,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 12:04:26,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:04:28,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:04:28,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=873140.0, ans=0.125 2023-10-02 12:04:30,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:04:31,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 12:04:34,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:04:40,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:04:40,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:04:43,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:04:47,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:04:49,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:04:49,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:04:51,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:04:53,917 INFO [train.py:1046] (3/4) Epoch 25, batch 3500, loss[loss=0.1697, simple_loss=0.2586, pruned_loss=0.04041, over 24674.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2463, pruned_loss=0.0465, over 4691384.79 frames. ], batch size: 73, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:04:53,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:04:57,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:04:58,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 12:05:01,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:05:04,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:05:07,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:05:07,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 12:05:11,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:05:12,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:05:14,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:05:14,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:05:14,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:05:14,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:14,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:05:16,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 12:05:19,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:19,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:05:21,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:05:24,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:25,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 12:05:25,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:05:25,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=873406.6666666666, ans=0.125 2023-10-02 12:05:30,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:05:30,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:05:31,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:34,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:05:34,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:05:36,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 12:05:36,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 12:05:38,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 12:05:38,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:05:39,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:39,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:05:41,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:05:42,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 12:05:43,824 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.790e+02 1.964e+02 2.126e+02 3.238e+02, threshold=3.929e+02, percent-clipped=0.0 2023-10-02 12:05:43,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:05:50,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:05:50,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=873473.3333333334, ans=0.125 2023-10-02 12:05:51,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 12:05:51,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 12:05:51,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:05:53,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:05:53,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:05:54,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:05:58,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 12:05:58,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:05:59,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:06:00,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 12:06:02,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 12:06:03,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.34 vs. limit=15.0 2023-10-02 12:06:03,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:06:05,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:06:06,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:06,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:07,625 INFO [train.py:1046] (3/4) Epoch 25, batch 3550, loss[loss=0.172, simple_loss=0.2422, pruned_loss=0.05094, over 23369.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2449, pruned_loss=0.0462, over 4682430.85 frames. ], batch size: 285, lr: 4.09e-03, grad_scale: 8.0 2023-10-02 12:06:10,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:06:14,065 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:06:15,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=873606.6666666666, ans=0.2 2023-10-02 12:06:17,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.68 vs. limit=15.0 2023-10-02 12:06:17,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:19,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 12:06:23,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:06:23,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:06:24,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:25,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:06:25,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:06:27,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:06:28,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:06:28,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:28,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 12:06:30,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:06:34,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:06:34,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:06:37,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:06:37,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:06:37,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:06:37,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 12:06:37,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:39,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:06:41,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 12:06:46,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:06:48,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:06:49,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:06:50,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 12:06:52,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:06:54,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 12:06:54,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:06:55,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:06:55,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:06:58,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 12:07:00,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:07:04,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:07:05,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 12:07:06,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:07:11,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:07:11,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 12:07:12,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=873873.3333333334, ans=0.0 2023-10-02 12:07:18,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 12:07:18,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:07:18,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:07:21,701 INFO [train.py:1046] (3/4) Epoch 25, batch 3600, loss[loss=0.1593, simple_loss=0.2482, pruned_loss=0.0352, over 24603.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2453, pruned_loss=0.04651, over 4682788.84 frames. ], batch size: 68, lr: 4.09e-03, grad_scale: 16.0 2023-10-02 12:07:21,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:07:21,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:07:23,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:07:28,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:07:31,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:31,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:07:31,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:07:32,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:32,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 12:07:36,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:07:36,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:39,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:07:42,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:07:43,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:07:43,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:07:43,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 12:07:43,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:07:45,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=874006.6666666666, ans=0.125 2023-10-02 12:07:46,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:07:47,568 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.36 vs. limit=22.5 2023-10-02 12:07:48,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:07:48,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=874006.6666666666, ans=0.125 2023-10-02 12:07:50,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:07:52,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:07:52,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:07:54,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 12:08:02,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:08:03,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:08:03,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 12:08:07,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:08:11,621 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.785e+02 1.913e+02 2.150e+02 3.207e+02, threshold=3.826e+02, percent-clipped=0.0 2023-10-02 12:08:11,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:08:13,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:08:20,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:08:20,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:08:20,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 12:08:22,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 12:08:23,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 12:08:27,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:08:27,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:08:27,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 12:08:27,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=874206.6666666666, ans=0.0 2023-10-02 12:08:29,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:08:29,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:08:29,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:08:30,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 12:08:30,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 12:08:30,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=874206.6666666666, ans=0.1 2023-10-02 12:08:34,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:08:35,768 INFO [train.py:1046] (3/4) Epoch 25, batch 3650, loss[loss=0.137, simple_loss=0.2204, pruned_loss=0.02675, over 24605.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2468, pruned_loss=0.04648, over 4688747.54 frames. ], batch size: 60, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:08:35,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 12:08:39,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 12:08:42,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:08:45,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 12:08:46,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 12:08:52,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:08:52,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:08:52,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:08:55,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 12:08:55,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:08:56,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 12:08:58,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:08:58,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:08:59,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 12:09:00,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 12:09:00,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:09:00,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:09:03,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:09:06,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 12:09:07,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 12:09:09,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:09:10,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 12:09:11,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:09:13,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:09:13,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=874406.6666666666, ans=0.1 2023-10-02 12:09:17,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:09:18,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:09:18,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:09:19,492 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.15 vs. limit=12.0 2023-10-02 12:09:20,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:09:21,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:09:24,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:09:26,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:09:28,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:09:28,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:09:29,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:09:31,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:09:31,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:09:37,774 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 12:09:39,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:09:39,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:09:40,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:09:40,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:09:40,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=874540.0, ans=0.0 2023-10-02 12:09:41,338 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.74 vs. limit=10.0 2023-10-02 12:09:42,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:09:43,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:09:44,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 12:09:44,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:09:46,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:09:49,162 INFO [train.py:1046] (3/4) Epoch 25, batch 3700, loss[loss=0.1738, simple_loss=0.2493, pruned_loss=0.04913, over 23800.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2474, pruned_loss=0.0462, over 4700199.72 frames. ], batch size: 212, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:09:49,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:09:49,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:09:52,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:09:52,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 12:09:52,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:09:52,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:09:52,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:09:55,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:09:59,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:09:59,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:10:00,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:10:01,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:10:01,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 12:10:05,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:10:05,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=874673.3333333334, ans=0.2 2023-10-02 12:10:06,560 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 12:10:14,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:10:14,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 12:10:16,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:10:16,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 12:10:16,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:10:16,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=874673.3333333334, ans=0.035 2023-10-02 12:10:19,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:20,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 12:10:22,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:24,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:10:26,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:26,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:10:29,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:10:32,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:10:32,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 12:10:34,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:10:34,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 12:10:34,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=874806.6666666666, ans=0.125 2023-10-02 12:10:38,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:10:38,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:10:40,222 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.959e+02 2.210e+02 2.685e+02 4.346e+02, threshold=4.420e+02, percent-clipped=1.0 2023-10-02 12:10:41,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:10:43,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 12:10:45,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:10:45,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 12:10:45,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:10:45,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:10:50,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:10:50,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 12:10:51,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 12:10:53,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:10:53,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:10:54,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:10:54,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:10:57,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:10:59,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:11:01,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:11:04,100 INFO [train.py:1046] (3/4) Epoch 25, batch 3750, loss[loss=0.2023, simple_loss=0.2704, pruned_loss=0.06715, over 23644.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2484, pruned_loss=0.04677, over 4704585.22 frames. ], batch size: 232, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:11:04,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 12:11:05,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 12:11:07,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:11:08,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 12:11:08,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:11:10,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:11:11,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:11:12,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:11:14,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=874940.0, ans=0.125 2023-10-02 12:11:15,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:11:18,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:11:20,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:11:21,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:11:24,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:11:24,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 12:11:25,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:11:25,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:11:27,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:11:30,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 12:11:33,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 12:11:34,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:11:34,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=875073.3333333334, ans=0.0 2023-10-02 12:11:35,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:11:38,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:11:44,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:11:44,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 12:11:49,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 12:11:52,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:11:54,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:11:56,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:11:59,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:12:05,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 12:12:06,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:12:08,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:12:09,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:12:11,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:12:14,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=875206.6666666666, ans=0.125 2023-10-02 12:12:17,201 INFO [train.py:1046] (3/4) Epoch 25, batch 3800, loss[loss=0.1643, simple_loss=0.2497, pruned_loss=0.03945, over 24328.00 frames. ], tot_loss[loss=0.1715, simple_loss=0.2489, pruned_loss=0.04709, over 4700556.68 frames. ], batch size: 74, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:12:19,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:12:23,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:12:24,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 12:12:25,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 12:12:25,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:12:29,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:12:29,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 12:12:31,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 12:12:31,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:12:33,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:12:35,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:12:35,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:12:35,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:12:36,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 12:12:39,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 12:12:41,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:12:42,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:12:43,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:12:43,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:12:47,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 12:12:47,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:12:50,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:12:51,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:12:56,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 12:12:56,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 12:12:57,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=875406.6666666666, ans=0.125 2023-10-02 12:12:58,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:13:05,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:13:05,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=875473.3333333334, ans=0.125 2023-10-02 12:13:08,023 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.870e+02 2.050e+02 2.289e+02 3.047e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-02 12:13:10,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:13:12,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 12:13:13,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 12:13:13,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:13:14,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=875473.3333333334, ans=0.2 2023-10-02 12:13:16,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:13:16,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=875540.0, ans=0.125 2023-10-02 12:13:18,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:20,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 12:13:23,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 12:13:23,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 12:13:23,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:24,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:13:28,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:13:30,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:13:31,981 INFO [train.py:1046] (3/4) Epoch 25, batch 3850, loss[loss=0.1551, simple_loss=0.2273, pruned_loss=0.04144, over 23696.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2478, pruned_loss=0.04676, over 4706236.02 frames. ], batch size: 149, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:13:34,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.95 vs. limit=15.0 2023-10-02 12:13:35,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:13:36,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 12:13:38,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:13:38,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:39,159 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.90 vs. limit=22.5 2023-10-02 12:13:41,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:13:42,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:13:45,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:13:47,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 12:13:52,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:13:52,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:13:54,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=875673.3333333334, ans=0.1 2023-10-02 12:13:55,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:13:56,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:13:59,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:13:59,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:14:01,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:02,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:14:03,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:05,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:07,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:07,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:14:08,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 12:14:08,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 12:14:09,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:14:09,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:11,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:11,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:11,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 12:14:14,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 12:14:14,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:19,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 12:14:20,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 12:14:23,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:25,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:14:29,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:29,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 12:14:32,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 12:14:32,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=875873.3333333334, ans=0.1 2023-10-02 12:14:34,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:35,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:38,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:14:38,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:14:38,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:39,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:39,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:14:39,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 12:14:41,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:14:42,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 12:14:42,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:42,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:44,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:14:45,829 INFO [train.py:1046] (3/4) Epoch 25, batch 3900, loss[loss=0.162, simple_loss=0.2393, pruned_loss=0.04239, over 23616.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2462, pruned_loss=0.04589, over 4711983.34 frames. ], batch size: 149, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:14:45,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:47,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:14:47,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:14:48,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:14:48,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:14:49,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 12:14:50,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:14:53,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:14:54,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:14:56,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:14:57,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:15:00,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:15:00,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:15:00,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:15:00,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=876006.6666666666, ans=0.125 2023-10-02 12:15:02,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 12:15:02,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:15:04,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 12:15:04,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:15:05,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 12:15:07,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 12:15:10,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:15:11,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:15:11,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:15:11,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:15:17,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:15:18,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=876073.3333333334, ans=0.0 2023-10-02 12:15:19,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:15:21,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:15:22,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:15:22,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:15:24,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=876073.3333333334, ans=0.125 2023-10-02 12:15:30,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:15:30,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:15:37,489 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.823e+02 1.992e+02 2.120e+02 3.702e+02, threshold=3.983e+02, percent-clipped=0.0 2023-10-02 12:15:37,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:15:37,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=876140.0, ans=0.125 2023-10-02 12:15:38,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:15:41,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=876140.0, ans=0.0 2023-10-02 12:15:43,171 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.84 vs. limit=15.0 2023-10-02 12:15:46,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:15:51,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:15:51,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 12:15:51,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 12:15:51,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:15:52,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 12:15:54,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:15:55,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 12:15:59,846 INFO [train.py:1046] (3/4) Epoch 25, batch 3950, loss[loss=0.1492, simple_loss=0.2307, pruned_loss=0.03387, over 24554.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2464, pruned_loss=0.04546, over 4724322.32 frames. ], batch size: 60, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:16:02,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.31 vs. limit=15.0 2023-10-02 12:16:02,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:16:04,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 12:16:04,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:16:07,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:16:08,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:16:10,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.95 vs. limit=15.0 2023-10-02 12:16:14,646 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 12:16:15,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:16:16,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 12:16:16,072 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 12:16:16,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:16:18,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:16:18,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:16:18,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:16:23,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 12:16:26,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:16:26,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:16:26,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:16:28,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:16:28,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:16:37,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:16:37,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:16:41,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 12:16:47,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 12:16:47,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 12:16:47,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:16:49,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:16:55,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:16:57,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:16:57,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:16:57,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:16:57,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 12:17:02,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:17:04,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:17:08,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 12:17:13,732 INFO [train.py:1046] (3/4) Epoch 25, batch 4000, loss[loss=0.1822, simple_loss=0.2658, pruned_loss=0.04933, over 24037.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2466, pruned_loss=0.04526, over 4719897.03 frames. ], batch size: 80, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:17:15,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=876606.6666666666, ans=0.0 2023-10-02 12:17:19,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:17:27,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:17:27,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=876673.3333333334, ans=0.125 2023-10-02 12:17:30,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:17:30,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:17:32,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:17:32,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 12:17:33,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:17:33,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 12:17:34,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:17:34,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 12:17:34,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:17:37,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:17:37,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:17:37,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:17:39,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:17:39,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 12:17:40,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:17:42,102 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 12:17:42,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:17:43,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:17:44,969 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 12:17:45,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:17:46,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:17:52,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 12:17:52,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:17:55,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:17:57,280 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 12:17:57,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:17:58,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 12:17:58,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:18:00,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:18:00,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=876806.6666666666, ans=0.0 2023-10-02 12:18:01,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:18:02,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:18:02,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:18:04,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:18:05,457 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.899e+02 2.057e+02 2.287e+02 3.112e+02, threshold=4.114e+02, percent-clipped=0.0 2023-10-02 12:18:07,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 12:18:07,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:18:09,975 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 12:18:14,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:18:16,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 12:18:19,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:18:19,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:18:19,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:18:21,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:18:21,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=876873.3333333334, ans=0.2 2023-10-02 12:18:25,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:18:26,802 INFO [train.py:1046] (3/4) Epoch 25, batch 4050, loss[loss=0.1953, simple_loss=0.2604, pruned_loss=0.06506, over 22723.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.247, pruned_loss=0.04567, over 4710829.63 frames. ], batch size: 322, lr: 4.08e-03, grad_scale: 16.0 2023-10-02 12:18:28,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:18:30,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 12:18:31,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:18:31,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:18:32,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.86 vs. limit=6.0 2023-10-02 12:18:32,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:18:34,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:18:35,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:18:40,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:18:41,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:18:43,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 12:18:45,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:18:47,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:18:51,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:18:53,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:18:54,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 12:18:57,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 12:18:57,594 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 12:19:00,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:19:08,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 12:19:08,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:19:10,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:19:14,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:19:15,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:19:15,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:19:17,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:19:20,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 12:19:20,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 12:19:21,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:19:23,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=877140.0, ans=0.125 2023-10-02 12:19:24,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 12:19:27,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:19:34,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 12:19:36,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:19:36,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:19:39,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 12:19:39,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 12:19:39,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:19:40,396 INFO [train.py:1046] (3/4) Epoch 25, batch 4100, loss[loss=0.1785, simple_loss=0.2615, pruned_loss=0.0477, over 24369.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2478, pruned_loss=0.04625, over 4712092.48 frames. ], batch size: 77, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:19:40,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:19:42,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:19:42,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:19:48,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 12:19:50,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 12:19:51,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 12:19:54,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 12:19:54,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:19:55,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:19:55,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:19:55,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:19:55,689 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 12:19:58,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:19:59,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:19:59,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:19:59,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:20:02,317 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.38 vs. limit=6.0 2023-10-02 12:20:04,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:20:05,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:20:05,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:20:05,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 12:20:07,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:20:07,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:20:07,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:20:07,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:20:09,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 12:20:13,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:20:14,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 12:20:16,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:20:17,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:20:17,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 12:20:19,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:20:19,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:20:20,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:20:22,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 12:20:23,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:20:23,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:20:25,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 12:20:26,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:20:26,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:20:30,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:20:33,241 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.821e+02 2.033e+02 2.340e+02 3.969e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-02 12:20:34,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.18 vs. limit=15.0 2023-10-02 12:20:36,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:20:36,416 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:20:37,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=877540.0, ans=0.125 2023-10-02 12:20:38,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:20:38,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:20:41,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=877540.0, ans=0.125 2023-10-02 12:20:45,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:20:45,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:20:50,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:20:53,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:20:54,540 INFO [train.py:1046] (3/4) Epoch 25, batch 4150, loss[loss=0.1655, simple_loss=0.2439, pruned_loss=0.0435, over 24321.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2477, pruned_loss=0.04647, over 4715659.53 frames. ], batch size: 61, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:20:57,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:20:57,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:20:58,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:20:58,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:21:02,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 12:21:02,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:21:03,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 12:21:04,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 12:21:04,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 12:21:04,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=877606.6666666666, ans=0.125 2023-10-02 12:21:06,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:21:10,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:21:10,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:21:14,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:21:14,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:21:15,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:21:17,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=877673.3333333334, ans=0.09899494936611666 2023-10-02 12:21:18,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:21:18,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:21:19,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:21:24,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:21:24,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=877740.0, ans=0.1 2023-10-02 12:21:25,253 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.26 vs. limit=15.0 2023-10-02 12:21:28,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:21:28,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 12:21:29,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 12:21:29,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:21:31,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 12:21:31,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:21:31,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:21:34,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:21:35,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:21:38,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 12:21:42,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:21:43,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:21:43,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 12:21:45,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:21:45,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 12:21:48,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:21:50,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:21:51,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:21:52,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 12:21:52,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:21:52,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 12:21:53,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:21:55,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 12:21:55,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:21:55,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:21:55,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:21:56,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 12:21:57,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:21:57,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 12:21:59,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:22:00,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:22:01,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 12:22:01,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 12:22:03,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=877873.3333333334, ans=0.0 2023-10-02 12:22:06,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:22:08,737 INFO [train.py:1046] (3/4) Epoch 25, batch 4200, loss[loss=0.1494, simple_loss=0.1988, pruned_loss=0.05005, over 19454.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2454, pruned_loss=0.04637, over 4707018.35 frames. ], batch size: 388, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:22:08,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 12:22:10,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:22:12,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:22:13,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:22:15,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:22:15,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:22:18,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 12:22:21,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 12:22:22,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:24,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:22:28,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:22:30,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:22:32,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:22:34,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:34,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 12:22:34,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:22:35,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:35,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:22:37,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:22:38,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:22:40,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 12:22:40,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:22:40,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=878073.3333333334, ans=0.0 2023-10-02 12:22:44,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 12:22:44,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:22:46,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:22:49,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:22:51,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:22:51,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 12:22:51,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=878073.3333333334, ans=0.1 2023-10-02 12:22:52,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:22:52,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:22:52,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=878140.0, ans=0.125 2023-10-02 12:22:57,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:22:59,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:22:59,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=878140.0, ans=0.125 2023-10-02 12:23:01,934 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.878e+02 2.060e+02 2.306e+02 3.122e+02, threshold=4.120e+02, percent-clipped=0.0 2023-10-02 12:23:05,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:23:07,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 12:23:09,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:23:15,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 12:23:15,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:23:18,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 12:23:23,000 INFO [train.py:1046] (3/4) Epoch 25, batch 4250, loss[loss=0.177, simple_loss=0.2539, pruned_loss=0.04999, over 23563.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2451, pruned_loss=0.04591, over 4720865.50 frames. ], batch size: 94, lr: 4.08e-03, grad_scale: 8.0 2023-10-02 12:23:23,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:23:27,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:23:27,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:23:30,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:23:34,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:23:34,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 12:23:34,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:23:36,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:23:36,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=878340.0, ans=0.125 2023-10-02 12:23:39,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:23:41,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.79 vs. limit=15.0 2023-10-02 12:23:43,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:43,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:23:45,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:23:45,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:23:47,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:47,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:23:48,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:51,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:23:52,256 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.41 vs. limit=15.0 2023-10-02 12:23:53,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:23:54,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 12:23:57,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 12:23:58,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:23:58,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:23:58,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:23:58,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:23:58,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:24:00,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:24:04,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 12:24:04,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:24:08,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:24:10,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:24:11,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 12:24:11,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:24:13,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 12:24:13,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:24:15,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:24:17,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:24:17,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:24:21,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 12:24:23,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:24:24,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:24:28,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:24:31,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:24:32,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:24:34,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:24:35,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:24:37,124 INFO [train.py:1046] (3/4) Epoch 25, batch 4300, loss[loss=0.1535, simple_loss=0.2364, pruned_loss=0.03525, over 24313.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2441, pruned_loss=0.04528, over 4717608.88 frames. ], batch size: 61, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:24:37,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:24:38,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:24:38,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 12:24:38,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=878606.6666666666, ans=0.0 2023-10-02 12:24:40,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:24:40,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=878606.6666666666, ans=0.125 2023-10-02 12:24:45,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:24:46,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:24:48,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:24:57,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:24:57,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 12:24:58,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:24:59,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:25:01,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:25:01,276 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 12:25:03,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:25:04,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:25:07,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 12:25:07,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:25:07,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 12:25:10,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 12:25:11,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:25:14,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:25:14,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:25:14,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:25:16,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:25:16,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:25:17,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 12:25:17,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 12:25:19,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:25:22,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:22,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 12:25:22,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:23,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:25:23,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 12:25:23,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 12:25:25,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 12:25:25,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:25:27,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 12:25:27,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 12:25:31,176 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.823e+02 1.995e+02 2.293e+02 3.439e+02, threshold=3.990e+02, percent-clipped=0.0 2023-10-02 12:25:31,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:25:33,198 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 12:25:34,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:25:36,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:25:36,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:25:38,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 12:25:38,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:25:38,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:39,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:25:40,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:25:40,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:25:42,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:25:46,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:25:47,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:25:47,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:25:51,631 INFO [train.py:1046] (3/4) Epoch 25, batch 4350, loss[loss=0.1745, simple_loss=0.2596, pruned_loss=0.04466, over 24640.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2452, pruned_loss=0.04513, over 4737592.04 frames. ], batch size: 68, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:25:53,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 12:25:53,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:25:53,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=878940.0, ans=0.0 2023-10-02 12:25:57,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:26:01,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:26:04,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:26:04,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:26:07,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:26:11,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:26:11,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=879006.6666666666, ans=0.1 2023-10-02 12:26:14,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:26:14,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:26:17,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:26:19,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:26:20,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:26:26,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 12:26:27,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:26:29,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:31,887 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.11 vs. limit=22.5 2023-10-02 12:26:33,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:35,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 12:26:38,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:26:39,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:26:42,899 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 12:26:43,094 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:26:43,583 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.94 vs. limit=6.0 2023-10-02 12:26:45,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:26:46,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:26:46,948 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 12:26:48,323 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 12:26:48,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:26:48,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:26:48,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:26:50,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:26:51,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:26:51,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:26:54,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 12:26:54,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:54,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:26:54,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:26:54,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 12:26:55,973 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 12:26:55,977 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 12:26:55,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 12:26:58,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:26:58,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=879206.6666666666, ans=0.125 2023-10-02 12:26:59,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:26:59,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:00,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:27:02,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 12:27:05,403 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 12:27:05,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:06,766 INFO [train.py:1046] (3/4) Epoch 25, batch 4400, loss[loss=0.1649, simple_loss=0.24, pruned_loss=0.04493, over 23461.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2468, pruned_loss=0.04577, over 4723246.08 frames. ], batch size: 93, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:27:09,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:27:09,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:11,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:27:13,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 12:27:13,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 12:27:13,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=879273.3333333334, ans=0.1 2023-10-02 12:27:14,524 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.04 vs. limit=22.5 2023-10-02 12:27:15,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 12:27:15,114 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 12:27:15,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:27:15,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:27:17,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 12:27:18,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:19,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:20,720 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 12:27:21,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=879340.0, ans=0.125 2023-10-02 12:27:22,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:22,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 12:27:22,684 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 12:27:25,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 12:27:26,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 12:27:26,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 12:27:28,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:30,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:27:30,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:27:31,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:27:33,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 12:27:34,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 12:27:34,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:37,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:27:37,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:27:37,715 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.79 vs. limit=22.5 2023-10-02 12:27:38,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:39,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:27:39,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 12:27:41,262 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 12:27:43,264 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.08 vs. limit=6.0 2023-10-02 12:27:44,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:27:51,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:27:52,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 12:27:54,859 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:27:56,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:27:57,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:27:57,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=879473.3333333334, ans=0.2 2023-10-02 12:28:00,647 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.859e+02 2.058e+02 2.291e+02 3.192e+02, threshold=4.115e+02, percent-clipped=0.0 2023-10-02 12:28:02,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:28:04,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 12:28:04,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:28:04,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:28:04,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:28:04,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:28:06,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=879540.0, ans=0.125 2023-10-02 12:28:08,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 12:28:11,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 12:28:13,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 12:28:13,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:28:13,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 12:28:13,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:28:15,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:28:18,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 12:28:19,789 INFO [train.py:1046] (3/4) Epoch 25, batch 4450, loss[loss=0.1556, simple_loss=0.2323, pruned_loss=0.03947, over 24621.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2469, pruned_loss=0.04565, over 4733215.10 frames. ], batch size: 60, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:28:22,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:28:26,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:28:26,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:28:29,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=879606.6666666666, ans=0.2 2023-10-02 12:28:33,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=879673.3333333334, ans=0.025 2023-10-02 12:28:34,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:28:34,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:28:38,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:28:40,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:28:42,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:28:42,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:28:42,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 12:28:42,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:28:44,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:28:44,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:28:44,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:28:47,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:28:51,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:28:51,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:28:52,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:28:54,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:28:56,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:29:01,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 12:29:02,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 12:29:02,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 12:29:02,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:29:02,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=879806.6666666666, ans=0.125 2023-10-02 12:29:04,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=879806.6666666666, ans=0.0 2023-10-02 12:29:06,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:29:06,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 12:29:09,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:29:12,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:29:13,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 12:29:13,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:29:13,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:29:14,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:29:14,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:29:16,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:29:19,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:29:19,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 12:29:20,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:29:23,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:29:25,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:29:25,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:29:25,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 12:29:28,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:29:30,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 12:29:33,412 INFO [train.py:1046] (3/4) Epoch 25, batch 4500, loss[loss=0.1844, simple_loss=0.2661, pruned_loss=0.05134, over 24079.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2471, pruned_loss=0.0455, over 4726315.50 frames. ], batch size: 80, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:29:33,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:29:36,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:29:39,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 12:29:39,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 12:29:40,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:29:44,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:29:45,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:29:49,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:29:50,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:29:50,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:29:50,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:29:51,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=880006.6666666666, ans=0.125 2023-10-02 12:29:59,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=880006.6666666666, ans=0.5 2023-10-02 12:30:00,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:30:01,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:30:02,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:30:04,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:30:06,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:30:12,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:30:16,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:30:21,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:30:22,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:30:22,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 12:30:24,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:30:24,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=880140.0, ans=0.0 2023-10-02 12:30:25,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:30:27,815 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.16 vs. limit=10.0 2023-10-02 12:30:28,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:30:28,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:30:30,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:30:30,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=880140.0, ans=0.1 2023-10-02 12:30:31,328 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.866e+02 2.072e+02 2.376e+02 3.335e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-02 12:30:31,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 12:30:31,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:30:31,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:30:36,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:30:36,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:30:37,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:30:40,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:30:42,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:30:42,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 12:30:45,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 12:30:45,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 12:30:48,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 12:30:50,987 INFO [train.py:1046] (3/4) Epoch 25, batch 4550, loss[loss=0.1716, simple_loss=0.2577, pruned_loss=0.0427, over 24642.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2466, pruned_loss=0.04521, over 4730196.79 frames. ], batch size: 73, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:30:51,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 12:30:52,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:30:52,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=880273.3333333334, ans=0.1 2023-10-02 12:30:55,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:30:56,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:30:59,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:31:03,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:31:05,835 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.88 vs. limit=22.5 2023-10-02 12:31:06,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:31:06,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:31:06,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:31:06,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:10,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:31:11,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:31:15,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:31:16,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 12:31:18,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 12:31:18,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:31:19,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 12:31:21,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=880406.6666666666, ans=0.125 2023-10-02 12:31:22,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 12:31:22,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:31:24,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=880406.6666666666, ans=0.0 2023-10-02 12:31:27,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 12:31:29,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:31:31,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:32,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:32,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:31:33,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=880406.6666666666, ans=0.125 2023-10-02 12:31:34,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 12:31:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:31:38,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:38,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:31:40,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:31:42,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 12:31:43,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 12:31:43,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:31:45,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 12:31:48,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 12:31:48,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:31:49,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:31:49,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:31:51,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:31:51,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:31:51,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=880540.0, ans=0.0 2023-10-02 12:31:53,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:31:53,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 12:31:54,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=880540.0, ans=0.125 2023-10-02 12:31:56,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:31:56,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 12:31:58,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 12:31:58,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:31:58,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 12:31:59,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:32:01,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:32:02,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:32:02,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:32:04,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:32:05,475 INFO [train.py:1046] (3/4) Epoch 25, batch 4600, loss[loss=0.1706, simple_loss=0.2589, pruned_loss=0.04116, over 23983.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2455, pruned_loss=0.04507, over 4724187.31 frames. ], batch size: 86, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:32:05,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:32:06,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:32:08,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:09,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:32:10,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=880606.6666666666, ans=10.0 2023-10-02 12:32:11,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:32:11,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:32:13,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:32:15,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 12:32:16,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:32:19,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:32:21,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:32:23,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:25,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=880673.3333333334, ans=0.0 2023-10-02 12:32:29,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 12:32:31,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:32,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:37,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:32:37,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:32:38,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=880740.0, ans=0.1 2023-10-02 12:32:40,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 12:32:40,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:32:41,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:32:46,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:32:47,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:32:49,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:32:51,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=880806.6666666666, ans=0.1 2023-10-02 12:32:52,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 12:32:53,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 12:32:56,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=880806.6666666666, ans=0.09899494936611666 2023-10-02 12:32:57,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:32:58,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:33:00,730 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.837e+02 2.008e+02 2.202e+02 3.381e+02, threshold=4.017e+02, percent-clipped=0.0 2023-10-02 12:33:00,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:00,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 12:33:00,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:33:01,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=880806.6666666666, ans=0.1 2023-10-02 12:33:02,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 12:33:02,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:02,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:33:03,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:03,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:33:05,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:33:05,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 12:33:06,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 12:33:06,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 12:33:06,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:33:08,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:33:08,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:33:09,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:33:19,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:33:20,355 INFO [train.py:1046] (3/4) Epoch 25, batch 4650, loss[loss=0.1616, simple_loss=0.2491, pruned_loss=0.03703, over 24318.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2451, pruned_loss=0.04482, over 4716958.71 frames. ], batch size: 74, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:33:21,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:33:23,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:33:23,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:33:23,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:33:23,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:33:24,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:33:27,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 12:33:27,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=880940.0, ans=0.125 2023-10-02 12:33:30,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:33:31,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 12:33:31,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:33:33,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 12:33:33,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:33:33,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 12:33:35,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 12:33:35,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:35,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:33:37,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:33:39,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:33:39,467 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 12:33:40,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:33:42,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 12:33:47,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:33:47,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:33:48,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 12:33:48,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:33:52,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:33:57,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:03,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:34:03,801 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.52 vs. limit=10.0 2023-10-02 12:34:05,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:34:06,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:34:06,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:34:07,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 12:34:09,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 12:34:10,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 12:34:10,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 12:34:11,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:34:16,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:34:16,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:34:16,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 12:34:16,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:18,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:34:18,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:34:19,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:34:22,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:34:22,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:34:22,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=881206.6666666666, ans=0.0 2023-10-02 12:34:23,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:34:25,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=881206.6666666666, ans=0.0 2023-10-02 12:34:26,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:34:28,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:34:28,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:34:29,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 12:34:31,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:34:32,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 12:34:33,836 INFO [train.py:1046] (3/4) Epoch 25, batch 4700, loss[loss=0.1525, simple_loss=0.2367, pruned_loss=0.03413, over 24296.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2459, pruned_loss=0.0453, over 4719213.94 frames. ], batch size: 61, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:34:39,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:39,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:34:40,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:34:41,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:34:42,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 12:34:46,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 12:34:46,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 12:34:48,601 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.17 vs. limit=15.0 2023-10-02 12:34:49,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:50,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:34:50,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:34:54,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:34:59,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:35:00,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 12:35:03,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:35:05,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=881406.6666666666, ans=0.125 2023-10-02 12:35:09,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 12:35:11,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:35:13,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:17,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 12:35:18,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:35:22,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:35:24,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 12:35:27,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:27,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:35:27,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=881473.3333333334, ans=0.0 2023-10-02 12:35:28,466 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.857e+02 2.002e+02 2.228e+02 2.822e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-02 12:35:30,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:35:30,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:35:30,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 12:35:32,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 12:35:33,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:35:34,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:34,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:34,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 12:35:34,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:35:37,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 12:35:42,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:35:43,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:35:46,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:35:48,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=881606.6666666666, ans=15.0 2023-10-02 12:35:48,463 INFO [train.py:1046] (3/4) Epoch 25, batch 4750, loss[loss=0.1763, simple_loss=0.2617, pruned_loss=0.04542, over 24389.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2471, pruned_loss=0.04617, over 4714456.34 frames. ], batch size: 77, lr: 4.07e-03, grad_scale: 8.0 2023-10-02 12:35:48,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:35:50,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 12:35:51,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:35:54,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 12:35:57,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:35:57,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:35:58,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:36:04,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 12:36:08,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:36:10,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 12:36:11,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:36:15,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:36:15,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:36:15,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:36:17,301 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 12:36:17,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 12:36:22,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 12:36:22,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=881740.0, ans=0.0 2023-10-02 12:36:22,700 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.48 vs. limit=15.0 2023-10-02 12:36:25,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:36:26,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:36:29,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:36:29,307 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 12:36:29,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:36:30,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:36:35,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:36:38,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 12:36:38,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 12:36:39,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:36:39,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:36:39,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:36:40,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:36:42,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 12:36:43,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 12:36:43,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=881806.6666666666, ans=0.0 2023-10-02 12:36:46,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:36:49,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:36:49,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 12:36:51,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:36:52,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:36:53,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:36:55,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:36:55,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 12:36:57,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=881873.3333333334, ans=0.1 2023-10-02 12:36:59,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:36:59,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 12:37:01,148 INFO [train.py:1046] (3/4) Epoch 25, batch 4800, loss[loss=0.1618, simple_loss=0.2478, pruned_loss=0.0379, over 24632.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2484, pruned_loss=0.0465, over 4721509.94 frames. ], batch size: 73, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:37:01,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 12:37:01,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=881940.0, ans=0.0 2023-10-02 12:37:02,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 12:37:03,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:37:04,983 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.72 vs. limit=15.0 2023-10-02 12:37:05,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:37:05,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 12:37:05,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=881940.0, ans=0.125 2023-10-02 12:37:11,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:11,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:16,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:37:18,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:37:18,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:18,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 12:37:19,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:37:19,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:37:21,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:37:25,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:37:27,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:37:27,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:37:29,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:37:29,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 12:37:29,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:31,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:37:32,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:37:35,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:36,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:37:36,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:37:36,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 12:37:38,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:40,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=882073.3333333334, ans=0.07 2023-10-02 12:37:41,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 12:37:41,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 12:37:42,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:37:42,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:37:44,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:37:44,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:37:44,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:37:44,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:37:44,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=882140.0, ans=0.1 2023-10-02 12:37:45,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:37:50,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:37:53,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:37:54,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:37:55,797 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.900e+02 2.147e+02 2.539e+02 4.141e+02, threshold=4.294e+02, percent-clipped=1.0 2023-10-02 12:37:56,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=882140.0, ans=0.0 2023-10-02 12:37:59,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 12:37:59,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:38:00,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:00,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:38:00,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:38:05,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:38:06,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:38:06,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:07,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:38:08,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:38:09,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:38:12,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:38:12,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:12,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:38:13,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=882206.6666666666, ans=0.0 2023-10-02 12:38:14,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 12:38:14,883 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.05 vs. limit=15.0 2023-10-02 12:38:15,491 INFO [train.py:1046] (3/4) Epoch 25, batch 4850, loss[loss=0.1644, simple_loss=0.2504, pruned_loss=0.0392, over 24664.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2485, pruned_loss=0.04664, over 4732157.70 frames. ], batch size: 68, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:38:15,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 12:38:15,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:38:15,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:38:16,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:38:16,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:19,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:38:27,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 12:38:27,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:38:34,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:38:34,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:38:34,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:38:37,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:38:38,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:38:38,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:38:38,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 12:38:42,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:38:44,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:38:45,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 12:38:46,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:38:46,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 12:38:48,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:38:49,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:38:53,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:38:53,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 12:38:53,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 12:38:53,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=882406.6666666666, ans=0.125 2023-10-02 12:38:55,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:38:56,792 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.52 vs. limit=15.0 2023-10-02 12:38:59,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=882473.3333333334, ans=0.125 2023-10-02 12:39:02,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:39:02,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 12:39:03,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:39:03,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:39:05,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:39:06,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 12:39:06,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:39:08,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 12:39:08,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:39:09,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:39:09,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 12:39:18,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:39:24,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:39:24,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:39:28,609 INFO [train.py:1046] (3/4) Epoch 25, batch 4900, loss[loss=0.1505, simple_loss=0.2184, pruned_loss=0.04132, over 23585.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2474, pruned_loss=0.0461, over 4735082.62 frames. ], batch size: 256, lr: 4.07e-03, grad_scale: 16.0 2023-10-02 12:39:28,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 12:39:28,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:39:34,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:39:36,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:39:36,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:39:38,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 12:39:43,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 12:39:47,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 12:39:47,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 12:39:49,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:39:49,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:39:49,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:39:49,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:39:49,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:39:51,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 12:39:53,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 12:39:55,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:39:56,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:39:57,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:40:01,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:40:01,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:04,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:04,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 12:40:04,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:40:05,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:40:05,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 12:40:05,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 12:40:10,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 12:40:11,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:40:12,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:40:14,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:40:14,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:14,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 12:40:14,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:40:14,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 12:40:17,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:18,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 12:40:22,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:40:23,576 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.865e+02 2.106e+02 2.440e+02 3.328e+02, threshold=4.212e+02, percent-clipped=0.0 2023-10-02 12:40:25,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 12:40:26,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:40:27,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 12:40:27,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 12:40:28,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=882873.3333333334, ans=0.0 2023-10-02 12:40:29,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.12 vs. limit=12.0 2023-10-02 12:40:32,331 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 12:40:33,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:40:35,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:40:37,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 12:40:37,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:40:37,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:40:38,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:42,659 INFO [train.py:1046] (3/4) Epoch 25, batch 4950, loss[loss=0.1794, simple_loss=0.2471, pruned_loss=0.05589, over 23714.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2457, pruned_loss=0.04593, over 4729559.32 frames. ], batch size: 164, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:40:42,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:40:42,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:40:44,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:40:44,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 12:40:45,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:40:47,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:40:47,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 12:40:50,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 12:40:50,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 12:40:50,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:40:53,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 12:40:53,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:53,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:40:53,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:40:53,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:40:55,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:40:55,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:40:55,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=882940.0, ans=0.125 2023-10-02 12:40:57,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:40:59,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:40:59,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:40:59,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:41:02,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 12:41:07,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:41:09,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:41:11,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:41:11,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:12,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=883073.3333333334, ans=0.2 2023-10-02 12:41:13,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:41:15,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 12:41:16,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 12:41:18,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:20,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:41:20,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:41:21,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:41:22,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:41:22,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 12:41:24,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:41:25,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:41:27,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=883140.0, ans=0.0 2023-10-02 12:41:27,508 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.12 vs. limit=10.0 2023-10-02 12:41:28,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:41:29,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:41:29,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:31,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 12:41:31,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:41:31,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=883140.0, ans=0.125 2023-10-02 12:41:33,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:41:34,574 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.18 vs. limit=22.5 2023-10-02 12:41:37,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:41:39,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:41:39,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:41:39,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:41:40,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:41:41,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:41:43,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=883206.6666666666, ans=0.0 2023-10-02 12:41:45,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:41:45,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:41:45,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:41:47,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 12:41:52,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:41:56,725 INFO [train.py:1046] (3/4) Epoch 25, batch 5000, loss[loss=0.1612, simple_loss=0.24, pruned_loss=0.04119, over 23536.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2452, pruned_loss=0.04513, over 4741865.87 frames. ], batch size: 134, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:41:57,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=883273.3333333334, ans=15.0 2023-10-02 12:41:58,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 12:41:58,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:42:03,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:42:03,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:42:04,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 12:42:06,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 12:42:06,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:42:10,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 12:42:10,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:42:10,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:42:11,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 12:42:11,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:42:11,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:42:12,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 12:42:12,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:42:14,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:42:15,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 12:42:17,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 12:42:17,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:42:18,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 12:42:18,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:42:18,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:20,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:42:20,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 12:42:20,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 12:42:20,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 12:42:21,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:42:21,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:24,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 12:42:24,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:42:26,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:26,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:42:28,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 12:42:30,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 12:42:30,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:42:31,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:42:36,314 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 12:42:39,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:42:39,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:42:39,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:42:43,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 12:42:43,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:42:43,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:42:44,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:42:44,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=883473.3333333334, ans=0.125 2023-10-02 12:42:47,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 12:42:47,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:42:50,603 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.815e+02 1.960e+02 2.170e+02 3.682e+02, threshold=3.920e+02, percent-clipped=0.0 2023-10-02 12:42:50,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:42:50,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:42:56,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 12:43:02,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:05,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=883540.0, ans=0.125 2023-10-02 12:43:09,569 INFO [train.py:1046] (3/4) Epoch 25, batch 5050, loss[loss=0.1861, simple_loss=0.2708, pruned_loss=0.05075, over 23918.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2453, pruned_loss=0.04512, over 4738767.47 frames. ], batch size: 80, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:43:11,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:43:11,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:11,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:43:13,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:43:13,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:43:14,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:43:14,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:15,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=883606.6666666666, ans=0.07 2023-10-02 12:43:15,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=883606.6666666666, ans=0.125 2023-10-02 12:43:18,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:43:18,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 12:43:18,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:43:21,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:43:23,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:43:23,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 12:43:23,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=883673.3333333334, ans=0.125 2023-10-02 12:43:23,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.11 vs. limit=22.5 2023-10-02 12:43:24,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:43:24,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:43:27,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:43:29,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:43:29,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:43:30,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=883673.3333333334, ans=0.2 2023-10-02 12:43:39,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 12:43:39,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 12:43:40,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:43:40,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 12:43:42,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:43:43,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:43:43,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:43:45,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:43:45,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 12:43:45,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 12:43:46,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:43:49,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:43:51,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:43:51,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 12:43:53,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:43:55,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 12:43:55,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=883806.6666666666, ans=0.0 2023-10-02 12:43:56,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:43:57,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:43:58,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:43:58,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:44:00,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=883806.6666666666, ans=0.0 2023-10-02 12:44:01,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:44:02,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:44:04,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:04,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:44:04,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:44:04,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 12:44:05,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:44:06,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:44:10,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:44:11,627 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 12:44:11,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:44:13,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:44:13,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=883873.3333333334, ans=0.125 2023-10-02 12:44:14,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:15,004 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 12:44:15,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=883873.3333333334, ans=0.0 2023-10-02 12:44:17,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:44:17,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 12:44:17,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:22,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:44:22,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:22,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 12:44:22,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=883940.0, ans=0.07 2023-10-02 12:44:23,702 INFO [train.py:1046] (3/4) Epoch 25, batch 5100, loss[loss=0.1585, simple_loss=0.243, pruned_loss=0.03702, over 24483.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2461, pruned_loss=0.04527, over 4743798.05 frames. ], batch size: 66, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:44:23,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 12:44:24,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.60 vs. limit=6.0 2023-10-02 12:44:25,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:44:26,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:44:26,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:44:29,498 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 12:44:31,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:44:32,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=883940.0, ans=0.125 2023-10-02 12:44:34,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 12:44:35,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 12:44:35,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:44:37,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:44:38,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:44:39,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 12:44:41,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 12:44:45,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:44:45,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:44:49,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:44:51,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 12:44:53,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:44:56,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:44:56,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 12:44:59,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:44:59,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:44:59,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 12:45:01,295 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 12:45:02,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:45:02,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 12:45:02,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 12:45:05,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:45:07,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=884140.0, ans=0.95 2023-10-02 12:45:12,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:45:14,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 12:45:15,017 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 12:45:15,030 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 12:45:17,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 12:45:17,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:45:19,064 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.925e+02 2.148e+02 2.500e+02 3.994e+02, threshold=4.296e+02, percent-clipped=1.0 2023-10-02 12:45:19,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 12:45:21,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=884206.6666666666, ans=0.0 2023-10-02 12:45:22,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=884206.6666666666, ans=0.125 2023-10-02 12:45:23,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 12:45:23,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=884206.6666666666, ans=0.1 2023-10-02 12:45:26,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 12:45:27,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 12:45:30,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 12:45:32,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 12:45:33,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 12:45:37,766 INFO [train.py:1046] (3/4) Epoch 25, batch 5150, loss[loss=0.1715, simple_loss=0.2567, pruned_loss=0.04316, over 23777.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2473, pruned_loss=0.04606, over 4733300.06 frames. ], batch size: 85, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:45:37,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:45:37,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:45:37,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:45:39,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:45:39,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 12:45:39,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:45:40,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 12:45:40,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 12:45:42,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 12:45:42,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:45:43,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 12:45:45,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:45:46,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 12:45:48,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:45:48,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:45:52,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=884340.0, ans=0.125 2023-10-02 12:45:53,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:45:53,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 12:45:53,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:45:55,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:45:58,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 12:45:58,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:45:58,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:45:58,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:45:58,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:45:59,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 12:46:01,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:46:01,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:46:03,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=884340.0, ans=0.125 2023-10-02 12:46:04,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 12:46:07,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 12:46:08,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:46:11,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:46:15,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 12:46:18,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:46:24,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:46:25,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:46:29,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:46:29,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:46:30,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 12:46:32,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=884473.3333333334, ans=0.0 2023-10-02 12:46:35,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:46:36,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 12:46:36,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 12:46:40,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:46:41,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=884540.0, ans=0.2 2023-10-02 12:46:42,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:46:43,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 12:46:45,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:46:46,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=884540.0, ans=0.125 2023-10-02 12:46:48,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:46:51,468 INFO [train.py:1046] (3/4) Epoch 25, batch 5200, loss[loss=0.1742, simple_loss=0.2408, pruned_loss=0.05376, over 23826.00 frames. ], tot_loss[loss=0.1704, simple_loss=0.2484, pruned_loss=0.04621, over 4738473.35 frames. ], batch size: 195, lr: 4.06e-03, grad_scale: 32.0 2023-10-02 12:46:51,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:46:51,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:46:52,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:46:52,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:46:52,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:46:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:46:57,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:46:58,254 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.80 vs. limit=15.0 2023-10-02 12:46:59,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:47:00,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:03,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 12:47:03,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=884606.6666666666, ans=0.125 2023-10-02 12:47:05,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:47:05,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:47:08,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:09,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:47:09,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:47:11,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 12:47:12,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:47:14,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:47:17,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 12:47:20,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:47:21,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:47:22,325 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.46 vs. limit=15.0 2023-10-02 12:47:23,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 12:47:23,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 12:47:24,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 12:47:26,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:47:26,070 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 12:47:26,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:47:29,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:47:29,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:47:29,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 12:47:30,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:47:31,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:34,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 12:47:34,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 12:47:34,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 12:47:40,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 12:47:40,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:47:43,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=884806.6666666666, ans=0.125 2023-10-02 12:47:44,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:47:46,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:47:48,187 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.453e+02 1.909e+02 2.152e+02 2.481e+02 3.751e+02, threshold=4.304e+02, percent-clipped=0.0 2023-10-02 12:47:48,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 12:47:48,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:47:48,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 12:47:48,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:47:50,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:47:51,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:47:53,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:47:55,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:47:57,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:47:57,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:47:57,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=884873.3333333334, ans=0.1 2023-10-02 12:48:00,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:48:02,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 12:48:02,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=884873.3333333334, ans=0.0 2023-10-02 12:48:03,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:48:04,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:48:06,305 INFO [train.py:1046] (3/4) Epoch 25, batch 5250, loss[loss=0.1648, simple_loss=0.2423, pruned_loss=0.04365, over 23418.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2466, pruned_loss=0.04556, over 4727332.55 frames. ], batch size: 105, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:48:06,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:48:06,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 12:48:07,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:48:09,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:48:13,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:48:13,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:48:13,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=884940.0, ans=0.2 2023-10-02 12:48:13,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=884940.0, ans=0.125 2023-10-02 12:48:14,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:48:19,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:48:22,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:48:23,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:48:24,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:48:26,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 12:48:26,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:48:26,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:48:44,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=885073.3333333334, ans=0.0 2023-10-02 12:48:58,123 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.79 vs. limit=15.0 2023-10-02 12:49:14,852 INFO [train.py:1046] (3/4) Epoch 25, batch 5300, loss[loss=0.185, simple_loss=0.2439, pruned_loss=0.06308, over 23781.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2452, pruned_loss=0.04521, over 4716154.40 frames. ], batch size: 179, lr: 4.06e-03, grad_scale: 16.0 2023-10-02 12:49:22,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=885273.3333333334, ans=0.0 2023-10-02 12:49:29,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:49:29,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 12:49:29,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 12:49:29,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:49:29,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:29,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:29,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:29,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:49:29,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:49:29,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:49:29,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 12:49:30,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:49:30,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 12:49:30,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 12:49:30,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 12:49:30,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 12:49:30,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 12:49:30,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 12:49:30,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:31,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:49:31,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:49:31,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:49:31,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:49:31,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:49:31,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:49:31,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:31,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:49:31,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:49:31,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:49:31,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:31,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:49:32,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 12:49:32,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:49:32,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:49:32,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 12:49:32,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 12:49:32,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:49:32,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:49:32,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 12:49:33,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 12:49:33,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:49:33,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:49:33,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:49:34,077 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 12:49:34,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 12:49:34,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:49:34,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:49:34,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 12:49:34,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 12:49:34,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 12:49:34,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 12:49:40,838 INFO [train.py:1046] (3/4) Epoch 26, batch 0, loss[loss=0.1618, simple_loss=0.2426, pruned_loss=0.0405, over 24618.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2426, pruned_loss=0.0405, over 24618.00 frames. ], batch size: 60, lr: 3.98e-03, grad_scale: 32.0 2023-10-02 12:49:40,838 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 12:49:53,974 INFO [train.py:1078] (3/4) Epoch 26, validation: loss=0.3276, simple_loss=0.28, pruned_loss=0.1876, over 1125622.00 frames. 2023-10-02 12:49:53,974 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 12:49:57,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 12:49:58,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:50:00,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:50:06,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:06,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:50:06,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=885353.3333333334, ans=0.125 2023-10-02 12:50:07,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:07,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 12:50:09,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 12:50:10,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:12,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:12,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=885420.0, ans=0.1 2023-10-02 12:50:12,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=885420.0, ans=0.1 2023-10-02 12:50:16,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:50:16,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:17,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 12:50:17,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:50:17,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=885420.0, ans=0.0 2023-10-02 12:50:19,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 12:50:20,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:50:27,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:50:28,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:31,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 12:50:33,076 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.877e+02 2.109e+02 2.369e+02 3.021e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-02 12:50:36,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:50:36,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:50:38,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:50:42,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:50:45,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:50:50,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 12:50:54,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 12:50:54,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:50:54,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:50:55,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:50:55,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:50:58,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 12:51:00,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:51:00,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:51:02,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:51:03,587 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.95 vs. limit=15.0 2023-10-02 12:51:06,222 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 12:51:07,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:51:08,983 INFO [train.py:1046] (3/4) Epoch 26, batch 50, loss[loss=0.1671, simple_loss=0.2504, pruned_loss=0.0419, over 23513.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2442, pruned_loss=0.04613, over 1071007.13 frames. ], batch size: 119, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:51:11,951 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.76 vs. limit=22.5 2023-10-02 12:51:12,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:51:12,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=885686.6666666666, ans=0.0 2023-10-02 12:51:13,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:51:13,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 12:51:15,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 12:51:15,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:51:16,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:51:17,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:51:19,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:51:19,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=885686.6666666666, ans=0.0 2023-10-02 12:51:21,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 12:51:22,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:51:22,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=885753.3333333334, ans=0.125 2023-10-02 12:51:28,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:51:31,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 12:51:32,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 12:51:34,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:51:35,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:51:35,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:51:37,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:51:37,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:51:38,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 12:51:38,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:51:46,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:51:47,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:51:47,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 12:51:49,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 12:51:51,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 12:51:52,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:51:52,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 12:51:53,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:51:55,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 12:51:58,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=885886.6666666666, ans=0.2 2023-10-02 12:52:02,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:52:02,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:52:04,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:52:04,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:52:04,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:52:07,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 12:52:07,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 12:52:09,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:52:09,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:52:09,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:52:10,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:52:10,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 12:52:10,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 12:52:12,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 12:52:15,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:15,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:52:15,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=885953.3333333334, ans=0.125 2023-10-02 12:52:16,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 12:52:16,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 12:52:17,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:17,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:52:19,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:52:20,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:52:20,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=885953.3333333334, ans=0.0 2023-10-02 12:52:22,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:52:22,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=886020.0, ans=0.07 2023-10-02 12:52:23,553 INFO [train.py:1046] (3/4) Epoch 26, batch 100, loss[loss=0.1602, simple_loss=0.239, pruned_loss=0.04068, over 24612.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2464, pruned_loss=0.04628, over 1883658.14 frames. ], batch size: 60, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:52:26,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:52:29,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:52:32,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 12:52:32,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:52:35,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 12:52:36,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:52:36,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 12:52:36,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:52:36,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:52:38,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 12:52:39,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 12:52:39,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:41,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:52:41,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:52:44,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 12:52:46,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:52:47,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:52:49,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 12:52:50,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 12:52:52,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=886153.3333333334, ans=0.95 2023-10-02 12:52:54,745 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 12:52:54,769 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 12:52:56,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:52:56,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:52:58,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 12:53:02,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:53:04,021 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.882e+02 2.043e+02 2.257e+02 3.571e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-02 12:53:04,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:04,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=886153.3333333334, ans=0.125 2023-10-02 12:53:08,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:10,095 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 12:53:11,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 12:53:16,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:53:16,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:53:19,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:22,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:24,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:53:26,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:53:26,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=886286.6666666666, ans=0.0 2023-10-02 12:53:30,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:30,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:53:31,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:31,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:53:31,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:53:33,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 12:53:33,149 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 12:53:33,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:35,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:53:35,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:35,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:53:35,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 12:53:35,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:53:36,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 12:53:36,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:36,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:53:37,851 INFO [train.py:1046] (3/4) Epoch 26, batch 150, loss[loss=0.1527, simple_loss=0.2266, pruned_loss=0.03939, over 23732.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2476, pruned_loss=0.04568, over 2515596.40 frames. ], batch size: 149, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:53:39,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:53:39,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:53:39,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:53:42,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:53:45,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:53:45,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:53:46,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:49,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:53:49,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:52,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 12:53:52,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=886420.0, ans=0.1 2023-10-02 12:53:53,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:53:56,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=886420.0, ans=0.2 2023-10-02 12:53:57,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 12:53:57,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 12:53:57,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 12:54:00,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:54:00,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 12:54:00,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:54:01,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:54:01,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:54:02,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:54:02,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:54:03,564 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 12:54:05,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:54:12,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:54:15,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 12:54:16,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 12:54:19,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:54:19,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:54:19,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:54:22,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:54:23,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:54:23,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:54:25,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:54:26,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 12:54:30,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:54:30,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:54:30,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 12:54:32,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:54:34,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:54:37,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 12:54:38,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 12:54:40,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:54:40,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:54:43,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:54:43,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 12:54:44,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:54:44,770 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 12:54:48,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=886620.0, ans=0.1 2023-10-02 12:54:48,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=886620.0, ans=0.0 2023-10-02 12:54:49,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:54:50,663 INFO [train.py:1046] (3/4) Epoch 26, batch 200, loss[loss=0.1836, simple_loss=0.2469, pruned_loss=0.06009, over 23789.00 frames. ], tot_loss[loss=0.1701, simple_loss=0.2479, pruned_loss=0.0461, over 3013310.32 frames. ], batch size: 164, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:54:52,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:54:52,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:54:55,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 12:54:55,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:54:56,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:54:58,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 12:55:00,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 12:55:03,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:55:03,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:55:08,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:55:08,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:55:10,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:55:26,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:55:26,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:55:28,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 12:55:30,115 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.774e+02 1.970e+02 2.261e+02 2.926e+02, threshold=3.941e+02, percent-clipped=0.0 2023-10-02 12:55:30,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:55:31,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 12:55:31,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:55:32,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:55:34,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:55:34,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:55:34,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:55:35,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 12:55:37,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 12:55:37,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:55:39,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=886886.6666666666, ans=0.125 2023-10-02 12:55:40,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:55:47,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:55:55,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:55:55,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 12:56:01,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:03,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 12:56:04,296 INFO [train.py:1046] (3/4) Epoch 26, batch 250, loss[loss=0.1598, simple_loss=0.2382, pruned_loss=0.04068, over 23147.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.248, pruned_loss=0.04594, over 3395545.50 frames. ], batch size: 119, lr: 3.98e-03, grad_scale: 16.0 2023-10-02 12:56:04,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:56:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:56:04,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:56:05,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 12:56:05,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 12:56:07,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:56:07,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=887020.0, ans=0.125 2023-10-02 12:56:09,079 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 12:56:10,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:10,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 12:56:11,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:11,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:56:14,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=887020.0, ans=0.09899494936611666 2023-10-02 12:56:15,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:56:15,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:56:16,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:56:21,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:56:22,107 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.68 vs. limit=15.0 2023-10-02 12:56:30,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:56:32,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:56:32,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 12:56:40,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 12:56:41,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 12:56:41,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:56:42,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:56:42,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 12:56:43,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 12:56:44,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:56:45,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=887153.3333333334, ans=0.04949747468305833 2023-10-02 12:56:47,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 12:56:49,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 12:56:50,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:56:50,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 12:56:52,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 12:56:52,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:56:53,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:56:53,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 12:56:55,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 12:56:56,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:56:58,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 12:56:59,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:57:02,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 12:57:05,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:57:10,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 12:57:13,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:57:14,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:57:16,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 12:57:17,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:57:17,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 12:57:18,992 INFO [train.py:1046] (3/4) Epoch 26, batch 300, loss[loss=0.175, simple_loss=0.2375, pruned_loss=0.05618, over 23893.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2472, pruned_loss=0.04581, over 3688365.42 frames. ], batch size: 195, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 12:57:19,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 12:57:19,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 12:57:20,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=887353.3333333334, ans=0.1 2023-10-02 12:57:22,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:57:22,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 12:57:27,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:57:29,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:57:32,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 12:57:32,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 12:57:34,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:57:34,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=887420.0, ans=0.125 2023-10-02 12:57:35,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 12:57:35,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 12:57:35,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:57:40,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 12:57:44,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 12:57:44,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 12:57:49,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 12:57:49,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:57:51,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:57:53,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:57:53,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 12:57:53,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 12:57:55,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:57:56,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:57:57,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:57:59,150 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.798e+02 1.983e+02 2.227e+02 3.244e+02, threshold=3.967e+02, percent-clipped=0.0 2023-10-02 12:58:01,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 12:58:01,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 12:58:02,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 12:58:05,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:06,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 12:58:06,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:58:11,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:58:13,526 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.06 vs. limit=12.0 2023-10-02 12:58:14,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:58:14,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 12:58:18,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:18,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 12:58:20,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=887620.0, ans=0.125 2023-10-02 12:58:21,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:21,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:58:23,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 12:58:23,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 12:58:25,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:58:26,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 12:58:26,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:58:26,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:28,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:58:28,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=887620.0, ans=0.0 2023-10-02 12:58:29,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:58:30,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:33,279 INFO [train.py:1046] (3/4) Epoch 26, batch 350, loss[loss=0.1609, simple_loss=0.2293, pruned_loss=0.04626, over 23789.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2459, pruned_loss=0.04565, over 3909719.47 frames. ], batch size: 212, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 12:58:35,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:58:35,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 12:58:35,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=887686.6666666666, ans=0.0 2023-10-02 12:58:38,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:44,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 12:58:44,976 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.88 vs. limit=22.5 2023-10-02 12:58:47,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:58:47,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:50,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 12:58:51,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:58:51,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 12:58:54,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:58:54,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 12:58:54,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:58:57,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 12:58:59,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 12:59:00,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 12:59:00,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:59:02,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:03,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:03,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:59:03,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:59:04,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 12:59:06,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:59:06,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:59:12,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:59:12,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 12:59:12,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 12:59:14,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:59:19,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=887886.6666666666, ans=0.1 2023-10-02 12:59:20,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 12:59:20,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 12:59:24,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 12:59:24,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:59:24,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 12:59:25,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 12:59:28,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:29,377 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 12:59:30,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 12:59:30,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:33,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 12:59:33,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 12:59:35,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:40,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 12:59:41,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 12:59:42,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:42,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:59:45,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 12:59:48,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 12:59:49,262 INFO [train.py:1046] (3/4) Epoch 26, batch 400, loss[loss=0.1764, simple_loss=0.2569, pruned_loss=0.04795, over 23949.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.245, pruned_loss=0.04557, over 4081927.29 frames. ], batch size: 80, lr: 3.97e-03, grad_scale: 32.0 2023-10-02 12:59:49,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=888020.0, ans=0.0 2023-10-02 12:59:50,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 12:59:51,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 12:59:52,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 12:59:53,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 12:59:56,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 12:59:56,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 12:59:58,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:00:00,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:00:02,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 13:00:03,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 13:00:03,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:00:03,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 13:00:05,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:00:09,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:00:09,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:00:09,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 13:00:11,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:00:11,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:00:11,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:00:12,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:00:15,733 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 13:00:15,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 13:00:20,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:00:21,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:00:23,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 13:00:23,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 13:00:26,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:00:28,743 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.453e+02 1.840e+02 2.026e+02 2.260e+02 3.455e+02, threshold=4.051e+02, percent-clipped=0.0 2023-10-02 13:00:28,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:00:35,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 13:00:37,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:00:37,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=888220.0, ans=0.125 2023-10-02 13:00:39,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 13:00:39,567 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.31 vs. limit=15.0 2023-10-02 13:00:42,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:00:43,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:00:43,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 13:00:46,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:00:48,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 13:00:51,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:00:54,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:00:54,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 13:00:55,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 13:00:55,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 13:00:58,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:00:58,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:01:00,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 13:01:00,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:01:02,171 INFO [train.py:1046] (3/4) Epoch 26, batch 450, loss[loss=0.1715, simple_loss=0.2582, pruned_loss=0.04237, over 24315.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2462, pruned_loss=0.0461, over 4226179.52 frames. ], batch size: 74, lr: 3.97e-03, grad_scale: 32.0 2023-10-02 13:01:02,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:01:02,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:01:03,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 13:01:05,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:01:05,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:01:07,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:01:07,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 13:01:07,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:01:07,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=888353.3333333334, ans=0.125 2023-10-02 13:01:08,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:01:10,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:01:22,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:01:22,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:01:25,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 13:01:25,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 13:01:28,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:01:28,863 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.55 vs. limit=10.0 2023-10-02 13:01:29,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:01:30,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:01:33,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:01:35,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:01:38,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 13:01:38,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 13:01:39,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 13:01:39,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:01:41,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:01:42,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:01:43,964 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 13:01:43,973 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 13:01:44,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:01:45,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:01:47,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 13:01:50,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:01:50,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:01:50,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 13:01:51,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 13:01:54,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:01:55,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:01:55,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:01:57,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 13:02:01,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:02:02,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 13:02:04,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 13:02:04,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:02:08,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:02:08,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=888620.0, ans=0.125 2023-10-02 13:02:10,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:02:10,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:02:10,697 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 13:02:14,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:02:15,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:02:16,860 INFO [train.py:1046] (3/4) Epoch 26, batch 500, loss[loss=0.2162, simple_loss=0.2815, pruned_loss=0.07548, over 19149.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2475, pruned_loss=0.04682, over 4322444.43 frames. ], batch size: 388, lr: 3.97e-03, grad_scale: 32.0 2023-10-02 13:02:16,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:02:16,943 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 13:02:18,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 13:02:18,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:02:21,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:02:24,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.53 vs. limit=22.5 2023-10-02 13:02:26,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 13:02:27,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:02:29,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:02:29,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:02:30,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:38,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:02:38,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:02:38,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:02:38,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:02:40,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 13:02:40,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:02:42,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:02:42,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=888753.3333333334, ans=0.2 2023-10-02 13:02:43,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:02:44,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:02:44,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:02:44,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 13:02:49,406 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 13:02:52,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:02:53,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:55,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:55,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:02:56,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:02:57,650 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.792e+02 1.964e+02 2.180e+02 2.813e+02, threshold=3.927e+02, percent-clipped=0.0 2023-10-02 13:02:57,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 13:03:00,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:03:00,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:01,338 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.64 vs. limit=15.0 2023-10-02 13:03:05,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:03:08,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:03:16,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:03:21,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 13:03:21,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:21,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:03:23,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 13:03:23,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:03:25,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:28,024 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.63 vs. limit=6.0 2023-10-02 13:03:30,037 INFO [train.py:1046] (3/4) Epoch 26, batch 550, loss[loss=0.1794, simple_loss=0.2577, pruned_loss=0.0505, over 23767.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2493, pruned_loss=0.04741, over 4409714.01 frames. ], batch size: 212, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:03:30,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 13:03:31,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 13:03:31,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:03:31,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 13:03:32,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:03:32,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:03:34,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:35,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:35,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:03:35,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:03:38,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:03:38,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 13:03:38,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:03:40,766 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=8.45 vs. limit=12.0 2023-10-02 13:03:44,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:03:44,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:47,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:03:47,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:03:48,542 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.62 vs. limit=22.5 2023-10-02 13:03:52,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 13:03:52,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 13:03:55,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:03:56,047 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.70 vs. limit=22.5 2023-10-02 13:03:59,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:03:59,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:04:02,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:04:03,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:03,971 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 13:04:05,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:04:06,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:04:08,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:04:10,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:04:10,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:04:11,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:12,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 13:04:14,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 13:04:15,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:04:15,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:04:15,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:04:15,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:04:16,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=889220.0, ans=0.2 2023-10-02 13:04:19,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:04:19,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:04:22,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:04:24,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:24,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 13:04:25,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:04:26,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:04:28,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:04:28,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:04:29,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:04:29,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 13:04:35,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 13:04:39,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 13:04:41,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:04:41,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:04:41,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:04:43,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=889353.3333333334, ans=0.2 2023-10-02 13:04:44,866 INFO [train.py:1046] (3/4) Epoch 26, batch 600, loss[loss=0.1498, simple_loss=0.2277, pruned_loss=0.03595, over 24634.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.249, pruned_loss=0.04814, over 4451259.03 frames. ], batch size: 60, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:04:49,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=889353.3333333334, ans=0.125 2023-10-02 13:04:50,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:04:54,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:04:54,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 13:04:55,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:04:57,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:04:58,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:05:01,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 13:05:02,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:05:08,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 13:05:08,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=889420.0, ans=0.125 2023-10-02 13:05:10,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:05:10,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:05:10,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:05:17,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=889486.6666666666, ans=0.05 2023-10-02 13:05:18,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:05:18,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:05:18,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:05:25,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:05:26,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=889486.6666666666, ans=0.05 2023-10-02 13:05:28,840 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.438e+02 1.874e+02 2.049e+02 2.312e+02 3.828e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-02 13:05:29,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:05:29,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:05:29,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:05:37,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 13:05:42,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:05:42,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:05:43,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=889620.0, ans=0.125 2023-10-02 13:05:47,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 13:05:47,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:05:50,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 13:05:52,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:05:52,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:05:58,316 INFO [train.py:1046] (3/4) Epoch 26, batch 650, loss[loss=0.1583, simple_loss=0.2282, pruned_loss=0.04426, over 23662.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2476, pruned_loss=0.04709, over 4504376.78 frames. ], batch size: 232, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:05:58,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 13:05:59,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 13:06:01,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:06:02,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:06:05,131 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.88 vs. limit=6.0 2023-10-02 13:06:05,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:06,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 13:06:07,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=889686.6666666666, ans=0.2 2023-10-02 13:06:08,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:06:11,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=889753.3333333334, ans=0.125 2023-10-02 13:06:12,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:06:12,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:06:17,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:06:20,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 13:06:21,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:06:22,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:06:22,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=889753.3333333334, ans=0.0 2023-10-02 13:06:25,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:06:25,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=889753.3333333334, ans=0.125 2023-10-02 13:06:26,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 13:06:28,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:06:28,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:29,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:06:31,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:32,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:06:33,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:06:35,200 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 13:06:35,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:06:35,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:06:38,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:39,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:06:39,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:06:39,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:06:40,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 13:06:40,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:06:42,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:06:43,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:06:43,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:06:45,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:06:45,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 13:06:47,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 13:06:47,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:47,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:06:47,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:06:47,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:06:49,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:06:54,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=889886.6666666666, ans=0.125 2023-10-02 13:06:55,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:06:55,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:06:57,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:07:00,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:07:00,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:07:00,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:07:06,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=889953.3333333334, ans=0.2 2023-10-02 13:07:07,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:07:08,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:07:08,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:07:08,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:07:10,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=890020.0, ans=0.0 2023-10-02 13:07:10,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.91 vs. limit=6.0 2023-10-02 13:07:11,131 INFO [train.py:1046] (3/4) Epoch 26, batch 700, loss[loss=0.1913, simple_loss=0.2577, pruned_loss=0.06246, over 23775.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.246, pruned_loss=0.04676, over 4548758.52 frames. ], batch size: 179, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:07:14,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 13:07:15,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=890020.0, ans=0.125 2023-10-02 13:07:17,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 13:07:17,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=890020.0, ans=0.1 2023-10-02 13:07:19,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 13:07:19,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:07:20,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:07:22,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 13:07:22,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=890020.0, ans=0.1 2023-10-02 13:07:26,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:07:28,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:07:30,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:07:31,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:07:31,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:07:33,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:07:37,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 13:07:37,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:07:39,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 13:07:42,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 13:07:45,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:07:45,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:07:47,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:07:51,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:07:51,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 13:07:56,015 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.890e+02 2.209e+02 2.852e+02 4.841e+02, threshold=4.419e+02, percent-clipped=5.0 2023-10-02 13:07:56,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:07:57,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:07:57,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 13:08:03,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:08:04,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:08:07,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:08:11,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:08:13,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 13:08:15,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=890286.6666666666, ans=0.0 2023-10-02 13:08:15,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=890286.6666666666, ans=0.2 2023-10-02 13:08:16,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 13:08:16,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 13:08:18,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:08:19,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:08:20,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:08:22,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:08:22,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 13:08:25,906 INFO [train.py:1046] (3/4) Epoch 26, batch 750, loss[loss=0.1695, simple_loss=0.2362, pruned_loss=0.05137, over 23721.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2454, pruned_loss=0.04674, over 4580984.15 frames. ], batch size: 164, lr: 3.97e-03, grad_scale: 8.0 2023-10-02 13:08:27,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 13:08:28,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 13:08:28,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 13:08:28,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 13:08:29,257 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.21 vs. limit=15.0 2023-10-02 13:08:29,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 13:08:29,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:08:31,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 13:08:33,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:08:33,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:08:34,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:08:37,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:08:37,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:08:37,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:08:39,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:08:40,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:08:43,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:08:43,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=890420.0, ans=0.125 2023-10-02 13:08:45,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=890420.0, ans=0.125 2023-10-02 13:08:46,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:08:46,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:08:46,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 13:08:49,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:08:49,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:08:50,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:08:52,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:08:53,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 13:08:53,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:08:55,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 13:08:55,114 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 13:08:56,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 13:08:56,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:08:56,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:08:59,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:09:07,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:09:07,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:07,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:09:09,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:09:10,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:10,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 13:09:10,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:09:12,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 13:09:12,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:09:15,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:09:15,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 13:09:16,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:17,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=890553.3333333334, ans=0.1 2023-10-02 13:09:19,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:09:22,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:09:22,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:09:23,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:09:28,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 13:09:28,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:09:28,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:09:30,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:09:30,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:09:34,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:34,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:09:39,951 INFO [train.py:1046] (3/4) Epoch 26, batch 800, loss[loss=0.1867, simple_loss=0.2494, pruned_loss=0.06199, over 23579.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2461, pruned_loss=0.04656, over 4622751.81 frames. ], batch size: 256, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 13:09:43,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:09:43,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:45,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:09:45,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:09:48,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:48,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:09:48,869 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.82 vs. limit=6.0 2023-10-02 13:09:49,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:09:51,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=890686.6666666666, ans=0.0 2023-10-02 13:09:53,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:09:53,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:09:56,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 13:09:57,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:09:58,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:09:58,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:09:59,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=890753.3333333334, ans=0.0 2023-10-02 13:10:00,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:10:00,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 13:10:01,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:10:01,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 13:10:03,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:10:05,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:10:06,331 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.68 vs. limit=15.0 2023-10-02 13:10:06,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:10:06,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:10:10,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:10:10,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:10:12,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:10:14,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:10:14,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 13:10:17,310 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 13:10:17,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 13:10:17,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:10:18,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:10:20,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:10:20,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:10:24,220 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.808e+02 1.916e+02 2.137e+02 2.899e+02, threshold=3.832e+02, percent-clipped=0.0 2023-10-02 13:10:25,737 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 13:10:25,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 13:10:27,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:10:27,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=890886.6666666666, ans=0.0 2023-10-02 13:10:29,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:10:32,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:10:37,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:10:39,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 13:10:39,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:10:41,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 13:10:47,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:10:50,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:10:50,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 13:10:51,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:10:52,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=891020.0, ans=0.2 2023-10-02 13:10:53,190 INFO [train.py:1046] (3/4) Epoch 26, batch 850, loss[loss=0.1801, simple_loss=0.2608, pruned_loss=0.04969, over 24360.00 frames. ], tot_loss[loss=0.1713, simple_loss=0.2476, pruned_loss=0.04748, over 4627057.65 frames. ], batch size: 77, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 13:10:53,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:10:54,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 13:10:54,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:10:55,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:10:57,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:10:58,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:11:00,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:11:01,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 13:11:02,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 13:11:02,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 13:11:04,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:11:04,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:11:07,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:07,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:11:09,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:11:12,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:11:12,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:11:12,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 13:11:15,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 13:11:18,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:11:19,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 13:11:22,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 13:11:24,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 13:11:24,232 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:11:25,534 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 13:11:25,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:11:25,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:11:25,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:11:28,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:31,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:32,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 13:11:34,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:11:34,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:11:37,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:11:37,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:11:38,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:11:40,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:11:40,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 13:11:46,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:11:46,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:11:47,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:11:47,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:11:47,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:11:49,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:11:51,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:11:52,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=891286.6666666666, ans=0.125 2023-10-02 13:11:53,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:11:53,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:11:55,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:11:58,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.61 vs. limit=15.0 2023-10-02 13:11:59,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:12:00,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:12:01,652 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.62 vs. limit=10.0 2023-10-02 13:12:02,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 13:12:02,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:12:02,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:12:05,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 13:12:07,822 INFO [train.py:1046] (3/4) Epoch 26, batch 900, loss[loss=0.1541, simple_loss=0.2334, pruned_loss=0.03743, over 24373.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2487, pruned_loss=0.04797, over 4639749.57 frames. ], batch size: 61, lr: 3.97e-03, grad_scale: 16.0 2023-10-02 13:12:08,181 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:12:09,897 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.34 vs. limit=15.0 2023-10-02 13:12:10,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:12:13,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:12:13,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 13:12:16,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:12:16,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 13:12:17,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=891353.3333333334, ans=0.2 2023-10-02 13:12:18,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 13:12:19,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:12:19,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:12:20,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:12:21,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:12:29,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:12:29,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:12:31,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:12:32,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:12:32,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=891420.0, ans=0.1 2023-10-02 13:12:39,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 13:12:41,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:12:46,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:12:46,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:12:46,625 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 13:12:47,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 13:12:51,898 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.855e+02 2.022e+02 2.290e+02 3.129e+02, threshold=4.044e+02, percent-clipped=0.0 2023-10-02 13:12:52,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=891553.3333333334, ans=0.1 2023-10-02 13:12:54,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:12:54,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:12:54,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:12:59,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=891553.3333333334, ans=0.0 2023-10-02 13:13:00,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:00,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:13:03,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 13:13:03,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:13:03,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 13:13:06,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:13:06,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:07,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=891620.0, ans=0.0 2023-10-02 13:13:08,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:13:08,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:13:12,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 13:13:12,741 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 13:13:16,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 13:13:16,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 13:13:16,625 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=10.97 vs. limit=15.0 2023-10-02 13:13:18,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:19,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=891620.0, ans=0.07 2023-10-02 13:13:20,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 13:13:21,667 INFO [train.py:1046] (3/4) Epoch 26, batch 950, loss[loss=0.1606, simple_loss=0.2376, pruned_loss=0.04179, over 18728.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2479, pruned_loss=0.04728, over 4665921.11 frames. ], batch size: 41, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:13:26,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:13:29,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:13:29,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:13:30,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:13:33,413 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 13:13:36,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:13:36,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:13:37,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:13:37,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:13:37,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 13:13:39,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 13:13:42,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:13:42,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 13:13:44,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:13:45,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:13:45,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:13:45,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:13:47,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 13:13:49,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 13:13:51,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:13:52,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:13:52,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=891820.0, ans=0.1 2023-10-02 13:13:57,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:13:58,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:14:02,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 13:14:04,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 13:14:04,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:14:04,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:14:04,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:14:04,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:14:08,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 13:14:10,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:14:14,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:14:14,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:14:14,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 13:14:14,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:14:14,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:14:16,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 13:14:19,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:14:19,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.77 vs. limit=6.0 2023-10-02 13:14:20,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:14:22,619 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 13:14:26,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:14:29,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 13:14:29,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 13:14:32,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:14:33,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=892020.0, ans=0.1 2023-10-02 13:14:34,929 INFO [train.py:1046] (3/4) Epoch 26, batch 1000, loss[loss=0.1732, simple_loss=0.2451, pruned_loss=0.05063, over 23172.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2473, pruned_loss=0.04697, over 4687847.63 frames. ], batch size: 93, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:14:35,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 13:14:36,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:14:41,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=892020.0, ans=0.07 2023-10-02 13:14:42,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:14:43,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 13:14:43,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 13:14:49,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:14:49,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:14:50,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:14:54,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 13:14:58,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 13:14:59,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 13:14:59,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:15:02,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 13:15:03,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 13:15:03,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 13:15:05,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:15:06,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:06,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=892153.3333333334, ans=0.1 2023-10-02 13:15:12,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:15:13,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:15:15,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:15,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:15:15,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 13:15:15,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:15:17,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:15:18,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:15:19,620 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.933e+02 2.169e+02 2.725e+02 4.611e+02, threshold=4.339e+02, percent-clipped=3.0 2023-10-02 13:15:19,712 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 13:15:23,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 13:15:24,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 13:15:25,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 13:15:25,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:15:30,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=892220.0, ans=0.125 2023-10-02 13:15:34,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:34,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:15:34,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:35,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:15:37,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 13:15:38,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:15:38,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 13:15:40,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 13:15:41,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:15:41,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:15:43,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:15:46,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:15:46,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:15:50,029 INFO [train.py:1046] (3/4) Epoch 26, batch 1050, loss[loss=0.1618, simple_loss=0.2269, pruned_loss=0.04833, over 23517.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2457, pruned_loss=0.04605, over 4683401.02 frames. ], batch size: 285, lr: 3.96e-03, grad_scale: 8.0 2023-10-02 13:15:51,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:15:51,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:15:53,496 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.82 vs. limit=10.0 2023-10-02 13:15:54,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 13:15:54,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:15:55,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:15:58,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:16:00,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:16:03,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:16:04,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:16:04,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:16:04,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:16:06,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 13:16:06,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:16:06,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 13:16:10,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:16:10,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 13:16:10,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:16:16,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:16:17,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:16:17,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:16:20,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 13:16:21,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 13:16:21,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:16:25,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 13:16:28,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 13:16:29,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:16:32,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 13:16:35,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:16:35,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:16:36,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:16:41,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:16:42,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.45 vs. limit=15.0 2023-10-02 13:16:43,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 13:16:45,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 13:16:45,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 13:16:47,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:16:47,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:16:48,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 13:16:52,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:16:55,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:16:55,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:16:55,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:16:56,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:00,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:00,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 13:17:00,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:17:02,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 13:17:02,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 13:17:04,022 INFO [train.py:1046] (3/4) Epoch 26, batch 1100, loss[loss=0.1661, simple_loss=0.2379, pruned_loss=0.04709, over 23843.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2463, pruned_loss=0.04583, over 4708337.04 frames. ], batch size: 195, lr: 3.96e-03, grad_scale: 8.0 2023-10-02 13:17:04,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:17:05,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=892686.6666666666, ans=0.0 2023-10-02 13:17:06,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:17:11,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:17:15,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:17:16,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:17:16,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:17:16,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 13:17:17,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:17:21,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:17:22,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:17:24,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:17:24,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 13:17:25,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 13:17:26,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:17:26,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:17:29,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:17:31,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:17:34,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=892820.0, ans=0.2 2023-10-02 13:17:35,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:17:39,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 13:17:39,864 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 13:17:40,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=892820.0, ans=0.125 2023-10-02 13:17:41,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:42,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:43,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:17:44,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:17:44,769 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.92 vs. limit=15.0 2023-10-02 13:17:46,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 13:17:48,055 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.826e+02 2.020e+02 2.449e+02 3.878e+02, threshold=4.041e+02, percent-clipped=0.0 2023-10-02 13:17:48,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:17:48,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:17:48,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:17:48,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=892886.6666666666, ans=0.125 2023-10-02 13:17:50,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:17:50,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 13:17:51,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=892886.6666666666, ans=0.125 2023-10-02 13:17:54,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:17:54,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 13:17:55,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=892886.6666666666, ans=0.0 2023-10-02 13:17:57,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:18:01,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:18:05,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 13:18:06,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 13:18:07,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:18:08,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=892953.3333333334, ans=0.125 2023-10-02 13:18:09,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:18:10,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:18:10,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 13:18:11,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:18:11,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:18:13,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 13:18:13,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:18:13,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 13:18:14,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:18:14,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:18:16,035 INFO [train.py:1046] (3/4) Epoch 26, batch 1150, loss[loss=0.1675, simple_loss=0.2392, pruned_loss=0.04787, over 23863.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2462, pruned_loss=0.04608, over 4692517.13 frames. ], batch size: 195, lr: 3.96e-03, grad_scale: 8.0 2023-10-02 13:18:16,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:18:20,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:18:24,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:18:25,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:18:26,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:18:26,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 13:18:26,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:18:29,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 13:18:31,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:18:31,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:18:37,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 13:18:39,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:18:43,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:18:45,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:18:45,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 13:18:45,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:18:45,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:18:48,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 13:18:49,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:18:51,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:19:03,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:19:03,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=893220.0, ans=0.0 2023-10-02 13:19:10,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:19:10,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 13:19:10,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:12,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:16,333 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 13:19:17,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:18,394 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.41 vs. limit=10.0 2023-10-02 13:19:19,941 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.73 vs. limit=15.0 2023-10-02 13:19:23,637 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 13:19:28,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:19:28,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:19:29,673 INFO [train.py:1046] (3/4) Epoch 26, batch 1200, loss[loss=0.1714, simple_loss=0.2601, pruned_loss=0.04135, over 24387.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2467, pruned_loss=0.04635, over 4701000.95 frames. ], batch size: 77, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:19:29,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:19:29,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:19:31,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=893353.3333333334, ans=0.1 2023-10-02 13:19:34,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:19:35,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.70 vs. limit=10.0 2023-10-02 13:19:39,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:19:39,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:19:41,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:19:41,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:19:41,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:19:44,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:19:46,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:19:46,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:19:46,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:19:48,948 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 13:19:50,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 13:19:53,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:19:53,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=893420.0, ans=0.125 2023-10-02 13:19:56,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:19:58,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:20:01,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:20:01,062 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 13:20:01,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:20:02,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=893486.6666666666, ans=0.1 2023-10-02 13:20:08,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:20:08,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:20:10,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 13:20:10,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:20:10,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=893486.6666666666, ans=0.125 2023-10-02 13:20:14,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 13:20:15,603 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.901e+02 2.098e+02 2.367e+02 3.990e+02, threshold=4.196e+02, percent-clipped=0.0 2023-10-02 13:20:18,199 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=14.45 vs. limit=15.0 2023-10-02 13:20:18,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 13:20:18,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:20:19,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:20:21,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:20:21,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:20:21,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:20:21,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:20:23,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:20:23,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 13:20:23,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:20:23,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:20:23,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:20:26,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:20:26,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:20:30,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 13:20:31,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:20:33,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 13:20:35,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=893620.0, ans=0.0 2023-10-02 13:20:38,032 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 13:20:39,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:20:43,372 INFO [train.py:1046] (3/4) Epoch 26, batch 1250, loss[loss=0.1683, simple_loss=0.2511, pruned_loss=0.04272, over 24074.00 frames. ], tot_loss[loss=0.171, simple_loss=0.248, pruned_loss=0.04702, over 4699727.32 frames. ], batch size: 80, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:20:43,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:20:43,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:20:44,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:20:47,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 13:20:52,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:20:53,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:20:53,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 13:20:55,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:20:57,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:20:59,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:21:00,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=893753.3333333334, ans=0.125 2023-10-02 13:21:01,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:21:02,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:21:02,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:21:05,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:21:08,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=893753.3333333334, ans=0.125 2023-10-02 13:21:11,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 13:21:11,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:21:11,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:21:13,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:21:14,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:16,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:21:17,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:21:22,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 13:21:22,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:21:24,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:21:26,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 13:21:26,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:21:26,391 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 13:21:28,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:28,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:30,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:21:33,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:21:33,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:21:35,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 13:21:35,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 13:21:36,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 13:21:37,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=893886.6666666666, ans=0.125 2023-10-02 13:21:40,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:21:41,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 13:21:41,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:21:43,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 13:21:44,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:21:45,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 13:21:45,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:21:46,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=893953.3333333334, ans=0.0 2023-10-02 13:21:47,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:21:47,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:21:48,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:21:49,521 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.13 vs. limit=12.0 2023-10-02 13:21:51,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 13:21:53,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:21:55,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:21:55,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:21:57,246 INFO [train.py:1046] (3/4) Epoch 26, batch 1300, loss[loss=0.1914, simple_loss=0.2529, pruned_loss=0.06498, over 23722.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2487, pruned_loss=0.04743, over 4691061.97 frames. ], batch size: 179, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:21:58,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:22:02,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:22:02,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 13:22:06,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:22:08,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:22:09,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:22:11,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:22:12,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:22:13,659 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.52 vs. limit=15.0 2023-10-02 13:22:14,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 13:22:18,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:22:19,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:22:21,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 13:22:23,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:22:25,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:22:27,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:22:28,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:22:30,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:22:30,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:22:31,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 13:22:31,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 13:22:35,961 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.96 vs. limit=15.0 2023-10-02 13:22:37,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:22:38,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:22:40,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 13:22:42,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:22:43,835 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.906e+02 2.137e+02 2.529e+02 3.286e+02, threshold=4.274e+02, percent-clipped=0.0 2023-10-02 13:22:43,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:22:45,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:22:45,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 13:22:46,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:22:46,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 13:22:48,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:22:51,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:22:51,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:22:55,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 13:22:55,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 13:22:57,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 13:23:02,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:23:04,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 13:23:06,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:23:11,706 INFO [train.py:1046] (3/4) Epoch 26, batch 1350, loss[loss=0.1543, simple_loss=0.217, pruned_loss=0.04583, over 23413.00 frames. ], tot_loss[loss=0.171, simple_loss=0.2477, pruned_loss=0.04712, over 4688559.75 frames. ], batch size: 285, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:23:11,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 13:23:15,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:23:19,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:22,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:23:23,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:23:24,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:23:24,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:23:27,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:23:29,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=894420.0, ans=0.125 2023-10-02 13:23:30,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 13:23:30,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:23:31,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:23:33,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 13:23:35,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:23:36,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:23:37,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 13:23:38,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 13:23:39,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 13:23:40,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.08 vs. limit=6.0 2023-10-02 13:23:43,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:43,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 13:23:52,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:52,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=894486.6666666666, ans=0.125 2023-10-02 13:23:59,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:23:59,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:24:01,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 13:24:05,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:24:05,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 13:24:05,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:24:05,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=894553.3333333334, ans=0.125 2023-10-02 13:24:06,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:24:08,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:24:11,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 13:24:14,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:24:21,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 13:24:23,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 13:24:25,931 INFO [train.py:1046] (3/4) Epoch 26, batch 1400, loss[loss=0.1555, simple_loss=0.2227, pruned_loss=0.0441, over 23500.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2462, pruned_loss=0.04626, over 4683297.71 frames. ], batch size: 256, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:24:26,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 13:24:27,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:24:31,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:24:31,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:24:37,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 13:24:39,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 13:24:39,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=894753.3333333334, ans=0.04949747468305833 2023-10-02 13:24:49,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:24:49,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=894753.3333333334, ans=0.04949747468305833 2023-10-02 13:24:51,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:24:53,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:24:53,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:24:58,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:24:58,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 13:25:04,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=894820.0, ans=0.125 2023-10-02 13:25:08,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:10,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:11,513 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.828e+02 2.043e+02 2.426e+02 3.539e+02, threshold=4.086e+02, percent-clipped=0.0 2023-10-02 13:25:11,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=894886.6666666666, ans=0.07 2023-10-02 13:25:12,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 13:25:13,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:25:14,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:25:14,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:25:16,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:25:17,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:25:17,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:25:18,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:25:19,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=894886.6666666666, ans=0.0 2023-10-02 13:25:19,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=894886.6666666666, ans=0.0 2023-10-02 13:25:20,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 13:25:20,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:25:23,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:25,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=894953.3333333334, ans=0.125 2023-10-02 13:25:27,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:25:34,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 13:25:36,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 13:25:36,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:25:37,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 13:25:38,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:25:39,373 INFO [train.py:1046] (3/4) Epoch 26, batch 1450, loss[loss=0.1621, simple_loss=0.2328, pruned_loss=0.04573, over 23633.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.246, pruned_loss=0.04593, over 4702971.85 frames. ], batch size: 106, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:25:39,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:25:42,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=895020.0, ans=0.1 2023-10-02 13:25:43,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:25:45,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:25:45,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:45,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 13:25:49,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:25:49,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:25:50,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:25:50,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 13:25:51,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:25:52,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 13:25:52,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:25:54,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:25:54,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 13:25:55,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:25:55,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:25:57,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 13:25:57,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:25:58,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:25:59,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:26:02,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:26:07,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:26:07,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:26:09,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:26:09,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:26:10,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:26:10,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:26:11,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:26:11,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:26:16,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 13:26:17,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:26:21,383 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 13:26:22,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:26:25,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:26:26,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:26:28,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 13:26:29,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:26:31,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 13:26:31,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=895220.0, ans=0.1 2023-10-02 13:26:32,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 13:26:35,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:26:37,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:26:37,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:26:40,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 13:26:42,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 13:26:43,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 13:26:45,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:26:46,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:26:53,479 INFO [train.py:1046] (3/4) Epoch 26, batch 1500, loss[loss=0.1698, simple_loss=0.2461, pruned_loss=0.04675, over 23304.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2462, pruned_loss=0.04618, over 4696962.04 frames. ], batch size: 119, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:26:55,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 13:26:56,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:26:56,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:26:58,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:26:59,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:26:59,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:27:00,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 13:27:01,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:27:02,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:27:02,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:27:04,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:27:06,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:27:08,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:27:14,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:27:14,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 13:27:14,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:27:14,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:27:16,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:27:18,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 13:27:21,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 13:27:23,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=895486.6666666666, ans=0.125 2023-10-02 13:27:24,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:27:24,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 13:27:27,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 13:27:28,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:27:29,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:27:29,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:27:29,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=895486.6666666666, ans=0.125 2023-10-02 13:27:30,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 13:27:30,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:27:30,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:27:31,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 13:27:31,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:27:36,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:27:36,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 13:27:39,721 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.823e+02 1.989e+02 2.187e+02 2.730e+02, threshold=3.978e+02, percent-clipped=0.0 2023-10-02 13:27:41,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:27:44,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:27:48,599 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 13:27:48,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:27:48,658 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 13:27:50,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:27:51,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:27:52,791 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 13:27:54,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:27:57,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 13:27:57,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=895620.0, ans=0.09899494936611666 2023-10-02 13:27:58,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:28:01,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:28:01,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:28:03,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:28:03,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:28:03,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:28:06,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 13:28:06,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 13:28:06,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=895686.6666666666, ans=0.125 2023-10-02 13:28:07,719 INFO [train.py:1046] (3/4) Epoch 26, batch 1550, loss[loss=0.228, simple_loss=0.2863, pruned_loss=0.08485, over 19133.00 frames. ], tot_loss[loss=0.1697, simple_loss=0.2464, pruned_loss=0.04654, over 4695183.31 frames. ], batch size: 388, lr: 3.96e-03, grad_scale: 16.0 2023-10-02 13:28:07,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:28:07,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 13:28:09,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 13:28:10,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:28:12,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:28:12,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:28:12,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:28:15,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:28:16,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:28:19,629 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 13:28:19,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:28:19,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:28:19,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:28:22,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:28:22,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 13:28:22,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=895753.3333333334, ans=0.1 2023-10-02 13:28:25,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:28:25,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 13:28:25,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 13:28:25,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 13:28:27,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:28:28,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:28:31,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=895753.3333333334, ans=0.2 2023-10-02 13:28:32,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:28:33,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 13:28:33,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 13:28:36,100 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.80 vs. limit=15.0 2023-10-02 13:28:38,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=895820.0, ans=0.2 2023-10-02 13:28:41,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:28:45,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:28:45,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:28:45,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:28:46,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=895820.0, ans=0.2 2023-10-02 13:28:47,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 13:28:52,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:28:54,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:28:56,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:28:59,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:29:01,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:29:01,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 13:29:01,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:29:02,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:29:02,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:29:02,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 13:29:02,897 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 13:29:05,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:29:07,714 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.58 vs. limit=10.0 2023-10-02 13:29:11,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 13:29:15,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:29:17,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:29:17,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 13:29:21,073 INFO [train.py:1046] (3/4) Epoch 26, batch 1600, loss[loss=0.1547, simple_loss=0.2438, pruned_loss=0.03277, over 24476.00 frames. ], tot_loss[loss=0.1703, simple_loss=0.2471, pruned_loss=0.04679, over 4712605.90 frames. ], batch size: 66, lr: 3.96e-03, grad_scale: 32.0 2023-10-02 13:29:21,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:29:22,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:29:22,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:29:22,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:29:25,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:29:26,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=896020.0, ans=0.0 2023-10-02 13:29:28,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:29:28,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 13:29:29,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 13:29:29,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 13:29:31,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:29:34,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 13:29:34,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:29:36,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:29:41,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:29:44,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 13:29:45,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:29:45,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 13:29:46,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=896086.6666666666, ans=0.125 2023-10-02 13:29:47,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:29:48,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 13:29:54,349 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.94 vs. limit=15.0 2023-10-02 13:29:55,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 13:30:02,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:30:02,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 13:30:04,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:30:04,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:30:04,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:30:06,790 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.834e+02 2.040e+02 2.278e+02 3.104e+02, threshold=4.080e+02, percent-clipped=0.0 2023-10-02 13:30:06,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 13:30:10,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 13:30:12,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:30:12,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:14,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:14,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:30:15,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:30:17,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=896220.0, ans=0.05 2023-10-02 13:30:18,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:30:20,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:30:26,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:26,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:30:27,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 13:30:27,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:30:30,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 13:30:33,118 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.03 vs. limit=15.0 2023-10-02 13:30:34,797 INFO [train.py:1046] (3/4) Epoch 26, batch 1650, loss[loss=0.1485, simple_loss=0.226, pruned_loss=0.0355, over 24592.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2471, pruned_loss=0.04648, over 4719542.41 frames. ], batch size: 60, lr: 3.95e-03, grad_scale: 32.0 2023-10-02 13:30:36,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:30:37,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:30:37,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:30:37,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 13:30:38,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 13:30:38,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 13:30:38,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 13:30:42,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:30:43,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:30:43,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:30:43,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:30:45,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:30:46,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 13:30:49,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:30:51,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:30:51,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:30:51,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:30:52,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 13:30:52,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 13:30:58,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:31:00,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:31:07,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 13:31:07,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:10,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 13:31:12,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:31:16,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:31:16,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:31:16,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:31:19,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:31:19,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:22,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:31:23,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:23,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:31:23,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:31:25,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:31:26,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:31:27,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:31:29,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 13:31:30,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:31:30,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 13:31:30,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=896553.3333333334, ans=0.125 2023-10-02 13:31:32,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=896553.3333333334, ans=0.125 2023-10-02 13:31:32,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=896553.3333333334, ans=0.1 2023-10-02 13:31:33,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 13:31:33,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 13:31:33,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:31:34,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:31:35,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:31:37,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:31:37,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 13:31:40,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:31:42,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:31:42,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:31:44,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 13:31:48,797 INFO [train.py:1046] (3/4) Epoch 26, batch 1700, loss[loss=0.1633, simple_loss=0.2525, pruned_loss=0.03703, over 24635.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2461, pruned_loss=0.04601, over 4722356.71 frames. ], batch size: 68, lr: 3.95e-03, grad_scale: 32.0 2023-10-02 13:31:50,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:31:50,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:31:50,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 13:31:50,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:31:50,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:31:50,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:31:52,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=896686.6666666666, ans=0.0 2023-10-02 13:31:53,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:31:53,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:31:54,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 13:31:56,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:32:04,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=896753.3333333334, ans=0.2 2023-10-02 13:32:05,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:32:07,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:32:11,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:32:12,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:32:13,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:32:13,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:32:14,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 13:32:17,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:32:18,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:20,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:32:22,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:32:22,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=896820.0, ans=0.07 2023-10-02 13:32:25,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 13:32:25,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 13:32:25,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:27,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 13:32:28,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:32:35,978 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.840e+02 2.040e+02 2.267e+02 3.457e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-02 13:32:37,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:32:39,350 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.53 vs. limit=15.0 2023-10-02 13:32:40,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:32:40,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:32:40,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=896886.6666666666, ans=0.1 2023-10-02 13:32:41,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=896886.6666666666, ans=0.1 2023-10-02 13:32:43,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 13:32:43,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 13:32:43,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:32:46,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:46,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 13:32:46,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:32:46,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:32:46,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:32:46,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:32:49,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:32:49,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:32:49,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:32:50,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:32:50,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:32:55,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:32:55,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 13:32:58,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:32:59,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:33:00,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 13:33:04,824 INFO [train.py:1046] (3/4) Epoch 26, batch 1750, loss[loss=0.1565, simple_loss=0.2389, pruned_loss=0.03705, over 23265.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2446, pruned_loss=0.04556, over 4719979.18 frames. ], batch size: 93, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:33:06,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:07,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=897020.0, ans=0.0 2023-10-02 13:33:08,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:33:08,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:33:10,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 13:33:10,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:33:13,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:33:13,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:17,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 13:33:19,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:33:22,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 13:33:22,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:33:22,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:33:24,447 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.40 vs. limit=22.5 2023-10-02 13:33:25,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 13:33:26,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 13:33:29,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:33:29,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 13:33:38,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:33:40,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:33:40,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:33:43,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:43,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:33:45,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:33:46,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:33:49,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:33:49,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:33:52,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 13:33:52,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:33:55,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 13:33:56,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:33:57,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:33:59,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:34:03,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:34:05,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 13:34:06,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:34:06,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:34:09,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=897286.6666666666, ans=0.1 2023-10-02 13:34:09,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=897286.6666666666, ans=0.125 2023-10-02 13:34:10,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:34:12,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:34:15,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:34:17,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 13:34:17,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:34:17,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=897353.3333333334, ans=0.2 2023-10-02 13:34:18,360 INFO [train.py:1046] (3/4) Epoch 26, batch 1800, loss[loss=0.1631, simple_loss=0.2349, pruned_loss=0.04563, over 23634.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.244, pruned_loss=0.04532, over 4722211.74 frames. ], batch size: 135, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:34:18,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:34:18,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:18,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:34:18,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:34:18,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:34:23,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:34:23,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:34:25,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 13:34:28,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:34:30,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:34:31,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:34:34,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:34:35,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=897420.0, ans=0.0 2023-10-02 13:34:35,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:35,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:37,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:34:39,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:34:39,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 13:34:39,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:34:43,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:34:44,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 13:34:48,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 13:34:48,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 13:34:48,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:34:49,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:34:49,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:34:51,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:34:57,175 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 13:34:58,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:34:59,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:35:01,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 13:35:01,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 13:35:03,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:35:04,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:35:05,755 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.971e+02 2.205e+02 2.598e+02 3.859e+02, threshold=4.411e+02, percent-clipped=0.0 2023-10-02 13:35:05,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:35:10,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 13:35:15,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:35:16,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 13:35:16,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:35:16,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:35:18,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:35:18,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 13:35:20,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:35:20,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:35:21,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 13:35:21,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:35:25,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:35:25,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:35:25,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:35:27,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:35:27,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:35:29,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:35:29,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:35:33,122 INFO [train.py:1046] (3/4) Epoch 26, batch 1850, loss[loss=0.1551, simple_loss=0.2286, pruned_loss=0.04078, over 23601.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2449, pruned_loss=0.04578, over 4713922.18 frames. ], batch size: 149, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:35:33,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:35:34,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:35:39,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=897686.6666666666, ans=0.1 2023-10-02 13:35:40,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:35:40,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 13:35:45,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 13:35:47,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 13:35:50,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:35:50,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 13:35:50,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 13:36:00,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.19 vs. limit=15.0 2023-10-02 13:36:01,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:36:02,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 13:36:05,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:36:05,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:36:08,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=897820.0, ans=0.125 2023-10-02 13:36:10,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 13:36:10,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:11,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:36:13,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:36:14,322 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.61 vs. limit=10.0 2023-10-02 13:36:17,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:36:19,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:36:21,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:36:21,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:23,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 13:36:23,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:36:23,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=897886.6666666666, ans=0.0 2023-10-02 13:36:24,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:36:25,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=897886.6666666666, ans=15.0 2023-10-02 13:36:26,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:36:29,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 13:36:30,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:36:33,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:36:34,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:36:34,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 13:36:34,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 13:36:36,964 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 13:36:37,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 13:36:38,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:36:38,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:36:39,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:36:39,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:40,937 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 13:36:40,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:36:40,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:42,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:36:44,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:36:45,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:36:45,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 13:36:47,271 INFO [train.py:1046] (3/4) Epoch 26, batch 1900, loss[loss=0.1684, simple_loss=0.2539, pruned_loss=0.04147, over 23969.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2454, pruned_loss=0.04529, over 4723555.17 frames. ], batch size: 80, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:36:47,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=898020.0, ans=0.125 2023-10-02 13:36:49,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:36:49,937 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 13:36:49,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:36:50,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=898020.0, ans=0.1 2023-10-02 13:36:51,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:36:56,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:36:57,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=898020.0, ans=0.125 2023-10-02 13:36:58,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:36:58,293 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 13:37:00,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 13:37:00,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:37:00,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:37:00,418 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 13:37:01,765 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 13:37:04,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 13:37:06,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:37:09,838 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.62 vs. limit=10.0 2023-10-02 13:37:10,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 13:37:10,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=898086.6666666666, ans=0.1 2023-10-02 13:37:11,547 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=6.20 vs. limit=6.0 2023-10-02 13:37:12,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 13:37:12,723 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.83 vs. limit=22.5 2023-10-02 13:37:24,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 13:37:25,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 13:37:25,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:37:27,146 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 13:37:27,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 13:37:28,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 13:37:28,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 13:37:28,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:37:34,435 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.789e+02 2.022e+02 2.205e+02 2.844e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-02 13:37:34,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 13:37:37,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:37:40,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:37:40,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 13:37:43,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:37:47,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 13:37:47,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:37:52,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:37:52,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:37:52,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:37:54,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:37:54,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:37:55,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:37:55,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:37:59,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:37:59,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:38:01,079 INFO [train.py:1046] (3/4) Epoch 26, batch 1950, loss[loss=0.1948, simple_loss=0.2677, pruned_loss=0.06095, over 22719.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2465, pruned_loss=0.04592, over 4726179.96 frames. ], batch size: 322, lr: 3.95e-03, grad_scale: 16.0 2023-10-02 13:38:01,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:38:01,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:38:02,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:38:02,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:38:05,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:38:08,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:38:08,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:10,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:38:12,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 13:38:13,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:38:13,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:14,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:17,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:38:17,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:38:17,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:19,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:38:22,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:38:22,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:38:23,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:38:23,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:26,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:30,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:38:30,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:38:30,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 13:38:30,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 13:38:30,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:38:31,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:38:32,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:38:33,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=898486.6666666666, ans=0.0 2023-10-02 13:38:35,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:38:38,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=898486.6666666666, ans=0.1 2023-10-02 13:38:39,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:38:42,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:38:45,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:38:45,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:38:46,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 13:38:47,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:38:50,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:38:51,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:38:51,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:39:00,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:00,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:03,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:04,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:39:08,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:39:09,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:39:09,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 13:39:09,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:39:11,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:39:12,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 13:39:15,103 INFO [train.py:1046] (3/4) Epoch 26, batch 2000, loss[loss=0.1613, simple_loss=0.25, pruned_loss=0.03632, over 24615.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2467, pruned_loss=0.04591, over 4724587.25 frames. ], batch size: 68, lr: 3.95e-03, grad_scale: 32.0 2023-10-02 13:39:15,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:39:17,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:39:18,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:39:18,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:39:20,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:39:23,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:39:26,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 13:39:26,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=898686.6666666666, ans=0.125 2023-10-02 13:39:27,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:39:27,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=898686.6666666666, ans=0.125 2023-10-02 13:39:29,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:39:30,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 13:39:31,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 13:39:32,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:39:33,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:39:35,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 13:39:36,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:39,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:39,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:39,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 13:39:40,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:39:41,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 13:39:41,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:39:45,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:39:45,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 13:39:47,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:39:47,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:39:49,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:39:49,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 13:39:53,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 13:39:53,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:39:53,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:39:58,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=898886.6666666666, ans=0.1 2023-10-02 13:39:59,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:02,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:40:02,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:40:02,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:40:03,568 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.946e+02 2.166e+02 2.855e+02 3.639e+02, threshold=4.333e+02, percent-clipped=0.0 2023-10-02 13:40:03,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:40:03,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:05,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:40:05,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:06,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:08,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:40:09,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 13:40:12,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:40:14,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:16,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:16,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:40:20,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:23,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:40:23,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:24,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:40:24,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:40:29,535 INFO [train.py:1046] (3/4) Epoch 26, batch 2050, loss[loss=0.1779, simple_loss=0.2561, pruned_loss=0.04982, over 23383.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2457, pruned_loss=0.04581, over 4717927.05 frames. ], batch size: 93, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:40:29,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:30,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:32,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:40:33,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:36,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:40:39,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:40:41,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:40:42,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:40:43,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 13:40:43,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:40:44,646 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.31 vs. limit=15.0 2023-10-02 13:40:45,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:40:46,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:40:56,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:40:56,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:40:58,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 13:41:00,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:41:02,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 13:41:02,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:41:02,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=899153.3333333334, ans=0.1 2023-10-02 13:41:04,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:41:06,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:41:06,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:41:07,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:41:10,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:41:10,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:41:12,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:41:15,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:41:17,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:41:19,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:41:21,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:41:24,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=899220.0, ans=0.0 2023-10-02 13:41:25,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:41:30,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:41:31,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 13:41:32,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=899286.6666666666, ans=0.1 2023-10-02 13:41:36,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:41:37,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:41:38,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:41:40,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 13:41:40,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=899286.6666666666, ans=0.0 2023-10-02 13:41:40,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=899286.6666666666, ans=0.0 2023-10-02 13:41:43,098 INFO [train.py:1046] (3/4) Epoch 26, batch 2100, loss[loss=0.1842, simple_loss=0.2609, pruned_loss=0.0537, over 23582.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2441, pruned_loss=0.04542, over 4708937.91 frames. ], batch size: 85, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:41:45,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 13:41:45,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:41:45,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:41:46,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:41:46,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:41:46,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 13:41:47,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 13:41:47,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:41:51,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:41:52,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:41:54,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:41:55,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:41:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 13:41:57,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:41:57,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 13:41:57,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 13:41:58,794 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.04 vs. limit=12.0 2023-10-02 13:41:59,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.07 vs. limit=15.0 2023-10-02 13:42:00,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:00,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:42:00,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 13:42:01,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 13:42:07,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 13:42:07,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:42:08,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:42:09,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:42:14,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:42:14,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 13:42:14,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:14,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 13:42:17,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 13:42:17,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:17,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 13:42:17,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 13:42:18,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 13:42:21,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:42:23,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:42:26,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:42:26,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 13:42:28,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:29,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:29,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 13:42:29,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:29,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:31,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:31,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 13:42:32,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 13:42:33,882 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.840e+02 2.077e+02 2.431e+02 3.502e+02, threshold=4.154e+02, percent-clipped=0.0 2023-10-02 13:42:33,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 13:42:36,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:42:39,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:42:39,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 13:42:44,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:47,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:42:48,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:42:48,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:42:48,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 13:42:48,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:42:49,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:42:49,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:42:51,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:42:51,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:42:53,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 13:42:54,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 13:42:54,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:42:57,301 INFO [train.py:1046] (3/4) Epoch 26, batch 2150, loss[loss=0.1639, simple_loss=0.2538, pruned_loss=0.03701, over 24546.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2429, pruned_loss=0.04465, over 4716870.69 frames. ], batch size: 71, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:42:57,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=899686.6666666666, ans=0.09899494936611666 2023-10-02 13:42:58,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:42:58,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:42:58,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:42:58,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:43:03,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 13:43:05,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:43:07,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:09,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:43:09,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:11,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:43:11,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=899753.3333333334, ans=0.0 2023-10-02 13:43:12,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:13,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:43:13,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:43:16,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=899753.3333333334, ans=0.0 2023-10-02 13:43:17,173 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.77 vs. limit=6.0 2023-10-02 13:43:18,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:18,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 13:43:23,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:43:25,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:43:25,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:26,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:43:26,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:26,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:43:26,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:43:28,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:43:28,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:43:30,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 13:43:31,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:43:32,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:33,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:43:35,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:43:35,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:43:38,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:43:38,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:43:40,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:43:40,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 13:43:41,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 13:43:44,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:43:45,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:46,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:43:49,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:43:49,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:43:49,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:49,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 13:43:50,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 13:43:51,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:43:51,933 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 13:43:53,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:43:53,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:43:55,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 13:43:55,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:43:55,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 13:43:55,225 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 13:43:55,225 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 13:43:56,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 13:43:56,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:43:59,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:43:59,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:43:59,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:43:59,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=899953.3333333334, ans=0.0 2023-10-02 13:44:01,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 13:44:02,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:44:02,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:11,260 INFO [train.py:1046] (3/4) Epoch 26, batch 2200, loss[loss=0.1832, simple_loss=0.2485, pruned_loss=0.05889, over 23790.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2436, pruned_loss=0.0449, over 4715438.28 frames. ], batch size: 232, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:44:11,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:44:12,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 13:44:17,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:44:21,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:22,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:44:22,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:44:24,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 13:44:26,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:44:26,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:44:26,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 13:44:29,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=900086.6666666666, ans=10.0 2023-10-02 13:44:30,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 13:44:32,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:44:33,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=900086.6666666666, ans=0.0 2023-10-02 13:44:38,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 13:44:38,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=900086.6666666666, ans=0.125 2023-10-02 13:44:41,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:41,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:44:42,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:44:45,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:44:45,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 13:44:50,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:44:51,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:44:51,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 13:44:57,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:44:57,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:44:58,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:45:00,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:01,646 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.763e+02 1.852e+02 2.075e+02 2.576e+02, threshold=3.704e+02, percent-clipped=0.0 2023-10-02 13:45:01,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 13:45:03,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:45:05,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 13:45:08,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:08,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 13:45:08,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:11,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:45:11,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:45:11,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:45:11,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:45:11,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=900286.6666666666, ans=0.125 2023-10-02 13:45:12,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:45:12,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:45:15,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:45:16,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 13:45:17,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:45:19,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:45:19,870 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 13:45:22,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:45:22,592 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 13:45:22,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:45:24,567 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 13:45:25,729 INFO [train.py:1046] (3/4) Epoch 26, batch 2250, loss[loss=0.2203, simple_loss=0.2796, pruned_loss=0.08047, over 19389.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.245, pruned_loss=0.04537, over 4710384.80 frames. ], batch size: 388, lr: 3.95e-03, grad_scale: 8.0 2023-10-02 13:45:25,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:45:27,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:45:27,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:45:28,624 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 13:45:29,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:45:30,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=900353.3333333334, ans=0.125 2023-10-02 13:45:31,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:45:36,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:45:37,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:45:39,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=900420.0, ans=0.0 2023-10-02 13:45:40,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:45:41,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:45:42,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:45:42,597 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.42 vs. limit=15.0 2023-10-02 13:45:46,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 13:45:46,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:45:46,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:45:47,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 13:45:49,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:45:49,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:45:50,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 13:45:55,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:45:55,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 13:45:56,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:45:57,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=900486.6666666666, ans=0.0 2023-10-02 13:45:58,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 13:45:59,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:46:01,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:46:05,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:46:05,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:46:06,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=900486.6666666666, ans=0.125 2023-10-02 13:46:08,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=900486.6666666666, ans=0.1 2023-10-02 13:46:09,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:46:09,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:46:10,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:46:11,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:46:16,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:46:17,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=900553.3333333334, ans=0.125 2023-10-02 13:46:19,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 13:46:22,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:46:23,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:46:23,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:46:28,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 13:46:31,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:46:31,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 13:46:31,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:46:32,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:46:35,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 13:46:38,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:46:38,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:46:39,669 INFO [train.py:1046] (3/4) Epoch 26, batch 2300, loss[loss=0.1398, simple_loss=0.2176, pruned_loss=0.03096, over 24286.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2456, pruned_loss=0.0459, over 4701703.79 frames. ], batch size: 56, lr: 3.94e-03, grad_scale: 8.0 2023-10-02 13:46:44,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:46:44,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:46:46,968 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 13:46:49,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:46:57,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:46:57,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:46:57,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:46:57,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:46:57,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 13:46:57,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=900753.3333333334, ans=0.125 2023-10-02 13:46:59,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:47:01,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:47:01,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:47:05,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:47:08,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:47:09,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:47:15,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:47:15,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:47:18,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:47:20,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:47:24,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:47:24,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:47:26,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:47:26,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 13:47:30,973 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 2.002e+02 2.209e+02 2.529e+02 4.134e+02, threshold=4.417e+02, percent-clipped=1.0 2023-10-02 13:47:32,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 13:47:32,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:47:32,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:47:32,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:47:32,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:47:33,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 13:47:33,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 13:47:33,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 13:47:33,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:47:34,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=900886.6666666666, ans=0.0 2023-10-02 13:47:35,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:47:35,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 13:47:35,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=900886.6666666666, ans=0.2 2023-10-02 13:47:40,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:47:45,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:47:48,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:47:48,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:47:48,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 13:47:49,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:47:49,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:47:51,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:47:51,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 13:47:53,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=901020.0, ans=0.125 2023-10-02 13:47:54,267 INFO [train.py:1046] (3/4) Epoch 26, batch 2350, loss[loss=0.1782, simple_loss=0.2402, pruned_loss=0.05808, over 23807.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2462, pruned_loss=0.04585, over 4701545.97 frames. ], batch size: 164, lr: 3.94e-03, grad_scale: 8.0 2023-10-02 13:47:58,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:47:58,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 13:48:03,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 13:48:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:48:10,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:48:10,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:48:10,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:48:11,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:48:11,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 13:48:12,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.62 vs. limit=15.0 2023-10-02 13:48:15,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:48:20,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 13:48:22,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:48:22,762 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.33 vs. limit=22.5 2023-10-02 13:48:25,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:48:26,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:48:29,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:48:29,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 13:48:30,446 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.39 vs. limit=15.0 2023-10-02 13:48:31,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:48:33,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:48:33,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:48:35,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:48:37,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:48:40,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 13:48:40,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:48:43,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:48:45,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:48:46,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 13:48:47,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:48:49,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 13:48:49,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:48:53,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 13:48:58,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 13:48:58,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:48:58,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 13:48:58,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=901286.6666666666, ans=0.0 2023-10-02 13:48:59,524 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 13:48:59,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 13:49:02,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 13:49:05,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:49:08,598 INFO [train.py:1046] (3/4) Epoch 26, batch 2400, loss[loss=0.1514, simple_loss=0.2332, pruned_loss=0.03483, over 24336.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2466, pruned_loss=0.04563, over 4709434.11 frames. ], batch size: 61, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:49:10,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:49:14,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:49:15,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:49:15,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 13:49:17,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 13:49:22,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 13:49:22,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:49:25,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 13:49:25,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:49:27,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:49:27,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 13:49:32,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=901420.0, ans=0.2 2023-10-02 13:49:33,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:49:34,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 13:49:40,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:49:43,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 13:49:46,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:49:49,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:49:52,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:49:52,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 13:49:52,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 13:49:58,274 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.892e+02 2.173e+02 2.694e+02 4.951e+02, threshold=4.347e+02, percent-clipped=1.0 2023-10-02 13:50:00,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:50:02,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:50:03,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:05,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:50:05,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 13:50:06,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:50:06,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:50:06,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:50:06,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 13:50:10,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=901620.0, ans=0.0 2023-10-02 13:50:11,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:50:11,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 13:50:11,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 13:50:13,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 13:50:15,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:50:15,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:50:15,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=901620.0, ans=0.0 2023-10-02 13:50:16,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 13:50:16,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 13:50:16,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 13:50:16,598 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 13:50:17,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 13:50:19,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:50:20,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:50:20,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:50:22,014 INFO [train.py:1046] (3/4) Epoch 26, batch 2450, loss[loss=0.1736, simple_loss=0.2592, pruned_loss=0.04405, over 24350.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2444, pruned_loss=0.04513, over 4702429.41 frames. ], batch size: 77, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:50:22,091 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 13:50:22,588 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.83 vs. limit=15.0 2023-10-02 13:50:23,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:50:23,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:50:28,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:50:28,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:50:32,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:32,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:50:33,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 13:50:38,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:50:38,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:42,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:50:42,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:50:42,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:50:42,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 13:50:48,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:50:50,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:50:50,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:50:54,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 13:50:54,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:50:55,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:50:55,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:50:55,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=901820.0, ans=0.125 2023-10-02 13:50:58,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 13:50:59,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:51:05,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:51:06,784 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.17 vs. limit=15.0 2023-10-02 13:51:07,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:51:07,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:51:07,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:51:09,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:51:10,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:51:12,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 13:51:13,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 13:51:13,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:51:14,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=901886.6666666666, ans=0.125 2023-10-02 13:51:15,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=901886.6666666666, ans=0.0 2023-10-02 13:51:16,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:51:16,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:51:21,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:51:21,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 13:51:22,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:51:22,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:51:23,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 13:51:23,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:51:25,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:51:29,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:51:32,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:51:32,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:51:34,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 13:51:35,261 INFO [train.py:1046] (3/4) Epoch 26, batch 2500, loss[loss=0.1793, simple_loss=0.2607, pruned_loss=0.04896, over 24352.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2437, pruned_loss=0.04475, over 4700856.29 frames. ], batch size: 77, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:51:35,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 13:51:41,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:51:42,759 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.47 vs. limit=15.0 2023-10-02 13:51:46,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=902020.0, ans=0.0 2023-10-02 13:51:49,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:51:49,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:51:50,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:51:50,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 13:51:52,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=902086.6666666666, ans=0.0 2023-10-02 13:51:55,895 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.06 vs. limit=22.5 2023-10-02 13:51:58,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:51:58,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:51:59,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:51:59,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 13:51:59,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 13:52:01,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:02,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:52:02,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 13:52:03,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:04,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 13:52:04,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:52:08,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=902153.3333333334, ans=15.0 2023-10-02 13:52:08,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:52:08,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:52:08,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=902153.3333333334, ans=0.0 2023-10-02 13:52:11,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:52:13,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 13:52:14,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:52:14,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=902153.3333333334, ans=0.125 2023-10-02 13:52:16,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:16,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=902153.3333333334, ans=0.0 2023-10-02 13:52:20,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:52:24,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:52:25,910 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.808e+02 1.922e+02 2.142e+02 2.952e+02, threshold=3.844e+02, percent-clipped=0.0 2023-10-02 13:52:26,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:52:31,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 13:52:35,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 13:52:35,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:52:35,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:52:36,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 13:52:36,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 13:52:38,298 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 13:52:38,299 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 13:52:38,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 13:52:39,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:52:42,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 13:52:42,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 13:52:43,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:52:44,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 13:52:44,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=902286.6666666666, ans=0.125 2023-10-02 13:52:47,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 13:52:50,319 INFO [train.py:1046] (3/4) Epoch 26, batch 2550, loss[loss=0.194, simple_loss=0.2646, pruned_loss=0.06172, over 23812.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2443, pruned_loss=0.0451, over 4697184.24 frames. ], batch size: 212, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:52:50,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:52:53,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:52:53,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:52:55,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:52:59,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 13:52:59,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:53:01,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 13:53:03,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:53:06,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:08,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:53:08,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 13:53:08,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:53:09,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:53:09,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:53:12,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:53:12,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 13:53:13,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 13:53:13,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:13,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 13:53:18,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=902486.6666666666, ans=0.2 2023-10-02 13:53:18,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.48 vs. limit=22.5 2023-10-02 13:53:25,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 13:53:30,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:53:30,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:30,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:53:32,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 13:53:40,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:53:41,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 13:53:41,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:53:41,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 13:53:41,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 13:53:43,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 13:53:46,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:53:47,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:52,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:53:53,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 13:53:53,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:53:53,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:53:53,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 13:53:55,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 13:53:56,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.11 vs. limit=10.0 2023-10-02 13:53:56,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:03,848 INFO [train.py:1046] (3/4) Epoch 26, batch 2600, loss[loss=0.1719, simple_loss=0.2556, pruned_loss=0.04407, over 23731.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2453, pruned_loss=0.04556, over 4707406.14 frames. ], batch size: 85, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:54:03,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:54:07,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:09,008 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 13:54:11,749 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 13:54:11,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:54:11,805 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 13:54:13,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 13:54:13,190 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 13:54:14,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=902686.6666666666, ans=0.125 2023-10-02 13:54:15,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:54:15,895 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 13:54:17,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 13:54:17,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=902753.3333333334, ans=0.0 2023-10-02 13:54:19,110 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 13:54:20,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:54:21,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 13:54:23,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 13:54:25,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 13:54:25,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 13:54:26,641 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 13:54:26,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 13:54:34,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:54:34,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:36,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:54:36,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 13:54:40,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 13:54:40,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_na.min_abs, batch_count=902820.0, ans=0.02 2023-10-02 13:54:44,374 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 13:54:44,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=902820.0, ans=0.125 2023-10-02 13:54:50,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:54:50,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:54:50,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=902886.6666666666, ans=0.125 2023-10-02 13:54:51,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 13:54:53,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:54:53,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:54:53,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 13:54:54,629 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.918e+02 2.069e+02 2.446e+02 3.571e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-02 13:54:54,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:54:56,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:54:59,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:55:02,050 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 13:55:02,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:55:02,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 13:55:05,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=902953.3333333334, ans=0.125 2023-10-02 13:55:07,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:55:09,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 13:55:09,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 13:55:10,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:55:12,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:55:14,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:55:18,195 INFO [train.py:1046] (3/4) Epoch 26, batch 2650, loss[loss=0.1805, simple_loss=0.2498, pruned_loss=0.05561, over 23833.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2459, pruned_loss=0.04568, over 4721935.34 frames. ], batch size: 195, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:55:19,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 13:55:19,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:55:22,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:55:26,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 13:55:26,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:55:26,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 13:55:28,841 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 13:55:28,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:55:31,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:55:33,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 13:55:34,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:55:37,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:55:37,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 13:55:37,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:55:37,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:55:42,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 13:55:42,816 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 13:55:44,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:55:46,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 13:55:46,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:55:46,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 13:55:51,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:55:51,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 13:55:51,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:55:52,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:55:52,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=903153.3333333334, ans=0.1 2023-10-02 13:55:56,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 13:55:56,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 13:55:59,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:56:03,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 13:56:03,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:56:04,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:04,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:56:05,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:56:05,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:56:07,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:56:09,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:56:11,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:56:12,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:56:13,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 13:56:15,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:16,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:56:16,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:19,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:56:19,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 13:56:22,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:23,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 13:56:23,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:23,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 13:56:24,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=903286.6666666666, ans=0.125 2023-10-02 13:56:25,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=903286.6666666666, ans=0.125 2023-10-02 13:56:27,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:56:29,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:29,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:31,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:33,058 INFO [train.py:1046] (3/4) Epoch 26, batch 2700, loss[loss=0.141, simple_loss=0.2222, pruned_loss=0.02996, over 21586.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2473, pruned_loss=0.04657, over 4706953.64 frames. ], batch size: 47, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:56:33,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 13:56:33,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:35,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:56:35,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 13:56:39,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:56:40,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 13:56:40,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=903353.3333333334, ans=0.1 2023-10-02 13:56:42,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:56:42,999 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.96 vs. limit=15.0 2023-10-02 13:56:43,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:43,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:56:43,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:56:43,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:56:43,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 13:56:45,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 13:56:45,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 13:56:45,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 13:56:48,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:56:49,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:56:49,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:56:49,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=903420.0, ans=0.125 2023-10-02 13:56:50,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=903420.0, ans=0.1 2023-10-02 13:56:53,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 13:56:53,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 13:56:53,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:56:57,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 13:56:57,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:57:03,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:57:03,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:57:03,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 13:57:05,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 13:57:08,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:57:11,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:57:11,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 13:57:11,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:57:14,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.82 vs. limit=15.0 2023-10-02 13:57:16,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:57:16,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 13:57:22,900 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.793e+02 2.064e+02 2.296e+02 3.697e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-02 13:57:23,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:57:24,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:57:27,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 13:57:27,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:57:31,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:57:33,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:57:34,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:57:36,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:36,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:57:38,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:57:41,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 13:57:43,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:57:43,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:57:44,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 13:57:46,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:57:47,401 INFO [train.py:1046] (3/4) Epoch 26, batch 2750, loss[loss=0.1588, simple_loss=0.2341, pruned_loss=0.04177, over 23631.00 frames. ], tot_loss[loss=0.1707, simple_loss=0.2475, pruned_loss=0.04694, over 4701387.09 frames. ], batch size: 149, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 13:57:48,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 13:57:48,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 13:57:50,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 13:57:50,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:57:52,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:57:54,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:57:55,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:55,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 13:57:55,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:58,031 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.94 vs. limit=15.0 2023-10-02 13:57:59,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:57:59,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 13:57:59,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:57:59,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:57:59,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 13:57:59,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:57:59,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 13:58:06,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 13:58:07,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:58:07,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:58:08,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=903753.3333333334, ans=0.1 2023-10-02 13:58:09,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:58:09,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 13:58:10,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:58:10,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=903753.3333333334, ans=0.125 2023-10-02 13:58:10,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=903753.3333333334, ans=0.125 2023-10-02 13:58:11,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:58:11,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:58:12,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:58:15,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 13:58:15,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 13:58:16,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 13:58:17,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=903820.0, ans=0.1 2023-10-02 13:58:18,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:58:19,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 13:58:25,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:58:27,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 13:58:27,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:58:33,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:58:33,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 13:58:33,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 13:58:39,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 13:58:39,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 13:58:39,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 13:58:43,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:58:44,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 13:58:47,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=903953.3333333334, ans=0.125 2023-10-02 13:58:48,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 13:58:49,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 13:58:51,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 13:58:51,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:58:52,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 13:58:54,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 13:58:54,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 13:58:57,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 13:58:57,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:58:58,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:59:00,220 INFO [train.py:1046] (3/4) Epoch 26, batch 2800, loss[loss=0.1652, simple_loss=0.2362, pruned_loss=0.04708, over 23685.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2462, pruned_loss=0.04673, over 4693468.99 frames. ], batch size: 135, lr: 3.94e-03, grad_scale: 32.0 2023-10-02 13:59:00,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 13:59:00,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:00,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:59:04,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:04,824 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 13:59:04,824 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 13:59:08,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:59:09,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 13:59:09,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 13:59:13,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 13:59:16,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 13:59:17,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 13:59:19,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 13:59:20,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:59:20,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 13:59:20,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:59:23,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=904086.6666666666, ans=0.1 2023-10-02 13:59:24,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:59:24,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:59:24,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 13:59:26,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 13:59:32,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=904153.3333333334, ans=0.035 2023-10-02 13:59:32,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=904153.3333333334, ans=0.09899494936611666 2023-10-02 13:59:33,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 13:59:35,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 13:59:39,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:39,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 13:59:41,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 13:59:44,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:59:45,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 13:59:45,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:59:46,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 13:59:46,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 13:59:50,941 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.855e+02 1.985e+02 2.213e+02 3.780e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-02 13:59:51,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 13:59:51,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:55,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 13:59:56,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 13:59:56,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 13:59:56,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 13:59:56,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 13:59:58,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 13:59:59,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 13:59:59,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 13:59:59,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:00,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:00:01,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:02,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 14:00:04,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:00:04,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:00:05,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:00:07,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 14:00:10,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:00:12,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:00:12,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:00:13,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:00:14,811 INFO [train.py:1046] (3/4) Epoch 26, batch 2850, loss[loss=0.1605, simple_loss=0.2092, pruned_loss=0.05592, over 19262.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2441, pruned_loss=0.04655, over 4663940.71 frames. ], batch size: 388, lr: 3.94e-03, grad_scale: 32.0 2023-10-02 14:00:16,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:00:16,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:00:16,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:00:20,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:00:20,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:00:20,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=904353.3333333334, ans=0.0 2023-10-02 14:00:23,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:00:23,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 14:00:27,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=904420.0, ans=0.125 2023-10-02 14:00:29,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=904420.0, ans=0.0 2023-10-02 14:00:31,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 14:00:31,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:00:32,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=904420.0, ans=0.1 2023-10-02 14:00:33,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 14:00:33,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:33,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=904420.0, ans=0.1 2023-10-02 14:00:35,903 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.31 vs. limit=10.0 2023-10-02 14:00:35,984 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.14 vs. limit=15.0 2023-10-02 14:00:36,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 14:00:37,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 14:00:39,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:00:50,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:00:51,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=904486.6666666666, ans=0.0 2023-10-02 14:00:53,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:00:53,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:00:53,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:00:53,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:00:53,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:00:54,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:00:54,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 14:00:57,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:00:57,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:00:58,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:01:00,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:02,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:01:02,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:01:03,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:06,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:01:07,237 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:01:08,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:01:08,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:09,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:11,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:01:15,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:01:17,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 14:01:18,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 14:01:18,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:01:19,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:01:21,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 14:01:21,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:01:22,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:01:22,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:01:22,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:01:22,647 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 14:01:22,679 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 14:01:22,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:01:24,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:28,029 INFO [train.py:1046] (3/4) Epoch 26, batch 2900, loss[loss=0.1833, simple_loss=0.2519, pruned_loss=0.05732, over 23615.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2447, pruned_loss=0.04642, over 4685597.07 frames. ], batch size: 256, lr: 3.94e-03, grad_scale: 32.0 2023-10-02 14:01:28,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 14:01:28,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:01:29,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:01:29,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 14:01:32,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:32,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 14:01:34,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 14:01:37,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:01:37,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:01:37,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=904686.6666666666, ans=0.125 2023-10-02 14:01:40,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:01:44,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:01:46,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:01:46,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:01:50,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:01:50,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 14:01:52,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:01:53,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:01:55,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 14:01:55,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 14:01:57,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:01:57,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 14:01:57,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:02:00,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:02:00,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 14:02:03,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:02:03,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:02:06,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:02:09,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:02:09,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 14:02:11,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 14:02:11,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:02:15,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:02:18,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 14:02:19,822 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.775e+02 2.023e+02 2.342e+02 3.264e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-02 14:02:19,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:02:22,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:02:23,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=904886.6666666666, ans=0.1 2023-10-02 14:02:32,448 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.23 vs. limit=15.0 2023-10-02 14:02:33,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:02:33,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:02:34,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 14:02:37,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:02:37,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 14:02:37,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:02:38,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:02:38,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=904953.3333333334, ans=0.125 2023-10-02 14:02:40,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=905020.0, ans=0.0 2023-10-02 14:02:41,849 INFO [train.py:1046] (3/4) Epoch 26, batch 2950, loss[loss=0.1776, simple_loss=0.2607, pruned_loss=0.04722, over 23213.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2452, pruned_loss=0.04645, over 4681094.82 frames. ], batch size: 105, lr: 3.94e-03, grad_scale: 16.0 2023-10-02 14:02:43,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:02:45,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 14:02:47,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:02:47,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:02:47,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:02:48,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:02:49,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 14:02:51,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 14:02:51,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:02:51,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:02:56,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:02:58,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:03:00,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:00,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:03:04,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:03:04,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:03:04,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=905086.6666666666, ans=0.1 2023-10-02 14:03:07,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:03:07,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:03:07,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:03:11,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 14:03:15,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 14:03:15,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 14:03:17,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:03:18,446 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 14:03:19,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 14:03:19,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:03:19,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:03:19,899 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 14:03:19,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:03:21,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.57 vs. limit=22.5 2023-10-02 14:03:22,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 14:03:23,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:03:24,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:03:26,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:03:28,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:03:28,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:03:28,168 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 14:03:28,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:03:29,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 14:03:29,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=905220.0, ans=0.1 2023-10-02 14:03:35,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:03:36,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:03:38,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 14:03:38,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:03:39,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 14:03:41,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:03:43,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:03:43,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:03:44,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:03:44,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 14:03:48,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:03:48,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:48,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:03:49,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:03:49,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:03:50,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:03:53,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:53,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 14:03:55,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:03:56,578 INFO [train.py:1046] (3/4) Epoch 26, batch 3000, loss[loss=0.1555, simple_loss=0.2291, pruned_loss=0.04099, over 24426.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2459, pruned_loss=0.04613, over 4688138.30 frames. ], batch size: 58, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:03:56,578 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 14:04:08,891 INFO [train.py:1078] (3/4) Epoch 26, validation: loss=0.3521, simple_loss=0.2784, pruned_loss=0.2129, over 1125622.00 frames. 2023-10-02 14:04:08,891 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 14:04:10,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:04:11,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:04:13,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=905353.3333333334, ans=0.125 2023-10-02 14:04:17,137 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 14:04:17,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 14:04:19,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:04:19,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:04:20,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 14:04:20,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:04:26,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:04:34,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:04:39,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 14:04:40,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:04:44,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:04:46,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:04:46,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:04:48,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:04:48,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 14:04:50,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 14:04:52,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:04:52,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:04:53,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=905553.3333333334, ans=0.125 2023-10-02 14:04:54,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:04:55,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:04:56,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:04:56,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:04:57,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:04:59,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:04:59,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:05:00,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:05:00,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=905553.3333333334, ans=0.1 2023-10-02 14:05:03,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=905553.3333333334, ans=0.2 2023-10-02 14:05:03,752 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.996e+02 2.253e+02 2.544e+02 4.342e+02, threshold=4.506e+02, percent-clipped=3.0 2023-10-02 14:05:03,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 14:05:03,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:05:05,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:05,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:05:09,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:05:09,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:05:10,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 14:05:10,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 14:05:10,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:05:10,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 14:05:12,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:05:14,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=905620.0, ans=0.2 2023-10-02 14:05:15,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 14:05:19,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:05:20,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:05:20,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 14:05:21,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 14:05:21,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 14:05:23,232 INFO [train.py:1046] (3/4) Epoch 26, batch 3050, loss[loss=0.1714, simple_loss=0.2569, pruned_loss=0.04298, over 24642.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2471, pruned_loss=0.04636, over 4693493.36 frames. ], batch size: 73, lr: 3.93e-03, grad_scale: 4.0 2023-10-02 14:05:23,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:05:24,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:05:24,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:05:24,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:24,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=905686.6666666666, ans=0.1 2023-10-02 14:05:26,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:05:27,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 14:05:30,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:05:31,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:05:32,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:05:35,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:38,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 14:05:43,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 14:05:44,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 14:05:44,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:05:49,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:05:51,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=905820.0, ans=0.07 2023-10-02 14:05:51,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=905820.0, ans=0.0 2023-10-02 14:05:54,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:05:54,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:05:54,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:05:57,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:05:57,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:05:57,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=905820.0, ans=0.0 2023-10-02 14:05:58,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:05:58,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:05:58,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:06:00,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:06:01,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:02,442 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-10-02 14:06:03,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:06:03,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 14:06:04,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:06:04,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:06:07,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:06:07,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:06:07,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:06:08,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:13,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:06:14,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:19,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:19,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:06:19,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:06:21,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:06:22,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:06:22,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:06:24,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 14:06:24,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:06:24,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:25,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 14:06:26,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:27,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=905953.3333333334, ans=0.125 2023-10-02 14:06:29,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=905953.3333333334, ans=0.0 2023-10-02 14:06:32,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:06:32,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=905953.3333333334, ans=0.125 2023-10-02 14:06:33,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:06:36,517 INFO [train.py:1046] (3/4) Epoch 26, batch 3100, loss[loss=0.1571, simple_loss=0.2433, pruned_loss=0.03547, over 24481.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2463, pruned_loss=0.04597, over 4691277.70 frames. ], batch size: 66, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:06:37,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:06:38,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 14:06:39,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 14:06:41,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 14:06:42,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:06:47,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:06:47,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:50,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 14:06:53,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:06:56,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=906086.6666666666, ans=0.0 2023-10-02 14:06:59,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 14:07:04,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:07:04,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:04,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:07:06,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:07:07,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 14:07:09,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:07:09,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 14:07:09,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:07:10,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:07:12,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 14:07:13,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:07:15,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=906153.3333333334, ans=0.125 2023-10-02 14:07:16,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:07:16,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 14:07:16,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 14:07:18,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:18,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:07:20,919 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.44 vs. limit=10.0 2023-10-02 14:07:21,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:07:21,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:22,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:07:23,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:07:23,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:07:24,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:07:24,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:07:26,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:26,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 14:07:30,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:07:31,915 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.814e+02 2.073e+02 2.408e+02 3.405e+02, threshold=4.147e+02, percent-clipped=0.0 2023-10-02 14:07:32,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 14:07:34,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:07:36,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 14:07:36,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:07:37,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:37,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 14:07:39,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=906286.6666666666, ans=0.125 2023-10-02 14:07:48,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 14:07:51,600 INFO [train.py:1046] (3/4) Epoch 26, batch 3150, loss[loss=0.1584, simple_loss=0.2406, pruned_loss=0.03813, over 24441.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2448, pruned_loss=0.04567, over 4683529.15 frames. ], batch size: 63, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:07:51,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:07:51,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:07:54,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:07:54,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:07:54,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 14:07:56,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:07:57,193 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.42 vs. limit=10.0 2023-10-02 14:07:57,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:07:58,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=906353.3333333334, ans=0.125 2023-10-02 14:07:59,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 14:08:00,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:08:02,154 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 14:08:04,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 14:08:05,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:08:05,091 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 14:08:06,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 14:08:06,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=906420.0, ans=0.0 2023-10-02 14:08:07,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 14:08:09,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 14:08:09,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 14:08:09,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:08:09,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:08:10,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:08:12,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 14:08:13,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:08:13,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:08:14,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:08:17,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:08:20,475 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.71 vs. limit=15.0 2023-10-02 14:08:21,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 14:08:22,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:08:25,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:08:26,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:08:26,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 14:08:29,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 14:08:30,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:08:30,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:08:31,496 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.18 vs. limit=15.0 2023-10-02 14:08:32,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:08:32,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:08:32,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:08:33,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:08:33,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:08:35,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=906553.3333333334, ans=0.125 2023-10-02 14:08:36,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 14:08:36,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:08:36,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:08:36,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=906553.3333333334, ans=0.07 2023-10-02 14:08:37,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:08:37,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:08:38,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 14:08:38,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:08:42,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 14:08:42,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:08:43,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 14:08:43,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 14:08:45,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:08:46,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:08:46,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 14:08:46,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 14:08:47,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.57 vs. limit=22.5 2023-10-02 14:08:48,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:08:51,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:08:52,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:08:52,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:08:57,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:08:58,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:09:00,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 14:09:05,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=12.0 2023-10-02 14:09:08,282 INFO [train.py:1046] (3/4) Epoch 26, batch 3200, loss[loss=0.1541, simple_loss=0.2267, pruned_loss=0.04077, over 23798.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2428, pruned_loss=0.04487, over 4680984.58 frames. ], batch size: 232, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:09:08,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:09:08,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 14:09:11,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:09:11,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:09:11,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 14:09:14,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:09:19,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:09:22,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:09:25,477 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.01 vs. limit=6.0 2023-10-02 14:09:25,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.27 vs. limit=15.0 2023-10-02 14:09:29,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:09:36,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 14:09:38,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:09:41,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 14:09:41,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:09:44,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:09:44,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:09:46,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:09:47,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=906820.0, ans=0.125 2023-10-02 14:09:49,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 14:09:50,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 14:09:51,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 14:09:54,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 14:09:56,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:10:01,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:10:01,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:10:01,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:10:02,458 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 14:10:02,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:10:03,776 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.962e+02 2.253e+02 2.668e+02 3.638e+02, threshold=4.506e+02, percent-clipped=0.0 2023-10-02 14:10:07,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:10:07,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=906953.3333333334, ans=0.125 2023-10-02 14:10:09,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 14:10:09,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 14:10:11,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 14:10:13,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 14:10:14,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:10:17,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:10:17,197 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 14:10:17,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:10:17,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:19,911 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 14:10:22,562 INFO [train.py:1046] (3/4) Epoch 26, batch 3250, loss[loss=0.1693, simple_loss=0.2621, pruned_loss=0.0382, over 24680.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2434, pruned_loss=0.04471, over 4695634.46 frames. ], batch size: 73, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:10:24,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:10:26,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:10:33,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:10:33,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 14:10:35,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:10:35,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:10:35,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:10:36,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:10:38,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:10:41,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:41,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:10:42,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:10:42,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:42,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:42,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:10:46,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:10:46,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:10:46,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=907086.6666666666, ans=0.0 2023-10-02 14:10:48,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:10:48,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:10:50,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:10:51,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:10:51,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:10:54,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 14:10:55,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:10:55,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:10:57,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:10:58,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:11:05,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:11:09,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=907220.0, ans=0.2 2023-10-02 14:11:12,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:11:12,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=907220.0, ans=0.125 2023-10-02 14:11:13,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:13,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 14:11:13,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:11:13,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 14:11:15,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:16,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 14:11:18,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 14:11:18,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:11:19,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:11:21,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:11:21,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 14:11:22,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:11:25,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:11:25,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:11:28,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 14:11:28,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:11:28,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=907286.6666666666, ans=0.125 2023-10-02 14:11:31,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:11:31,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 14:11:34,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:11:34,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 14:11:36,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 14:11:36,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=907353.3333333334, ans=0.125 2023-10-02 14:11:37,356 INFO [train.py:1046] (3/4) Epoch 26, batch 3300, loss[loss=0.1658, simple_loss=0.2558, pruned_loss=0.03796, over 24385.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2451, pruned_loss=0.04523, over 4705159.21 frames. ], batch size: 69, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:11:37,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 14:11:37,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:11:40,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:11:41,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:11:41,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=907353.3333333334, ans=0.125 2023-10-02 14:11:43,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:44,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 14:11:44,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:11:47,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:11:48,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:11:52,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=907420.0, ans=0.125 2023-10-02 14:11:54,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 14:11:54,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:11:54,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:11:56,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:11:56,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=907420.0, ans=0.125 2023-10-02 14:11:57,574 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 14:11:57,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:11:58,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:12:00,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:12:00,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:00,418 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 14:12:05,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:12:05,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:12:07,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:07,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 14:12:07,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=907486.6666666666, ans=0.125 2023-10-02 14:12:08,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 14:12:08,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:09,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:12:12,574 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 14:12:14,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 14:12:14,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:12:16,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 14:12:17,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:12:20,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:12:20,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:12:23,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:12:23,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:12:23,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:12:23,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:12:24,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:12:24,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:26,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:12:27,625 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 14:12:28,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 14:12:32,058 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.777e+02 1.941e+02 2.097e+02 3.420e+02, threshold=3.882e+02, percent-clipped=0.0 2023-10-02 14:12:32,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:12:34,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:12:34,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:12:35,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:12:35,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:12:36,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:12:38,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:38,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:12:39,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:12:39,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:12:41,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=907620.0, ans=0.125 2023-10-02 14:12:42,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 14:12:42,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:12:43,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:45,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:12:46,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:12:47,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:12:50,870 INFO [train.py:1046] (3/4) Epoch 26, batch 3350, loss[loss=0.1535, simple_loss=0.2281, pruned_loss=0.0395, over 24570.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2458, pruned_loss=0.04533, over 4719233.44 frames. ], batch size: 60, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:12:50,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:12:50,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:12:53,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:12:55,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:12:57,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:12:58,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:12:58,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=907686.6666666666, ans=0.5 2023-10-02 14:12:58,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=907686.6666666666, ans=0.125 2023-10-02 14:13:01,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:13:02,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:13:03,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:13:04,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 14:13:07,834 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 14:13:07,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:13:10,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 14:13:10,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 14:13:10,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:13:10,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=907753.3333333334, ans=0.2 2023-10-02 14:13:11,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:13:13,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:13,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 14:13:14,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:14,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:13:14,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=907753.3333333334, ans=0.1 2023-10-02 14:13:17,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:18,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:20,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:20,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:13:24,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:13:26,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:28,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:13:32,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:13:32,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:13:34,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=907886.6666666666, ans=0.1 2023-10-02 14:13:35,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:35,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:38,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:40,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 14:13:40,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 14:13:40,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 14:13:40,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:13:42,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 14:13:43,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:13:44,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:13:51,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:13:53,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 14:13:53,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:13:55,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:13:56,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:14:02,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:14:03,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 14:14:05,074 INFO [train.py:1046] (3/4) Epoch 26, batch 3400, loss[loss=0.17, simple_loss=0.2389, pruned_loss=0.05057, over 23616.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2471, pruned_loss=0.04569, over 4724092.54 frames. ], batch size: 256, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:14:05,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:14:05,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:14:06,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:14:07,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 14:14:08,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.23 vs. limit=15.0 2023-10-02 14:14:08,728 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.62 vs. limit=22.5 2023-10-02 14:14:09,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:14:09,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 14:14:10,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:14:10,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:14:10,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:14:11,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:14:11,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 14:14:12,805 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.70 vs. limit=15.0 2023-10-02 14:14:16,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 14:14:16,778 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 14:14:16,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:14:18,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=908086.6666666666, ans=0.0 2023-10-02 14:14:20,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:14:20,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:14:20,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:14:22,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:14:26,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:14:29,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 14:14:34,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:14:37,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:14:38,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:14:38,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 14:14:44,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:14:47,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 14:14:51,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:14:51,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:14:52,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 14:14:52,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:14:54,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:14:54,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:14:55,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:14:57,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:14:58,790 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.837e+02 2.033e+02 2.297e+02 3.671e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-02 14:15:00,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:15:00,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:15:02,777 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.25 vs. limit=15.0 2023-10-02 14:15:06,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:15:06,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 14:15:12,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=908286.6666666666, ans=0.1 2023-10-02 14:15:13,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:15:17,847 INFO [train.py:1046] (3/4) Epoch 26, batch 3450, loss[loss=0.1652, simple_loss=0.2399, pruned_loss=0.04526, over 23637.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.247, pruned_loss=0.04565, over 4715681.80 frames. ], batch size: 149, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:15:17,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 14:15:20,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 14:15:20,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:15:22,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:15:22,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 14:15:23,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:15:26,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.35 vs. limit=22.5 2023-10-02 14:15:28,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:15:28,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=908353.3333333334, ans=0.125 2023-10-02 14:15:34,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:15:35,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:15:36,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:15:36,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:15:37,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:15:44,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 14:15:49,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 14:15:50,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:15:50,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:15:51,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:15:55,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 14:15:56,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:16:00,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:16:01,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:16:01,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=908553.3333333334, ans=0.07 2023-10-02 14:16:01,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=908553.3333333334, ans=0.1 2023-10-02 14:16:02,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:16:03,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:16:05,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 14:16:05,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:16:05,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=908553.3333333334, ans=0.125 2023-10-02 14:16:05,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=908553.3333333334, ans=0.2 2023-10-02 14:16:06,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:16:09,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:16:11,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 14:16:14,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:16:17,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=908620.0, ans=0.0 2023-10-02 14:16:19,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:16:21,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:24,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:16:27,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:27,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:16:29,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:16:29,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:16:32,143 INFO [train.py:1046] (3/4) Epoch 26, batch 3500, loss[loss=0.1546, simple_loss=0.2376, pruned_loss=0.0358, over 24478.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2454, pruned_loss=0.04514, over 4714899.57 frames. ], batch size: 63, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:16:33,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:16:38,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:16:39,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 14:16:40,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:16:43,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:16:45,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:16:46,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 14:16:51,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:16:52,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:16:52,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:16:52,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:16:52,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=908753.3333333334, ans=0.0 2023-10-02 14:16:53,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:16:54,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:54,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=908753.3333333334, ans=0.0 2023-10-02 14:16:55,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:16:55,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 14:16:58,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:16:58,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 14:17:00,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:17:04,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:05,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 14:17:06,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:17:07,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=908820.0, ans=0.09899494936611666 2023-10-02 14:17:09,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:17:12,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:17:12,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:12,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.27 vs. limit=10.0 2023-10-02 14:17:13,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:17:13,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:17:15,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 14:17:16,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 14:17:16,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 14:17:16,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:17:19,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:19,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:17:19,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:17:22,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 14:17:23,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:17:25,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=908886.6666666666, ans=0.125 2023-10-02 14:17:28,013 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.927e+02 2.309e+02 3.023e+02 4.699e+02, threshold=4.619e+02, percent-clipped=3.0 2023-10-02 14:17:28,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:17:29,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 14:17:29,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 14:17:29,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:17:33,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:17:34,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:17:34,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:37,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 14:17:38,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:17:40,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:17:41,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 14:17:43,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 14:17:44,935 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.04 vs. limit=22.5 2023-10-02 14:17:45,640 INFO [train.py:1046] (3/4) Epoch 26, batch 3550, loss[loss=0.1568, simple_loss=0.2329, pruned_loss=0.04038, over 22471.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2444, pruned_loss=0.04481, over 4709142.60 frames. ], batch size: 49, lr: 3.93e-03, grad_scale: 8.0 2023-10-02 14:17:45,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:17:47,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:17:47,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:17:47,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:17:50,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:17:56,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=909020.0, ans=0.07 2023-10-02 14:17:57,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:17:59,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 14:18:03,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:18:03,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:18:03,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=909086.6666666666, ans=0.125 2023-10-02 14:18:05,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:05,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:18:06,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:18:08,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=909086.6666666666, ans=0.0 2023-10-02 14:18:09,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:18:09,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:18:09,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:18:09,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 14:18:10,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:18:16,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:18:16,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:18:17,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:18:17,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:18:17,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:18:19,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 14:18:19,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:19,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=909153.3333333334, ans=0.125 2023-10-02 14:18:20,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:20,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 14:18:26,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:18:26,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:18:27,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=909153.3333333334, ans=22.5 2023-10-02 14:18:28,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:18:31,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 14:18:32,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:18:32,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 14:18:32,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:18:36,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:18:36,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:18:38,306 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.38 vs. limit=15.0 2023-10-02 14:18:40,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 14:18:41,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=909220.0, ans=0.125 2023-10-02 14:18:42,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:18:48,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:18:48,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 14:18:48,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:18:52,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:18:53,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.93 vs. limit=15.0 2023-10-02 14:18:54,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 14:18:58,932 INFO [train.py:1046] (3/4) Epoch 26, batch 3600, loss[loss=0.1736, simple_loss=0.2323, pruned_loss=0.0574, over 18965.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2438, pruned_loss=0.04479, over 4706672.91 frames. ], batch size: 388, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:19:00,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 14:19:00,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:19:01,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:19:03,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:19:05,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:19:07,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:19:10,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:19:11,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:13,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:19:13,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:19:13,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=909420.0, ans=0.0 2023-10-02 14:19:14,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:14,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 14:19:14,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=909420.0, ans=0.125 2023-10-02 14:19:18,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:19:18,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:21,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:19:24,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:19:24,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=909420.0, ans=0.125 2023-10-02 14:19:25,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:19:25,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:19:25,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 14:19:27,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:19:30,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:19:30,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:19:30,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=909486.6666666666, ans=0.025 2023-10-02 14:19:31,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:19:34,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:19:35,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=909486.6666666666, ans=0.125 2023-10-02 14:19:36,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:19:36,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 14:19:42,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:19:44,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:19:44,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 14:19:48,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:19:52,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:19:55,939 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.927e+02 2.147e+02 2.530e+02 3.358e+02, threshold=4.294e+02, percent-clipped=0.0 2023-10-02 14:19:56,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:20:00,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:20:00,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:20:00,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 14:20:02,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 14:20:03,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 14:20:07,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:20:07,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:20:07,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 14:20:08,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:20:08,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:20:08,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:20:09,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 14:20:10,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 14:20:12,097 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.95 vs. limit=12.0 2023-10-02 14:20:12,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:20:14,028 INFO [train.py:1046] (3/4) Epoch 26, batch 3650, loss[loss=0.1928, simple_loss=0.2534, pruned_loss=0.06611, over 22824.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.245, pruned_loss=0.04513, over 4720278.19 frames. ], batch size: 322, lr: 3.93e-03, grad_scale: 16.0 2023-10-02 14:20:14,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 14:20:18,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 14:20:19,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:20:22,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 14:20:23,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 14:20:29,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:20:29,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:20:31,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:20:33,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=909753.3333333334, ans=0.1 2023-10-02 14:20:35,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 14:20:35,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:20:35,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 14:20:35,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:20:35,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:20:37,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 14:20:38,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 14:20:38,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:20:38,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:20:41,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:20:42,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 14:20:44,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 14:20:44,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:20:45,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 14:20:48,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:20:48,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:20:53,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:20:53,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=909820.0, ans=0.1 2023-10-02 14:20:55,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:20:55,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:20:56,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:20:58,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:21:00,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:21:02,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:21:04,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:04,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:21:05,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:21:06,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:21:07,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:21:13,385 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 14:21:16,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:21:16,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:21:17,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:21:18,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:21:18,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:21:20,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:21,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 14:21:21,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:21:24,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:21:27,019 INFO [train.py:1046] (3/4) Epoch 26, batch 3700, loss[loss=0.1754, simple_loss=0.2466, pruned_loss=0.0521, over 23859.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.246, pruned_loss=0.04553, over 4716704.55 frames. ], batch size: 212, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:21:27,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:21:28,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:21:31,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:31,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 14:21:31,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:21:31,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:21:33,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:21:37,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:21:39,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=910020.0, ans=0.125 2023-10-02 14:21:41,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:21:41,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:21:41,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:21:42,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:21:42,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 14:21:45,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:21:46,639 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 14:21:46,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=910086.6666666666, ans=0.0 2023-10-02 14:21:52,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:21:52,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:21:53,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:21:53,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 14:21:55,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:21:59,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:21:59,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 14:22:00,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:22:02,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:22:04,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:22:05,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:22:07,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:22:12,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:22:12,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 14:22:13,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:22:13,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 14:22:15,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=910220.0, ans=0.1 2023-10-02 14:22:19,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:22:19,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:22:19,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=910220.0, ans=0.0 2023-10-02 14:22:22,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:22:22,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 14:22:24,735 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.885e+02 2.100e+02 2.312e+02 3.361e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-02 14:22:24,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:22:24,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:22:24,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:22:24,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:22:27,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:22:28,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 14:22:30,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 14:22:31,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:22:31,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:22:32,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:22:33,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=910286.6666666666, ans=0.125 2023-10-02 14:22:33,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=910286.6666666666, ans=0.125 2023-10-02 14:22:34,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:22:38,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:22:41,245 INFO [train.py:1046] (3/4) Epoch 26, batch 3750, loss[loss=0.1811, simple_loss=0.2649, pruned_loss=0.0486, over 23993.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2477, pruned_loss=0.04599, over 4722844.46 frames. ], batch size: 80, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:22:41,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:22:42,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:22:44,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 14:22:45,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 14:22:47,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 14:22:48,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 14:22:48,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:22:49,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:22:51,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:22:52,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:22:56,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:22:59,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:23:00,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:23:02,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:23:03,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:23:05,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 14:23:07,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:23:08,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:23:08,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=910486.6666666666, ans=0.0 2023-10-02 14:23:10,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:23:12,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 14:23:17,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 14:23:19,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:23:19,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:23:20,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:23:20,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=910486.6666666666, ans=0.1 2023-10-02 14:23:20,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=910486.6666666666, ans=0.125 2023-10-02 14:23:26,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:23:27,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 14:23:30,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 14:23:31,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:23:36,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:23:36,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:23:39,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:23:42,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 14:23:44,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:23:46,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:23:47,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:23:50,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:23:50,883 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.72 vs. limit=22.5 2023-10-02 14:23:54,551 INFO [train.py:1046] (3/4) Epoch 26, batch 3800, loss[loss=0.1545, simple_loss=0.229, pruned_loss=0.03997, over 19601.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2474, pruned_loss=0.04592, over 4728259.26 frames. ], batch size: 42, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:23:58,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:24:01,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.27 vs. limit=22.5 2023-10-02 14:24:01,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:01,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 14:24:02,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 14:24:04,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:24:05,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:24:07,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 14:24:08,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 14:24:08,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:10,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:24:11,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:24:11,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:24:13,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:24:13,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 14:24:17,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 14:24:18,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:24:20,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:24:21,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:24:23,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:24:24,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:24:24,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:24:27,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:28,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:24:32,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 14:24:32,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 14:24:34,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:24:40,956 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:24:40,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=910886.6666666666, ans=0.125 2023-10-02 14:24:42,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:24:42,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.09 vs. limit=22.5 2023-10-02 14:24:48,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:24:50,719 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.895e+02 2.051e+02 2.424e+02 3.630e+02, threshold=4.102e+02, percent-clipped=0.0 2023-10-02 14:24:50,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 14:24:50,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 14:24:50,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:24:53,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:24:53,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:24:54,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=910953.3333333334, ans=0.1 2023-10-02 14:24:56,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 14:24:59,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 14:24:59,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 14:24:59,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:25:00,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:25:03,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:25:05,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:25:06,908 INFO [train.py:1046] (3/4) Epoch 26, batch 3850, loss[loss=0.1676, simple_loss=0.2383, pruned_loss=0.04838, over 23730.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.2466, pruned_loss=0.0455, over 4722272.92 frames. ], batch size: 164, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:25:12,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:25:13,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 14:25:14,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:25:14,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:25:18,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:25:19,593 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.33 vs. limit=15.0 2023-10-02 14:25:21,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:25:23,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:25:24,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 14:25:27,056 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.80 vs. limit=15.0 2023-10-02 14:25:28,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:29,733 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.22 vs. limit=15.0 2023-10-02 14:25:31,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:25:34,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:25:35,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:25:37,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:39,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:25:39,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:25:39,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:25:40,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:25:41,522 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.49 vs. limit=15.0 2023-10-02 14:25:42,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=911153.3333333334, ans=0.125 2023-10-02 14:25:43,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:25:43,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:45,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:25:45,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 14:25:45,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 14:25:47,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:25:47,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:48,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=911153.3333333334, ans=0.07 2023-10-02 14:25:49,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:25:49,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:25:49,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 14:25:52,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 14:25:54,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:25:56,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 14:25:58,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 14:26:02,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:26:02,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:26:07,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:26:07,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 14:26:10,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 14:26:10,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:10,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:10,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=911286.6666666666, ans=0.2 2023-10-02 14:26:15,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:26:15,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:26:15,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:16,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:16,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:26:16,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 14:26:19,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:26:20,300 INFO [train.py:1046] (3/4) Epoch 26, batch 3900, loss[loss=0.1667, simple_loss=0.2403, pruned_loss=0.04656, over 23249.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.245, pruned_loss=0.04508, over 4719678.98 frames. ], batch size: 105, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:26:20,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 14:26:21,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:21,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:23,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:26:23,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:24,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:26:24,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:26:24,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:26:25,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:26:25,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 14:26:27,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:30,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:26:30,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:26:31,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:26:32,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:26:34,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:26:34,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:37,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:26:38,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 14:26:38,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:26:40,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 14:26:40,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:26:42,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 14:26:42,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 14:26:45,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:26:48,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:26:48,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:26:48,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:26:52,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:26:55,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:26:57,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=911486.6666666666, ans=0.09899494936611666 2023-10-02 14:26:57,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.36 vs. limit=15.0 2023-10-02 14:26:58,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:26:58,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:26:59,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:27:05,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:27:05,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:27:13,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:27:14,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:27:17,263 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.831e+02 1.989e+02 2.156e+02 3.105e+02, threshold=3.979e+02, percent-clipped=0.0 2023-10-02 14:27:20,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:27:23,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:27:23,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 14:27:23,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 14:27:25,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:27:26,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 14:27:26,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:27:26,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 14:27:29,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=911620.0, ans=0.125 2023-10-02 14:27:32,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:27:32,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=911686.6666666666, ans=0.125 2023-10-02 14:27:33,703 INFO [train.py:1046] (3/4) Epoch 26, batch 3950, loss[loss=0.1694, simple_loss=0.2428, pruned_loss=0.04799, over 23280.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.245, pruned_loss=0.04505, over 4735080.81 frames. ], batch size: 119, lr: 3.92e-03, grad_scale: 8.0 2023-10-02 14:27:33,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 14:27:35,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:27:38,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:27:39,452 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.70 vs. limit=15.0 2023-10-02 14:27:40,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:27:45,849 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 14:27:47,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:27:47,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 14:27:47,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=911753.3333333334, ans=0.1 2023-10-02 14:27:48,522 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 14:27:48,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:27:51,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:27:51,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:27:51,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:27:56,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 14:27:56,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=911753.3333333334, ans=0.0 2023-10-02 14:27:57,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:27:57,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:27:57,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:27:57,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:27:58,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:28:00,499 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:28:09,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:28:09,932 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.13 vs. limit=22.5 2023-10-02 14:28:11,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:28:14,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 14:28:19,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 14:28:19,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 14:28:20,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:28:22,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:28:29,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:28:29,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:28:29,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:28:30,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:28:30,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 14:28:35,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:28:36,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=911953.3333333334, ans=0.0 2023-10-02 14:28:37,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:28:37,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=911953.3333333334, ans=0.0 2023-10-02 14:28:41,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 14:28:48,307 INFO [train.py:1046] (3/4) Epoch 26, batch 4000, loss[loss=0.1521, simple_loss=0.2377, pruned_loss=0.03326, over 24607.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2461, pruned_loss=0.04545, over 4720703.94 frames. ], batch size: 60, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:28:48,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:28:48,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=912020.0, ans=0.125 2023-10-02 14:28:54,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:28:56,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=912020.0, ans=0.0 2023-10-02 14:28:58,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:28:59,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=912020.0, ans=0.125 2023-10-02 14:29:00,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:29:00,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:29:00,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 14:29:00,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:29:00,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=912020.0, ans=0.125 2023-10-02 14:29:01,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 14:29:01,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:29:01,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 14:29:03,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:06,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:29:06,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:29:07,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:29:08,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:29:08,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:29:09,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:29:10,045 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 14:29:11,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:29:12,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:29:15,574 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 14:29:16,067 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.78 vs. limit=15.0 2023-10-02 14:29:16,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:29:16,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:29:19,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=912153.3333333334, ans=0.125 2023-10-02 14:29:20,682 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-10-02 14:29:23,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=912153.3333333334, ans=0.2 2023-10-02 14:29:25,160 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.77 vs. limit=12.0 2023-10-02 14:29:25,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 14:29:26,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:29:28,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:29:28,449 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 14:29:29,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:29:29,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 14:29:29,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:29:31,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:29:32,065 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.21 vs. limit=15.0 2023-10-02 14:29:32,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:29:33,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:29:33,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:29:33,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:29:35,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 14:29:37,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:29:38,658 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 14:29:39,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=912220.0, ans=0.1 2023-10-02 14:29:45,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:29:46,604 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.860e+02 2.015e+02 2.242e+02 2.909e+02, threshold=4.030e+02, percent-clipped=0.0 2023-10-02 14:29:48,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 14:29:49,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:29:49,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:49,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:29:51,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:29:55,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:29:58,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:29:58,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 14:29:59,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:29:59,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:30:02,649 INFO [train.py:1046] (3/4) Epoch 26, batch 4050, loss[loss=0.1734, simple_loss=0.2377, pruned_loss=0.05455, over 23809.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2457, pruned_loss=0.04553, over 4720681.23 frames. ], batch size: 212, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:30:02,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:30:04,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:30:04,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:30:08,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:30:11,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:30:11,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 14:30:12,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:30:12,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:30:14,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=912353.3333333334, ans=0.125 2023-10-02 14:30:17,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:30:19,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:30:21,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 14:30:22,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 14:30:22,768 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 14:30:26,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:30:27,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=912420.0, ans=0.2 2023-10-02 14:30:31,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 14:30:33,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:30:36,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:30:39,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:30:39,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:30:39,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:30:41,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=912486.6666666666, ans=0.2 2023-10-02 14:30:42,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:30:46,702 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.94 vs. limit=22.5 2023-10-02 14:30:47,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 14:30:47,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:30:48,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:30:50,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 14:30:55,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:31:01,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 14:31:01,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:31:01,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:31:04,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 14:31:04,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 14:31:04,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:31:06,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:31:06,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=912620.0, ans=0.1 2023-10-02 14:31:09,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:09,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:31:10,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=912620.0, ans=0.125 2023-10-02 14:31:16,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 14:31:18,136 INFO [train.py:1046] (3/4) Epoch 26, batch 4100, loss[loss=0.1581, simple_loss=0.2392, pruned_loss=0.03846, over 24450.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2474, pruned_loss=0.04608, over 4696215.10 frames. ], batch size: 66, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:31:18,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 14:31:19,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 14:31:20,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 14:31:20,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:31:21,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:22,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:22,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:31:22,402 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 14:31:23,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:31:25,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:31:25,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:31:26,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:31:32,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:31:32,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:31:33,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=912753.3333333334, ans=0.1 2023-10-02 14:31:34,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:31:34,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 14:31:34,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:34,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:31:34,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:31:36,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:31:37,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 14:31:41,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:31:42,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 14:31:44,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:31:47,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:31:47,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 14:31:47,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:31:49,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:31:49,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:31:51,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 14:31:52,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:31:54,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:31:56,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 14:31:56,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:31:56,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:31:59,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:32:07,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:32:08,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:32:08,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:32:14,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=912886.6666666666, ans=0.0 2023-10-02 14:32:14,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=912886.6666666666, ans=0.125 2023-10-02 14:32:16,361 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.437e+02 1.852e+02 2.033e+02 2.251e+02 3.212e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-02 14:32:17,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:32:17,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:32:18,500 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.46 vs. limit=15.0 2023-10-02 14:32:20,344 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.70 vs. limit=10.0 2023-10-02 14:32:21,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:32:23,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:32:26,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:32:27,745 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.45 vs. limit=22.5 2023-10-02 14:32:28,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:32:29,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:32:29,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:32:32,311 INFO [train.py:1046] (3/4) Epoch 26, batch 4150, loss[loss=0.1732, simple_loss=0.2537, pruned_loss=0.04641, over 23286.00 frames. ], tot_loss[loss=0.1694, simple_loss=0.2471, pruned_loss=0.0458, over 4707356.63 frames. ], batch size: 93, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:32:32,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 14:32:32,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:32:33,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 14:32:33,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 14:32:34,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=913020.0, ans=0.0 2023-10-02 14:32:35,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 14:32:36,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:32:39,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:32:40,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:32:42,297 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:32:43,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:32:44,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:32:46,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:32:47,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:32:47,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:32:48,845 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.67 vs. limit=15.0 2023-10-02 14:32:49,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:32:54,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:32:59,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:33:00,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 14:33:03,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 14:33:03,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:33:03,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 14:33:03,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:33:04,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:33:07,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:07,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:33:07,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=913153.3333333334, ans=0.125 2023-10-02 14:33:14,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 14:33:16,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:33:18,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:33:20,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 14:33:20,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:33:21,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 14:33:24,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:33:24,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:33:26,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:26,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 14:33:26,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:33:26,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 14:33:28,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:33:31,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 14:33:31,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:31,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:33:32,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:33:32,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 14:33:32,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:33:34,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 14:33:34,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:33:37,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:33:37,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 14:33:37,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 14:33:40,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=913286.6666666666, ans=0.0 2023-10-02 14:33:42,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:33:42,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=913286.6666666666, ans=0.0 2023-10-02 14:33:44,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 14:33:45,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:33:46,858 INFO [train.py:1046] (3/4) Epoch 26, batch 4200, loss[loss=0.1613, simple_loss=0.2501, pruned_loss=0.03628, over 24663.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2461, pruned_loss=0.04528, over 4711505.52 frames. ], batch size: 65, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:33:47,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:33:48,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:33:50,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:33:50,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:33:51,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 14:33:56,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 14:33:57,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:33:58,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:34:00,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:34:00,617 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:34:03,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:34:04,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:34:05,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:34:05,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 14:34:05,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:34:07,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:34:07,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:34:07,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:34:08,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:34:12,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 14:34:13,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:34:17,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 14:34:19,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:34:20,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:34:22,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:34:25,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:34:25,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 14:34:25,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:34:27,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:34:32,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:34:35,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:34:38,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=913553.3333333334, ans=0.125 2023-10-02 14:34:39,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:34:42,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 14:34:43,626 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.384e+02 1.882e+02 2.071e+02 2.346e+02 3.876e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-02 14:34:43,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:34:50,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:34:51,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:34:53,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 14:34:59,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 14:35:01,178 INFO [train.py:1046] (3/4) Epoch 26, batch 4250, loss[loss=0.1562, simple_loss=0.2364, pruned_loss=0.03796, over 24608.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2445, pruned_loss=0.04471, over 4710327.72 frames. ], batch size: 60, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:35:02,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:35:02,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 14:35:05,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:09,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:35:09,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 14:35:11,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:35:12,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:15,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:35:15,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=913753.3333333334, ans=0.2 2023-10-02 14:35:20,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:20,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:21,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:35:21,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:35:23,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:25,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:25,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:28,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:35:29,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:35:31,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 14:35:35,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 14:35:36,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:36,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:35:36,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:35:36,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:35:36,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:38,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:35:42,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 14:35:42,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:35:45,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:35:47,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:35:48,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 14:35:48,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:35:48,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=913886.6666666666, ans=0.125 2023-10-02 14:35:50,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 14:35:51,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:35:53,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:35:56,720 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.61 vs. limit=8.0 2023-10-02 14:35:57,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:35:57,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:35:58,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 14:35:59,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:36:01,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:36:01,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=913953.3333333334, ans=0.125 2023-10-02 14:36:01,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=913953.3333333334, ans=0.125 2023-10-02 14:36:04,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:36:06,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:36:08,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:36:09,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:36:11,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:36:12,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=913953.3333333334, ans=0.0 2023-10-02 14:36:13,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:36:13,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:36:13,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 14:36:14,433 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.12 vs. limit=22.5 2023-10-02 14:36:15,021 INFO [train.py:1046] (3/4) Epoch 26, batch 4300, loss[loss=0.1554, simple_loss=0.2294, pruned_loss=0.04065, over 21540.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2429, pruned_loss=0.04458, over 4696144.59 frames. ], batch size: 47, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:36:15,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:36:21,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:36:21,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:36:24,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:36:32,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:36:32,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 14:36:33,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:36:35,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:36:36,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:36:36,346 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 14:36:39,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=914086.6666666666, ans=0.125 2023-10-02 14:36:40,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:36:40,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=914086.6666666666, ans=0.125 2023-10-02 14:36:40,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=914086.6666666666, ans=0.0 2023-10-02 14:36:41,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:36:44,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 14:36:44,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:36:44,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 14:36:47,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 14:36:48,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:36:50,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=914153.3333333334, ans=0.125 2023-10-02 14:36:52,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:36:53,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:36:53,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:36:55,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:36:57,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:36:57,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 14:36:57,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=914153.3333333334, ans=0.125 2023-10-02 14:36:59,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 14:37:00,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:37:02,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:02,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:37:02,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:02,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:37:02,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 14:37:02,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 14:37:03,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 14:37:05,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:37:06,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 14:37:06,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 14:37:10,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:37:10,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=914220.0, ans=0.125 2023-10-02 14:37:11,865 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 14:37:13,157 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.468e+02 1.843e+02 2.069e+02 2.319e+02 3.612e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-02 14:37:13,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:37:14,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:14,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:37:18,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 14:37:18,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:37:18,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:18,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:37:18,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:37:19,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:37:21,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:37:24,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:24,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:37:26,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:37:29,593 INFO [train.py:1046] (3/4) Epoch 26, batch 4350, loss[loss=0.1809, simple_loss=0.2666, pruned_loss=0.04761, over 24589.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2437, pruned_loss=0.04452, over 4712259.22 frames. ], batch size: 71, lr: 3.92e-03, grad_scale: 16.0 2023-10-02 14:37:32,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 14:37:32,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 14:37:38,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:37:38,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=914353.3333333334, ans=0.125 2023-10-02 14:37:41,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:43,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:37:43,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:37:46,131 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.12 vs. limit=15.0 2023-10-02 14:37:49,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:37:52,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:37:53,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:37:53,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:37:57,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:37:59,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.89 vs. limit=8.0 2023-10-02 14:38:00,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:38:02,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:38:06,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 14:38:07,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:38:07,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:09,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=914486.6666666666, ans=0.125 2023-10-02 14:38:12,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:13,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 14:38:16,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:38:17,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:38:21,826 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 14:38:23,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:38:23,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:38:25,077 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 14:38:25,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=914553.3333333334, ans=0.0 2023-10-02 14:38:26,385 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 14:38:26,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:38:26,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:38:28,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:38:28,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:38:29,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:38:29,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:38:33,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 14:38:33,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:33,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:38:33,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:34,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 14:38:36,074 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 14:38:36,078 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 14:38:36,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 14:38:40,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:38:40,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:38:40,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:38:40,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:38:42,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 14:38:43,284 INFO [train.py:1046] (3/4) Epoch 26, batch 4400, loss[loss=0.1751, simple_loss=0.2573, pruned_loss=0.04648, over 24670.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2451, pruned_loss=0.0447, over 4723692.57 frames. ], batch size: 65, lr: 3.91e-03, grad_scale: 32.0 2023-10-02 14:38:44,734 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 14:38:44,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:47,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=914686.6666666666, ans=0.1 2023-10-02 14:38:48,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:38:48,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:49,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=914686.6666666666, ans=0.1 2023-10-02 14:38:50,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:38:51,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 14:38:51,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 14:38:52,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 14:38:53,016 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 14:38:54,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:38:54,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:38:57,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 14:38:57,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:38:57,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=914753.3333333334, ans=0.0 2023-10-02 14:39:00,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:00,877 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 14:39:01,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=914753.3333333334, ans=0.125 2023-10-02 14:39:01,637 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.88 vs. limit=15.0 2023-10-02 14:39:03,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:39:03,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 14:39:03,101 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 14:39:05,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 14:39:05,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=914753.3333333334, ans=0.2 2023-10-02 14:39:07,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 14:39:07,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 14:39:08,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:09,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:39:11,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:39:11,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:39:14,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 14:39:14,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 14:39:14,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:39:16,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:39:16,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:39:18,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:18,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:39:18,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 14:39:19,578 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 14:39:21,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=914820.0, ans=0.0 2023-10-02 14:39:22,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:39:27,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:39:29,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 14:39:35,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:39:37,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:39:38,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:39:38,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 14:39:38,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:39:38,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:39:38,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:39:40,332 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.911e+02 2.191e+02 2.459e+02 3.786e+02, threshold=4.382e+02, percent-clipped=0.0 2023-10-02 14:39:40,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:39:43,859 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.59 vs. limit=15.0 2023-10-02 14:39:44,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 14:39:47,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 14:39:48,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 14:39:48,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:39:48,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 14:39:49,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:39:51,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:39:54,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 14:39:55,586 INFO [train.py:1046] (3/4) Epoch 26, batch 4450, loss[loss=0.1512, simple_loss=0.2276, pruned_loss=0.03738, over 21084.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2457, pruned_loss=0.0452, over 4711459.25 frames. ], batch size: 46, lr: 3.91e-03, grad_scale: 32.0 2023-10-02 14:39:58,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:40:02,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:04,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:40:09,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:40:09,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:40:11,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=915086.6666666666, ans=0.125 2023-10-02 14:40:12,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:13,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:40:16,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:40:17,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:40:17,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 14:40:17,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:40:18,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=915086.6666666666, ans=0.0 2023-10-02 14:40:19,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:19,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:40:19,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:40:19,899 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.19 vs. limit=15.0 2023-10-02 14:40:20,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:40:25,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=915153.3333333334, ans=0.125 2023-10-02 14:40:26,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:26,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:28,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:40:28,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=915153.3333333334, ans=0.0 2023-10-02 14:40:28,422 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:40:29,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:40:29,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:40:33,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 14:40:35,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 14:40:35,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 14:40:35,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:40:37,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:40:38,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 14:40:41,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:40:45,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:46,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 14:40:46,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:46,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:40:46,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:40:46,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:40:49,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:40:52,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 14:40:53,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 14:40:54,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:40:56,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:40:57,173 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.00 vs. limit=12.0 2023-10-02 14:40:58,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:40:59,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:40:59,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 14:41:02,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:41:04,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 14:41:05,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:41:08,735 INFO [train.py:1046] (3/4) Epoch 26, batch 4500, loss[loss=0.1483, simple_loss=0.2303, pruned_loss=0.03312, over 24336.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2457, pruned_loss=0.04522, over 4718013.21 frames. ], batch size: 61, lr: 3.91e-03, grad_scale: 32.0 2023-10-02 14:41:11,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:41:11,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 14:41:11,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 14:41:13,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:41:18,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:41:18,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:41:19,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:41:19,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:41:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:41:21,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:41:34,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:41:36,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:41:36,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=915486.6666666666, ans=0.125 2023-10-02 14:41:38,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:41:39,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:41:40,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:41:46,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:41:49,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:41:53,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:41:53,755 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.54 vs. limit=22.5 2023-10-02 14:41:56,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:41:56,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 14:41:57,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:41:57,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:42:00,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:42:00,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:42:03,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:42:03,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 14:42:03,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:42:03,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:42:07,070 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.824e+02 1.951e+02 2.153e+02 2.921e+02, threshold=3.902e+02, percent-clipped=0.0 2023-10-02 14:42:07,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=915620.0, ans=0.125 2023-10-02 14:42:09,968 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.85 vs. limit=22.5 2023-10-02 14:42:10,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:42:10,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:42:11,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:42:13,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:42:13,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:42:14,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 14:42:16,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 14:42:16,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 14:42:20,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 14:42:21,635 INFO [train.py:1046] (3/4) Epoch 26, batch 4550, loss[loss=0.1814, simple_loss=0.2497, pruned_loss=0.0565, over 23717.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2449, pruned_loss=0.04531, over 4710400.65 frames. ], batch size: 212, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:42:24,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 14:42:26,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:42:29,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:42:29,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:42:29,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=915686.6666666666, ans=0.04949747468305833 2023-10-02 14:42:33,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:42:36,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:42:37,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:42:38,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=915753.3333333334, ans=0.125 2023-10-02 14:42:39,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:42:39,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:42:39,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:42:42,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:42:42,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:42:45,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:42:48,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 14:42:49,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 14:42:50,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:42:51,734 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.96 vs. limit=22.5 2023-10-02 14:42:52,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 14:42:54,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 14:42:54,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:42:58,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 14:42:59,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:43:04,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:04,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:04,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:43:05,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 14:43:09,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:43:12,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:12,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:43:13,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:43:13,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 14:43:13,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 14:43:15,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:43:16,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 14:43:16,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=915886.6666666666, ans=0.125 2023-10-02 14:43:17,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 14:43:19,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:43:19,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:43:19,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:43:20,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:20,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:43:22,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:43:22,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 14:43:23,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:43:23,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 14:43:25,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 14:43:25,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:43:25,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 14:43:28,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:43:28,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:43:31,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:43:33,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:43:33,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 14:43:33,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:43:36,499 INFO [train.py:1046] (3/4) Epoch 26, batch 4600, loss[loss=0.1431, simple_loss=0.2257, pruned_loss=0.0302, over 24541.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2435, pruned_loss=0.04473, over 4709629.82 frames. ], batch size: 60, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:43:36,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:43:39,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:43:39,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:43:42,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:43:42,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:43:44,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:43:45,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 14:43:46,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:43:48,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:43:48,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:43:51,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=916086.6666666666, ans=0.0 2023-10-02 14:43:52,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:43:58,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 14:43:59,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:01,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:03,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:44:03,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:44:05,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=916153.3333333334, ans=0.125 2023-10-02 14:44:10,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 14:44:10,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:44:11,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:44:15,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:15,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:44:18,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:44:20,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 14:44:22,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:44:26,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:28,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:44:30,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:30,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 14:44:30,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:32,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 14:44:32,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:32,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:44:34,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:44:34,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:44:34,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:44:36,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 14:44:36,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 14:44:37,536 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.833e+02 1.993e+02 2.244e+02 3.849e+02, threshold=3.987e+02, percent-clipped=0.0 2023-10-02 14:44:37,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 14:44:37,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:44:38,296 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.38 vs. limit=12.0 2023-10-02 14:44:39,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:44:39,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:44:40,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:44:40,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=916286.6666666666, ans=0.125 2023-10-02 14:44:42,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=916286.6666666666, ans=0.0 2023-10-02 14:44:49,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:44:50,685 INFO [train.py:1046] (3/4) Epoch 26, batch 4650, loss[loss=0.1725, simple_loss=0.2522, pruned_loss=0.04639, over 24045.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2432, pruned_loss=0.04451, over 4698228.79 frames. ], batch size: 80, lr: 3.91e-03, grad_scale: 8.0 2023-10-02 14:44:52,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:44:52,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:53,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:44:54,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:44:54,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:44:55,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:44:58,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 14:44:58,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=916353.3333333334, ans=0.0 2023-10-02 14:45:01,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:45:03,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 14:45:03,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:45:04,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 14:45:04,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:45:04,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 14:45:04,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 14:45:04,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:04,808 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:45:05,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:45:08,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=916420.0, ans=0.125 2023-10-02 14:45:09,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.68 vs. limit=15.0 2023-10-02 14:45:10,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:45:11,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:11,865 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 14:45:14,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:14,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 14:45:17,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:17,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:45:17,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 14:45:18,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:45:21,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:45:22,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=916486.6666666666, ans=0.125 2023-10-02 14:45:24,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:45:31,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:33,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:35,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:45:35,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:45:38,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=916553.3333333334, ans=10.0 2023-10-02 14:45:39,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 14:45:39,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 14:45:39,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 14:45:39,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 14:45:41,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:45:49,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:45:49,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:45:49,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 14:45:49,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:45:51,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:45:51,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:45:52,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:45:55,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:45:55,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:45:55,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:45:55,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=916620.0, ans=0.05 2023-10-02 14:45:58,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:45:58,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:45:58,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:46:02,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 14:46:02,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 14:46:03,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 14:46:04,810 INFO [train.py:1046] (3/4) Epoch 26, batch 4700, loss[loss=0.1654, simple_loss=0.2536, pruned_loss=0.0386, over 24651.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2446, pruned_loss=0.04466, over 4715234.13 frames. ], batch size: 73, lr: 3.91e-03, grad_scale: 8.0 2023-10-02 14:46:10,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:46:11,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=916686.6666666666, ans=0.125 2023-10-02 14:46:12,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:46:13,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:46:15,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:46:16,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 14:46:21,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 14:46:22,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 14:46:25,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:46:27,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:46:27,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:46:29,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:46:34,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:46:36,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 14:46:38,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:46:43,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 14:46:45,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:46:46,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:46:51,422 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.21 vs. limit=22.5 2023-10-02 14:46:52,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 14:46:53,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:46:56,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.87 vs. limit=10.0 2023-10-02 14:46:59,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:46:59,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 14:47:01,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=916886.6666666666, ans=0.0 2023-10-02 14:47:02,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:02,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:47:05,236 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.831e+02 2.029e+02 2.263e+02 3.134e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 14:47:05,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:47:05,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:47:05,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 14:47:06,735 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 14:47:07,715 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.76 vs. limit=15.0 2023-10-02 14:47:08,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:47:08,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:08,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:08,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 14:47:10,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:47:14,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 14:47:17,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:47:18,757 INFO [train.py:1046] (3/4) Epoch 26, batch 4750, loss[loss=0.147, simple_loss=0.2214, pruned_loss=0.03627, over 24492.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2451, pruned_loss=0.04496, over 4718894.93 frames. ], batch size: 58, lr: 3.91e-03, grad_scale: 8.0 2023-10-02 14:47:18,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:47:24,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:47:24,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:47:26,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 14:47:26,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:47:28,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 14:47:30,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:47:30,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:47:31,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=917020.0, ans=0.1 2023-10-02 14:47:32,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:47:33,868 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:47:36,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 14:47:42,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:47:45,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 14:47:45,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:47:45,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=917086.6666666666, ans=0.1 2023-10-02 14:47:49,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:47:49,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:47:49,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:47:49,897 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 14:47:49,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 14:47:55,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 14:47:58,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:48:00,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:03,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:48:03,528 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 14:48:03,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:48:05,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:48:05,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=917220.0, ans=0.0 2023-10-02 14:48:07,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:48:09,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 14:48:10,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 14:48:10,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:48:10,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:48:10,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:48:12,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:48:12,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 14:48:15,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 14:48:18,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:48:19,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:48:19,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 14:48:21,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:48:22,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:48:23,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 14:48:24,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:24,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 14:48:28,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:48:28,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 14:48:28,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 14:48:30,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 14:48:32,236 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:48:33,391 INFO [train.py:1046] (3/4) Epoch 26, batch 4800, loss[loss=0.1879, simple_loss=0.2566, pruned_loss=0.05959, over 22707.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2458, pruned_loss=0.04501, over 4722483.85 frames. ], batch size: 322, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:48:34,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:48:34,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:48:36,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 14:48:36,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=917353.3333333334, ans=0.0 2023-10-02 14:48:43,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:43,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:48:49,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 14:48:51,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:48:51,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:48:52,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 14:48:53,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:48:53,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:48:55,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:48:58,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:01,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:01,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:49:02,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:02,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 14:49:02,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:49:03,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:49:04,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:07,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:49:07,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:49:07,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:49:10,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 14:49:10,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:12,164 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:49:12,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=917486.6666666666, ans=0.0 2023-10-02 14:49:13,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 14:49:13,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 14:49:14,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:15,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:49:15,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:49:15,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:49:16,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:49:18,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:49:19,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:49:22,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:49:24,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:24,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=917553.3333333334, ans=0.125 2023-10-02 14:49:26,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:49:31,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 14:49:31,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:49:33,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:33,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:49:34,319 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.969e+02 2.180e+02 2.462e+02 3.461e+02, threshold=4.360e+02, percent-clipped=0.0 2023-10-02 14:49:34,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:34,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=917620.0, ans=0.125 2023-10-02 14:49:37,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:49:37,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:49:38,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:39,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:49:40,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:49:40,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:49:40,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=917620.0, ans=0.125 2023-10-02 14:49:44,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:49:44,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:44,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:49:47,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 14:49:47,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=917686.6666666666, ans=0.125 2023-10-02 14:49:48,376 INFO [train.py:1046] (3/4) Epoch 26, batch 4850, loss[loss=0.158, simple_loss=0.2371, pruned_loss=0.03948, over 24459.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2464, pruned_loss=0.04574, over 4717374.00 frames. ], batch size: 63, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:49:48,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 14:49:48,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:48,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:49:48,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:49:48,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:49:51,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:49:58,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 14:49:58,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:50:01,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:50:03,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 14:50:03,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:50:04,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=917753.3333333334, ans=0.125 2023-10-02 14:50:06,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:50:07,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:50:09,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:50:09,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 14:50:13,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:50:15,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:50:16,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 14:50:18,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 14:50:18,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 14:50:19,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:50:19,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:23,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:23,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 14:50:25,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 14:50:26,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:50:33,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:50:34,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 14:50:35,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:50:35,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:50:37,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:50:37,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=917886.6666666666, ans=0.1 2023-10-02 14:50:38,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 14:50:38,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:38,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 14:50:38,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:50:40,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:50:41,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 14:50:47,671 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.21 vs. limit=15.0 2023-10-02 14:50:50,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:50:51,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=917953.3333333334, ans=0.125 2023-10-02 14:50:56,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:50:56,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:50:56,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=917953.3333333334, ans=0.0 2023-10-02 14:51:00,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 14:51:00,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:51:01,644 INFO [train.py:1046] (3/4) Epoch 26, batch 4900, loss[loss=0.1643, simple_loss=0.2437, pruned_loss=0.0424, over 24346.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.245, pruned_loss=0.04568, over 4691537.11 frames. ], batch size: 61, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:51:06,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:51:08,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:51:09,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:51:13,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 14:51:17,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 14:51:20,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 14:51:21,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 14:51:21,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:51:22,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=918086.6666666666, ans=0.0 2023-10-02 14:51:23,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:51:23,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:51:23,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:51:23,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 14:51:24,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 14:51:27,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 14:51:28,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:51:28,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:51:30,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 14:51:31,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:51:32,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:51:34,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:51:34,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 14:51:36,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:51:36,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:51:36,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 14:51:38,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 14:51:42,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 14:51:42,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:51:44,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:51:44,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:51:45,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:51:45,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 14:51:47,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:51:47,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 14:51:50,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:51:50,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 14:51:53,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:51:55,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 14:51:56,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:51:57,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 14:51:58,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 14:52:02,742 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.867e+02 2.051e+02 2.311e+02 3.725e+02, threshold=4.103e+02, percent-clipped=0.0 2023-10-02 14:52:04,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:52:04,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=918286.6666666666, ans=0.2 2023-10-02 14:52:05,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:52:06,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 14:52:06,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:52:06,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:52:08,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=918286.6666666666, ans=0.2 2023-10-02 14:52:10,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:52:14,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:52:14,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:52:14,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:52:14,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 14:52:16,116 INFO [train.py:1046] (3/4) Epoch 26, batch 4950, loss[loss=0.1561, simple_loss=0.2225, pruned_loss=0.04483, over 23570.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2444, pruned_loss=0.04561, over 4689853.01 frames. ], batch size: 256, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:52:16,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:52:19,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:52:19,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 14:52:22,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 14:52:22,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 14:52:22,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:52:23,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 14:52:23,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:23,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:52:23,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 14:52:25,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:52:26,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:52:26,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:52:27,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:52:29,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:52:31,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:31,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:52:34,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 14:52:35,127 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.72 vs. limit=15.0 2023-10-02 14:52:39,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:41,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=918420.0, ans=0.2 2023-10-02 14:52:42,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:52:44,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:52:44,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:52:45,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:52:47,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 14:52:47,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 14:52:50,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:52:53,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:52:53,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:52:53,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:52:53,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=918486.6666666666, ans=0.125 2023-10-02 14:52:54,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:52:55,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 14:52:57,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:52:58,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:52:58,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:53:00,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:53:00,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:53:01,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 14:53:01,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:53:03,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=918553.3333333334, ans=0.1 2023-10-02 14:53:04,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 14:53:07,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:53:09,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:53:09,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:53:09,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=918553.3333333334, ans=0.2 2023-10-02 14:53:10,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:53:10,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:53:12,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:53:13,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:53:15,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 14:53:15,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:53:16,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 14:53:16,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=918620.0, ans=0.125 2023-10-02 14:53:21,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:53:25,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 14:53:25,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 14:53:29,780 INFO [train.py:1046] (3/4) Epoch 26, batch 5000, loss[loss=0.1687, simple_loss=0.2465, pruned_loss=0.04543, over 23745.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2442, pruned_loss=0.04517, over 4699821.66 frames. ], batch size: 179, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:53:31,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:53:31,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:53:34,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 14:53:35,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 14:53:36,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:53:38,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 14:53:38,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 14:53:38,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=918686.6666666666, ans=0.125 2023-10-02 14:53:40,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 14:53:40,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 14:53:41,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:53:41,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:53:43,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 14:53:43,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:53:44,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=918753.3333333334, ans=0.0 2023-10-02 14:53:45,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:53:45,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 14:53:45,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 14:53:46,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:53:46,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 14:53:47,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:53:47,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:53:49,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:53:49,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 14:53:49,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 14:53:50,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 14:53:51,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:53:51,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:53:52,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 14:53:52,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:53:53,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=918753.3333333334, ans=0.1 2023-10-02 14:53:55,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:53:56,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:53:57,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 14:53:59,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 14:54:00,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:54:01,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:54:05,955 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 14:54:07,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 14:54:09,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:54:09,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:12,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 14:54:14,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:54:14,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:54:14,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:54:15,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 14:54:17,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:54:21,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 14:54:21,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:54:24,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=918886.6666666666, ans=0.125 2023-10-02 14:54:27,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 14:54:30,192 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.795e+02 1.977e+02 2.128e+02 2.824e+02, threshold=3.954e+02, percent-clipped=0.0 2023-10-02 14:54:31,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:33,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=918953.3333333334, ans=0.1 2023-10-02 14:54:35,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=918953.3333333334, ans=0.1 2023-10-02 14:54:40,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:54:40,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:40,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:54:42,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:54:42,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:54:42,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 14:54:42,882 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.63 vs. limit=15.0 2023-10-02 14:54:43,491 INFO [train.py:1046] (3/4) Epoch 26, batch 5050, loss[loss=0.1609, simple_loss=0.2444, pruned_loss=0.03869, over 24477.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.245, pruned_loss=0.04511, over 4720854.64 frames. ], batch size: 66, lr: 3.91e-03, grad_scale: 16.0 2023-10-02 14:54:43,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:46,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=919020.0, ans=0.0 2023-10-02 14:54:48,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:54:48,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 14:54:48,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:54:50,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:54:50,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:54:52,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 14:54:52,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:54:54,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:54:56,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 14:54:56,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 14:54:58,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:55:03,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=919086.6666666666, ans=0.125 2023-10-02 14:55:06,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=919086.6666666666, ans=0.125 2023-10-02 14:55:06,857 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.07 vs. limit=15.0 2023-10-02 14:55:07,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 14:55:08,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 14:55:09,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:55:09,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 14:55:10,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:55:12,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:55:12,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:55:13,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:55:13,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 14:55:14,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 14:55:14,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:55:17,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=8.94 vs. limit=22.5 2023-10-02 14:55:18,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:55:20,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:55:20,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 14:55:23,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:55:26,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 14:55:26,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:55:26,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 14:55:27,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:55:29,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:55:29,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=919220.0, ans=0.0 2023-10-02 14:55:30,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:55:32,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:55:33,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:33,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:55:33,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:55:33,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 14:55:34,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 14:55:36,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 14:55:39,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:55:39,506 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 14:55:39,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:55:40,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:55:42,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:42,261 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 14:55:44,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:55:44,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 14:55:44,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:48,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:55:50,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:55:50,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 14:55:50,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 14:55:53,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:55:53,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:55:53,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 14:55:56,528 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 14:55:56,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=919353.3333333334, ans=0.0 2023-10-02 14:55:57,814 INFO [train.py:1046] (3/4) Epoch 26, batch 5100, loss[loss=0.1853, simple_loss=0.2622, pruned_loss=0.05421, over 23435.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2464, pruned_loss=0.04581, over 4702305.60 frames. ], batch size: 93, lr: 3.90e-03, grad_scale: 8.0 2023-10-02 14:55:59,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 14:56:00,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 14:56:02,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 14:56:02,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:56:03,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:56:06,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:56:06,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 14:56:06,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 14:56:11,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:56:11,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 14:56:11,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=919420.0, ans=0.025 2023-10-02 14:56:15,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:56:19,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=919420.0, ans=0.125 2023-10-02 14:56:19,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=919420.0, ans=0.05 2023-10-02 14:56:20,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 14:56:20,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:56:22,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:56:22,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 14:56:25,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:26,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:26,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 14:56:28,254 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.12 vs. limit=15.0 2023-10-02 14:56:29,170 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 14:56:30,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:30,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 14:56:30,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 14:56:33,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:56:37,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=919486.6666666666, ans=0.5 2023-10-02 14:56:39,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=919486.6666666666, ans=0.1 2023-10-02 14:56:42,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:56:43,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 14:56:43,698 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 14:56:45,003 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 14:56:46,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 14:56:46,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:56:49,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 14:56:54,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 14:56:55,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 14:56:57,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 14:56:59,930 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.826e+02 2.092e+02 2.413e+02 3.639e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-02 14:57:00,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 14:57:00,762 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.07 vs. limit=15.0 2023-10-02 14:57:02,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 14:57:04,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 14:57:08,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:57:08,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:57:08,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 14:57:09,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:57:09,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 14:57:10,908 INFO [train.py:1046] (3/4) Epoch 26, batch 5150, loss[loss=0.2252, simple_loss=0.2859, pruned_loss=0.08226, over 19443.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2467, pruned_loss=0.04578, over 4717885.77 frames. ], batch size: 388, lr: 3.90e-03, grad_scale: 8.0 2023-10-02 14:57:10,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:57:11,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 14:57:11,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 14:57:11,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 14:57:12,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 14:57:12,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 14:57:13,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=919686.6666666666, ans=0.0 2023-10-02 14:57:14,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:57:14,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 14:57:15,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:57:17,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:57:22,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 14:57:23,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 14:57:24,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:57:24,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 14:57:27,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 14:57:27,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:57:27,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:57:28,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 14:57:28,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 14:57:28,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 14:57:31,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 14:57:32,077 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.49 vs. limit=10.0 2023-10-02 14:57:32,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:57:32,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 14:57:34,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 14:57:36,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 14:57:37,289 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:57:41,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 14:57:42,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 14:57:45,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:57:52,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:57:54,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:57:58,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:57:59,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:58:02,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 14:58:05,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:58:05,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 14:58:05,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=919886.6666666666, ans=0.125 2023-10-02 14:58:07,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 14:58:09,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:58:11,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:58:12,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 14:58:16,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:58:16,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=919953.3333333334, ans=0.125 2023-10-02 14:58:18,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 14:58:18,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=919953.3333333334, ans=0.1 2023-10-02 14:58:21,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:58:21,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 14:58:21,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 14:58:23,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 14:58:23,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 14:58:23,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:58:25,772 INFO [train.py:1046] (3/4) Epoch 26, batch 5200, loss[loss=0.163, simple_loss=0.253, pruned_loss=0.03648, over 24649.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.247, pruned_loss=0.04629, over 4701279.06 frames. ], batch size: 68, lr: 3.90e-03, grad_scale: 16.0 2023-10-02 14:58:25,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:58:27,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:58:30,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:58:34,016 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 14:58:35,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 14:58:36,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 14:58:37,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:58:37,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=920020.0, ans=0.125 2023-10-02 14:58:40,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:58:40,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 14:58:41,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:58:42,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 14:58:44,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 14:58:44,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:58:47,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 14:58:50,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 14:58:52,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 14:58:52,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 14:58:53,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 14:58:55,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 14:58:55,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:58:55,577 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 14:58:55,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 14:58:58,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:58:58,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 14:58:58,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 14:58:58,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=920153.3333333334, ans=0.1 2023-10-02 14:58:59,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:59:01,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:59:04,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 14:59:05,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 14:59:05,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 14:59:09,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 14:59:09,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 14:59:15,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 14:59:15,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:59:17,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 14:59:18,091 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.56 vs. limit=6.0 2023-10-02 14:59:18,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 14:59:18,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 14:59:18,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:59:18,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 14:59:19,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=920220.0, ans=0.0 2023-10-02 14:59:23,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:59:25,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 14:59:27,489 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.858e+02 2.023e+02 2.241e+02 5.045e+02, threshold=4.045e+02, percent-clipped=1.0 2023-10-02 14:59:29,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 14:59:30,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:59:30,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:59:30,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=920286.6666666666, ans=0.0 2023-10-02 14:59:34,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:59:34,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 14:59:35,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 14:59:35,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 14:59:37,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 14:59:37,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 14:59:38,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 14:59:39,854 INFO [train.py:1046] (3/4) Epoch 26, batch 5250, loss[loss=0.1561, simple_loss=0.2143, pruned_loss=0.04894, over 23413.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2466, pruned_loss=0.04578, over 4721148.21 frames. ], batch size: 285, lr: 3.90e-03, grad_scale: 16.0 2023-10-02 14:59:41,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 14:59:44,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 14:59:44,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 14:59:44,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 14:59:45,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=920353.3333333334, ans=0.125 2023-10-02 14:59:51,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 14:59:54,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 14:59:57,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 14:59:58,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:00:00,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 15:00:00,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:00:02,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:00:07,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=920486.6666666666, ans=0.125 2023-10-02 15:00:13,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=920486.6666666666, ans=0.125 2023-10-02 15:00:18,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=920486.6666666666, ans=0.125 2023-10-02 15:00:22,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=920553.3333333334, ans=0.5 2023-10-02 15:00:27,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=920553.3333333334, ans=0.125 2023-10-02 15:00:30,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=920553.3333333334, ans=0.0 2023-10-02 15:00:34,671 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-10-02 15:00:48,980 INFO [train.py:1046] (3/4) Epoch 26, batch 5300, loss[loss=0.1578, simple_loss=0.2221, pruned_loss=0.04676, over 23649.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2457, pruned_loss=0.04542, over 4711561.45 frames. ], batch size: 232, lr: 3.90e-03, grad_scale: 16.0 2023-10-02 15:00:58,850 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.85 vs. limit=15.0 2023-10-02 15:01:03,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:01:03,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 15:01:03,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 15:01:03,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:01:03,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:03,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:03,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:03,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:01:03,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:04,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:01:04,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:01:04,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:01:04,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 15:01:04,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 15:01:04,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 15:01:04,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:01:04,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 15:01:04,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 15:01:05,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:05,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:01:05,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:01:05,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:01:05,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:01:05,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:01:05,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:01:05,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:05,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:01:05,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:01:05,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:01:05,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:05,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:01:06,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 15:01:07,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:01:07,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:01:07,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 15:01:07,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 15:01:07,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:01:07,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:07,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 15:01:07,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 15:01:07,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:01:08,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:01:08,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:01:08,246 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 15:01:08,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 15:01:08,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:01:08,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:01:08,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 15:01:08,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 15:01:09,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 15:01:09,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:01:15,202 INFO [train.py:1046] (3/4) Epoch 27, batch 0, loss[loss=0.1666, simple_loss=0.2524, pruned_loss=0.04039, over 24635.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2524, pruned_loss=0.04039, over 24635.00 frames. ], batch size: 68, lr: 3.83e-03, grad_scale: 32.0 2023-10-02 15:01:15,202 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 15:01:27,543 INFO [train.py:1078] (3/4) Epoch 27, validation: loss=0.313, simple_loss=0.2744, pruned_loss=0.1758, over 1125622.00 frames. 2023-10-02 15:01:27,543 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 15:01:30,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 15:01:31,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:01:33,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:01:37,247 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.17 vs. limit=6.0 2023-10-02 15:01:37,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:37,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:01:37,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:39,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 15:01:40,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 15:01:41,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:43,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:46,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:01:46,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:46,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:01:46,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:01:48,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 15:01:49,569 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.46 vs. limit=12.0 2023-10-02 15:01:50,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:01:53,953 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.13 vs. limit=10.0 2023-10-02 15:01:56,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:01:56,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:01:57,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 15:02:04,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:02:04,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:02:06,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:02:07,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=920900.0, ans=0.125 2023-10-02 15:02:09,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:02:11,023 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 2.074e+02 2.559e+02 3.176e+02 5.504e+02, threshold=5.117e+02, percent-clipped=16.0 2023-10-02 15:02:12,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:02:16,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 15:02:18,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=920966.6666666666, ans=0.125 2023-10-02 15:02:20,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 15:02:20,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:02:20,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:02:21,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=920966.6666666666, ans=0.125 2023-10-02 15:02:22,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:02:22,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:02:25,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 15:02:28,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:02:28,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:02:32,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:02:35,422 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 15:02:36,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:02:38,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:02:39,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:02:40,891 INFO [train.py:1046] (3/4) Epoch 27, batch 50, loss[loss=0.1527, simple_loss=0.2277, pruned_loss=0.03888, over 23672.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2461, pruned_loss=0.04582, over 1057502.93 frames. ], batch size: 149, lr: 3.83e-03, grad_scale: 32.0 2023-10-02 15:02:40,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 15:02:41,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:02:42,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:02:43,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:02:43,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=921100.0, ans=0.125 2023-10-02 15:02:44,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:02:46,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:02:50,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=921100.0, ans=0.1 2023-10-02 15:02:50,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=921100.0, ans=0.2 2023-10-02 15:02:51,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 15:02:51,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:02:52,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.29 vs. limit=6.0 2023-10-02 15:02:57,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:03:01,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 15:03:02,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 15:03:03,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=921166.6666666666, ans=6.0 2023-10-02 15:03:03,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:03:05,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:03:06,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:03:06,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:03:07,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:03:07,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 15:03:07,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:03:15,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:03:17,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:03:17,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:03:18,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 15:03:21,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:03:21,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:03:21,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 15:03:22,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:03:25,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 15:03:27,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=921300.0, ans=0.025 2023-10-02 15:03:33,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:03:33,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:03:35,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:03:36,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:03:36,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:03:39,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 15:03:40,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 15:03:43,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:03:43,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:03:44,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:03:44,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:03:44,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 15:03:44,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 15:03:46,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 15:03:47,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:03:48,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:03:48,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 15:03:48,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 15:03:50,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:03:51,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:03:53,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:03:53,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:03:54,370 INFO [train.py:1046] (3/4) Epoch 27, batch 100, loss[loss=0.1915, simple_loss=0.257, pruned_loss=0.06298, over 22764.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2475, pruned_loss=0.04547, over 1873878.73 frames. ], batch size: 322, lr: 3.83e-03, grad_scale: 16.0 2023-10-02 15:03:55,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:03:58,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:04:02,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:04:03,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 15:04:03,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:04:06,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:04:08,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:04:08,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:04:08,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:04:08,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:04:09,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 15:04:14,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:04:14,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:04:14,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:04:14,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:04:16,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=921500.0, ans=0.1 2023-10-02 15:04:18,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 15:04:19,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:04:20,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=921500.0, ans=0.0 2023-10-02 15:04:21,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:04:21,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:04:22,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:04:22,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=921566.6666666666, ans=0.125 2023-10-02 15:04:27,031 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 15:04:27,057 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 15:04:28,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:04:28,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:04:31,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:04:34,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:04:35,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:35,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=921566.6666666666, ans=0.125 2023-10-02 15:04:39,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=921633.3333333334, ans=0.125 2023-10-02 15:04:40,520 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.396e+02 1.778e+02 2.000e+02 2.218e+02 5.015e+02, threshold=4.000e+02, percent-clipped=0.0 2023-10-02 15:04:40,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:40,720 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 15:04:43,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 15:04:44,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.33 vs. limit=10.0 2023-10-02 15:04:46,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:04:46,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:04:47,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:50,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:04:54,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:04:55,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:04:57,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:04:57,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:04:58,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:04:58,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:04:58,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:05:00,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 15:05:00,222 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 15:05:00,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:05:02,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:05:02,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:02,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:02,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 15:05:02,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 15:05:03,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 15:05:03,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:04,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:05:05,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=921700.0, ans=0.125 2023-10-02 15:05:06,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:06,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:05:06,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:05:08,226 INFO [train.py:1046] (3/4) Epoch 27, batch 150, loss[loss=0.1639, simple_loss=0.2537, pruned_loss=0.03702, over 24626.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2469, pruned_loss=0.04542, over 2490475.26 frames. ], batch size: 68, lr: 3.83e-03, grad_scale: 16.0 2023-10-02 15:05:08,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=921766.6666666666, ans=0.125 2023-10-02 15:05:09,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:12,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:05:12,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:05:13,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:15,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:05:15,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:17,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.76 vs. limit=10.0 2023-10-02 15:05:18,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:05:18,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:23,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 15:05:23,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 15:05:23,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 15:05:25,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:05:25,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:05:26,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:05:28,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:05:28,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:05:29,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:29,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:05:31,382 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 15:05:34,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:05:38,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=921900.0, ans=0.125 2023-10-02 15:05:39,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:05:42,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:05:44,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 15:05:46,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=921900.0, ans=0.04949747468305833 2023-10-02 15:05:47,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:05:47,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:05:49,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:05:50,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:05:52,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:05:52,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:05:52,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=921966.6666666666, ans=0.0 2023-10-02 15:05:54,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:05:54,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 15:06:00,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:06:00,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=921966.6666666666, ans=0.0 2023-10-02 15:06:01,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:03,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:06:03,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:06:03,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=921966.6666666666, ans=0.2 2023-10-02 15:06:06,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:06:06,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 15:06:06,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=922033.3333333334, ans=0.1 2023-10-02 15:06:09,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:06:09,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=922033.3333333334, ans=0.1 2023-10-02 15:06:12,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:06:13,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:06:16,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:06:16,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 15:06:16,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:06:16,571 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 15:06:20,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:06:21,860 INFO [train.py:1046] (3/4) Epoch 27, batch 200, loss[loss=0.1827, simple_loss=0.2496, pruned_loss=0.05794, over 23847.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2468, pruned_loss=0.04507, over 2997977.79 frames. ], batch size: 195, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:06:23,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=922100.0, ans=0.125 2023-10-02 15:06:26,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:06:26,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:06:27,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 15:06:28,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:06:30,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:33,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 15:06:34,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:06:34,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:37,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:06:40,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:06:40,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:06:40,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:06:59,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:07:01,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:07:02,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:07:02,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:07:03,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:07:03,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:07:05,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:06,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:07:08,370 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.469e+02 1.865e+02 2.065e+02 2.281e+02 3.557e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-02 15:07:08,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:07:08,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:07:08,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 15:07:09,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 15:07:09,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:07:14,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:07:18,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:07:24,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:24,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:07:31,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:34,358 INFO [train.py:1046] (3/4) Epoch 27, batch 250, loss[loss=0.1667, simple_loss=0.2581, pruned_loss=0.03761, over 24564.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2476, pruned_loss=0.04572, over 3377660.34 frames. ], batch size: 71, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:07:34,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 15:07:35,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:07:35,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:07:35,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:07:35,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:07:37,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 15:07:39,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:07:39,052 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 15:07:40,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:42,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:07:42,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:42,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:07:44,325 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.01 vs. limit=22.5 2023-10-02 15:07:44,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:07:44,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:07:46,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:07:49,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:07:57,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=922500.0, ans=0.0 2023-10-02 15:07:58,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:08:03,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:08:03,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:08:09,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:08:09,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:08:11,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:08:12,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:08:12,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:08:12,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:08:12,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:08:15,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:08:16,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 15:08:16,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:08:17,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=922633.3333333334, ans=0.2 2023-10-02 15:08:19,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:08:20,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:08:20,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:08:20,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:08:21,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:08:21,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:08:22,442 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.02 vs. limit=15.0 2023-10-02 15:08:23,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=922633.3333333334, ans=0.0 2023-10-02 15:08:24,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:08:26,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:08:27,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:08:32,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:08:33,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=922700.0, ans=0.125 2023-10-02 15:08:36,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:08:37,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:08:43,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:08:43,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:08:47,740 INFO [train.py:1046] (3/4) Epoch 27, batch 300, loss[loss=0.168, simple_loss=0.231, pruned_loss=0.05247, over 23774.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2457, pruned_loss=0.04566, over 3659775.23 frames. ], batch size: 179, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:08:47,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 15:08:49,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:08:49,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:08:51,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 15:08:52,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:08:55,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:08:55,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 15:08:59,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:09:00,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:09:03,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:09:04,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 15:09:04,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:09:05,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:09:05,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 15:09:05,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:09:10,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:09:14,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:09:14,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 15:09:17,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 15:09:17,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:21,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:09:24,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:24,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 15:09:24,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:09:24,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:09:25,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:09:25,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:09:31,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 15:09:31,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 15:09:32,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:09:34,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:35,795 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.868e+02 2.076e+02 2.400e+02 4.267e+02, threshold=4.152e+02, percent-clipped=1.0 2023-10-02 15:09:35,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 15:09:37,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:09:41,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:09:43,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:09:43,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 15:09:46,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=923033.3333333334, ans=0.0 2023-10-02 15:09:47,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:47,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:09:48,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.68 vs. limit=10.0 2023-10-02 15:09:49,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:50,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:09:50,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 15:09:52,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:09:52,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:09:54,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 15:09:55,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:09:55,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:09:57,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:09:57,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:09:58,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:02,719 INFO [train.py:1046] (3/4) Epoch 27, batch 350, loss[loss=0.1564, simple_loss=0.2328, pruned_loss=0.04004, over 23568.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2439, pruned_loss=0.04538, over 3875646.54 frames. ], batch size: 149, lr: 3.82e-03, grad_scale: 8.0 2023-10-02 15:10:02,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:10:02,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 15:10:03,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=923100.0, ans=0.09899494936611666 2023-10-02 15:10:05,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:05,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=923100.0, ans=0.125 2023-10-02 15:10:07,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=923100.0, ans=0.125 2023-10-02 15:10:11,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:10:14,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:14,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:15,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 15:10:18,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:10:18,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 15:10:22,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:23,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 15:10:24,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:10:27,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 15:10:27,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=923166.6666666666, ans=0.0 2023-10-02 15:10:29,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:10:30,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:10:31,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:10:33,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:10:33,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:10:33,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:10:33,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:33,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:10:35,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:10:35,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:35,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=923233.3333333334, ans=0.09899494936611666 2023-10-02 15:10:42,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=923233.3333333334, ans=0.025 2023-10-02 15:10:43,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:10:43,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:10:44,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:10:45,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:50,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 15:10:50,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:10:56,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:10:56,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:10:56,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:10:57,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 15:11:00,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:00,959 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 15:11:02,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 15:11:02,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:05,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:11:05,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 15:11:07,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:09,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=923366.6666666666, ans=0.95 2023-10-02 15:11:11,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:11:12,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:14,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:14,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:11:14,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=923366.6666666666, ans=0.0 2023-10-02 15:11:16,748 INFO [train.py:1046] (3/4) Epoch 27, batch 400, loss[loss=0.1483, simple_loss=0.2201, pruned_loss=0.03827, over 24415.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2435, pruned_loss=0.0446, over 4056419.29 frames. ], batch size: 58, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:11:16,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:11:18,486 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:11:19,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:11:20,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:11:22,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 15:11:22,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:23,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:11:26,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:11:26,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:29,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:30,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=923500.0, ans=0.125 2023-10-02 15:11:30,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=923500.0, ans=0.0 2023-10-02 15:11:31,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:32,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 15:11:33,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 15:11:33,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:11:35,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 15:11:35,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:39,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:11:39,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:11:39,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 15:11:41,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:11:41,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:11:41,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:11:41,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=923500.0, ans=0.125 2023-10-02 15:11:43,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:11:44,629 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 15:11:44,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 15:11:48,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:11:50,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:11:50,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 15:11:52,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 15:11:55,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:11:55,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:12:03,394 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.781e+02 1.952e+02 2.127e+02 3.140e+02, threshold=3.905e+02, percent-clipped=0.0 2023-10-02 15:12:03,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 15:12:06,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:12:07,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 15:12:10,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:12:12,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:12:13,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 15:12:17,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:12:19,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:12:20,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:12:22,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:12:22,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 15:12:25,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:12:27,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 15:12:29,447 INFO [train.py:1046] (3/4) Epoch 27, batch 450, loss[loss=0.176, simple_loss=0.26, pruned_loss=0.04596, over 24358.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2441, pruned_loss=0.04482, over 4202515.80 frames. ], batch size: 77, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:12:29,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:12:29,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:12:31,745 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:12:32,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 15:12:34,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:12:34,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:12:36,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:12:37,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 15:12:37,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=923766.6666666666, ans=0.0 2023-10-02 15:12:38,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:12:39,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:12:39,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:12:39,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=923766.6666666666, ans=0.2 2023-10-02 15:12:40,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 15:12:40,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:12:41,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:12:43,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:12:52,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:12:52,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:12:55,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 15:12:55,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 15:12:59,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:13:00,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:13:01,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=923900.0, ans=0.04949747468305833 2023-10-02 15:13:02,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:13:05,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:13:07,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:13:08,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 15:13:09,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 15:13:11,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 15:13:11,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:13:12,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:13:12,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:13:13,708 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.68 vs. limit=15.0 2023-10-02 15:13:14,688 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 15:13:16,050 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 15:13:16,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:13:17,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:13:18,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 15:13:20,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:13:20,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:13:21,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 15:13:21,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 15:13:24,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:13:24,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=923966.6666666666, ans=10.0 2023-10-02 15:13:26,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:13:27,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:13:29,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 15:13:34,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:13:34,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 15:13:35,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 15:13:36,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=924033.3333333334, ans=0.125 2023-10-02 15:13:36,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=924033.3333333334, ans=0.125 2023-10-02 15:13:37,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:13:41,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:13:43,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:13:44,261 INFO [train.py:1046] (3/4) Epoch 27, batch 500, loss[loss=0.1609, simple_loss=0.2422, pruned_loss=0.03973, over 23735.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2448, pruned_loss=0.04461, over 4305497.08 frames. ], batch size: 85, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:13:44,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:13:46,148 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 15:13:50,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:13:50,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:13:51,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:13:51,700 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 15:13:53,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 15:13:53,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:13:55,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 15:14:00,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 15:14:01,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:14:03,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:14:03,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:14:05,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:10,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=924166.6666666666, ans=0.125 2023-10-02 15:14:14,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:14:14,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:14:14,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:14:15,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:14:15,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 15:14:15,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:14:17,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:14:18,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:14:18,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:14:18,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:14:19,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=924233.3333333334, ans=0.0 2023-10-02 15:14:20,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 15:14:20,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=924233.3333333334, ans=0.0 2023-10-02 15:14:20,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=924233.3333333334, ans=0.95 2023-10-02 15:14:20,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=924233.3333333334, ans=0.125 2023-10-02 15:14:21,668 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 15:14:24,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:14:25,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:25,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=924233.3333333334, ans=0.0 2023-10-02 15:14:27,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:27,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:27,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:14:30,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 15:14:31,511 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.886e+02 2.128e+02 2.373e+02 3.584e+02, threshold=4.256e+02, percent-clipped=0.0 2023-10-02 15:14:34,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:14:34,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:14:38,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:14:39,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:14:44,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:14:47,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 15:14:47,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:14:47,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:14:47,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=924366.6666666666, ans=0.125 2023-10-02 15:14:51,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 15:14:51,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:14:52,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:14:57,925 INFO [train.py:1046] (3/4) Epoch 27, batch 550, loss[loss=0.167, simple_loss=0.2383, pruned_loss=0.04786, over 23501.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2457, pruned_loss=0.04528, over 4387595.15 frames. ], batch size: 134, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:14:58,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 15:15:01,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 15:15:02,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:15:02,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 15:15:02,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:15:02,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:15:04,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:04,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:06,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:15:06,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:15:09,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:15:09,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 15:15:09,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:15:13,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:13,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:17,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:15:19,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:23,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 15:15:24,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 15:15:25,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:15:30,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:15:30,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:15:30,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=924566.6666666666, ans=0.07 2023-10-02 15:15:31,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:15:36,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:36,954 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 15:15:38,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:15:38,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:15:41,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:15:43,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:15:43,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:15:44,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:44,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 15:15:47,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 15:15:47,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:15:47,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:15:48,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:15:48,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:15:50,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:15:51,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:15:54,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:15:55,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:15:55,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 15:15:56,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=924700.0, ans=0.5 2023-10-02 15:15:57,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:15:58,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:15:59,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:16:00,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:01,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 15:16:01,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 15:16:07,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 15:16:10,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 15:16:10,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:16:10,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:16:10,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:16:12,157 INFO [train.py:1046] (3/4) Epoch 27, batch 600, loss[loss=0.1644, simple_loss=0.2462, pruned_loss=0.04126, over 24340.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2459, pruned_loss=0.04547, over 4465971.96 frames. ], batch size: 61, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:16:18,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:16:20,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:16:21,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=924766.6666666666, ans=0.2 2023-10-02 15:16:22,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 15:16:23,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:16:26,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:16:29,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:30,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 15:16:31,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:16:37,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=924833.3333333334, ans=0.125 2023-10-02 15:16:39,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 15:16:43,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:16:43,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:43,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:16:43,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=924900.0, ans=0.125 2023-10-02 15:16:49,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:16:49,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:16:49,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:16:50,369 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.81 vs. limit=15.0 2023-10-02 15:16:53,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:16:57,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:16:57,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:16:57,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:16:59,384 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.876e+02 2.029e+02 2.331e+02 3.965e+02, threshold=4.058e+02, percent-clipped=0.0 2023-10-02 15:17:02,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 15:17:09,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:17:09,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:17:13,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 15:17:15,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:17:16,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 15:17:16,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:17:17,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:17:22,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 15:17:22,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:17:25,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:17:25,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:17:26,473 INFO [train.py:1046] (3/4) Epoch 27, batch 650, loss[loss=0.1571, simple_loss=0.2199, pruned_loss=0.04717, over 23387.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2447, pruned_loss=0.04527, over 4522723.34 frames. ], batch size: 285, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:17:26,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:17:30,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 15:17:32,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:17:33,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.15 vs. limit=10.0 2023-10-02 15:17:34,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=925100.0, ans=0.125 2023-10-02 15:17:38,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:17:38,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:17:41,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:17:44,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 15:17:46,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:17:47,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:17:51,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:17:51,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 15:17:52,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=925166.6666666666, ans=0.0 2023-10-02 15:17:53,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:17:53,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:17:54,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:17:56,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:17:57,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:17:57,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:17:57,614 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 15:17:57,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:17:58,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:18:00,772 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.66 vs. limit=15.0 2023-10-02 15:18:00,932 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.02 vs. limit=15.0 2023-10-02 15:18:02,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:02,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:18:03,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:18:05,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:18:05,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 15:18:07,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:18:07,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:18:07,764 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.29 vs. limit=15.0 2023-10-02 15:18:08,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:18:08,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:18:08,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=925233.3333333334, ans=0.125 2023-10-02 15:18:08,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=925233.3333333334, ans=0.0 2023-10-02 15:18:09,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:18:09,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 15:18:13,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 15:18:13,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:13,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:18:13,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:18:13,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:18:14,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:18:20,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:21,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.33 vs. limit=15.0 2023-10-02 15:18:21,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:18:22,393 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.30 vs. limit=22.5 2023-10-02 15:18:23,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:18:25,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:18:25,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:18:27,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:18:33,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:18:33,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:18:34,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:18:35,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:18:41,048 INFO [train.py:1046] (3/4) Epoch 27, batch 700, loss[loss=0.1645, simple_loss=0.2534, pruned_loss=0.03775, over 23987.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2438, pruned_loss=0.04498, over 4567622.85 frames. ], batch size: 80, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:18:41,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 15:18:42,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 15:18:46,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=925433.3333333334, ans=0.125 2023-10-02 15:18:47,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 15:18:47,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:18:48,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:18:50,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 15:18:55,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:18:58,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:18:59,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:19:00,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:19:01,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:19:05,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:19:05,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=925500.0, ans=0.04949747468305833 2023-10-02 15:19:08,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 15:19:08,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:19:10,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 15:19:13,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 15:19:15,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:19:15,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:19:17,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:19:21,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:19:21,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 15:19:25,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:19:26,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:19:26,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 15:19:27,205 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.25 vs. limit=22.5 2023-10-02 15:19:27,926 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.868e+02 2.051e+02 2.317e+02 3.367e+02, threshold=4.102e+02, percent-clipped=0.0 2023-10-02 15:19:30,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:19:33,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:19:34,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:19:38,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=925700.0, ans=0.1 2023-10-02 15:19:40,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=925700.0, ans=0.0 2023-10-02 15:19:41,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:19:42,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 15:19:45,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 15:19:45,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=925700.0, ans=0.125 2023-10-02 15:19:47,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 15:19:48,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:19:50,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:19:50,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:19:51,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:19:51,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 15:19:54,429 INFO [train.py:1046] (3/4) Epoch 27, batch 750, loss[loss=0.1701, simple_loss=0.2491, pruned_loss=0.04553, over 23291.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2442, pruned_loss=0.04425, over 4612797.02 frames. ], batch size: 105, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:19:55,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 15:19:55,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 15:19:55,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 15:19:56,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 15:19:56,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 15:19:57,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:19:57,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 15:19:58,009 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.93 vs. limit=6.0 2023-10-02 15:19:58,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:20:00,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:20:01,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:20:02,203 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.72 vs. limit=15.0 2023-10-02 15:20:02,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:20:04,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:20:04,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:20:07,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:20:09,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:20:11,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:20:12,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:20:14,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:20:14,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 15:20:16,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:20:17,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:20:18,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:20:20,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:20:22,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 15:20:22,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:20:23,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 15:20:24,872 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 15:20:24,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 15:20:24,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:20:24,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:20:26,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:20:33,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:20:33,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:20:33,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:20:36,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:20:37,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:20:39,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 15:20:40,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:20:42,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 15:20:43,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:20:46,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:20:47,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 15:20:47,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:20:51,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:20:53,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:20:53,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:20:55,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:20:58,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 15:20:58,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:20:58,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:21:02,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:21:02,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:21:04,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:21:04,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:21:08,217 INFO [train.py:1046] (3/4) Epoch 27, batch 800, loss[loss=0.169, simple_loss=0.2429, pruned_loss=0.04758, over 23377.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2447, pruned_loss=0.04458, over 4630437.54 frames. ], batch size: 119, lr: 3.82e-03, grad_scale: 32.0 2023-10-02 15:21:10,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=926100.0, ans=0.2 2023-10-02 15:21:11,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:21:11,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:14,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:21:14,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:21:14,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:15,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:17,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:19,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=926100.0, ans=0.04949747468305833 2023-10-02 15:21:20,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:21:21,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:21:22,203 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.07 vs. limit=22.5 2023-10-02 15:21:24,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 15:21:25,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:27,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:21:27,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:21:27,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:21:27,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 15:21:27,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:21:28,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 15:21:31,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:32,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:21:34,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:21:34,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:21:37,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:37,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:21:41,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=926233.3333333334, ans=0.125 2023-10-02 15:21:43,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:21:45,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:21:45,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 15:21:45,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=926233.3333333334, ans=0.0 2023-10-02 15:21:47,066 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 15:21:47,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 15:21:47,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=926233.3333333334, ans=0.04949747468305833 2023-10-02 15:21:48,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:21:48,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:21:49,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:21:49,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:21:55,123 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.828e+02 2.029e+02 2.331e+02 3.011e+02, threshold=4.058e+02, percent-clipped=0.0 2023-10-02 15:21:56,502 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 15:21:56,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 15:21:57,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:22:00,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:22:04,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:22:07,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:22:09,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 15:22:09,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:22:11,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 15:22:18,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:22:21,349 INFO [train.py:1046] (3/4) Epoch 27, batch 850, loss[loss=0.1643, simple_loss=0.2522, pruned_loss=0.03824, over 24635.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.245, pruned_loss=0.04469, over 4665130.56 frames. ], batch size: 68, lr: 3.82e-03, grad_scale: 16.0 2023-10-02 15:22:22,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:22:22,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 15:22:24,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:22:24,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:22:25,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 15:22:25,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:22:26,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:22:28,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:22:29,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:22:31,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:22:31,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 15:22:31,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 15:22:31,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 15:22:32,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:22:32,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:22:35,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:22:35,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:22:35,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:22:39,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:22:39,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:22:40,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 15:22:44,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 15:22:47,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:22:49,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 15:22:53,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 15:22:55,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 15:22:56,536 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 15:22:56,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:22:56,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:22:56,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:22:58,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=926566.6666666666, ans=0.125 2023-10-02 15:23:00,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:23:02,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:23:02,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 15:23:03,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:23:04,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:23:06,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:23:06,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:23:08,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:23:10,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 15:23:10,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 15:23:14,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:23:14,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:23:16,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:23:16,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:23:16,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:23:20,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:23:24,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:23:25,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:23:26,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:23:26,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:23:34,945 INFO [train.py:1046] (3/4) Epoch 27, batch 900, loss[loss=0.1486, simple_loss=0.228, pruned_loss=0.03466, over 24583.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2462, pruned_loss=0.0451, over 4681475.40 frames. ], batch size: 60, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:23:35,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:23:35,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:23:36,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 15:23:37,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:23:37,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:23:39,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 15:23:44,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:23:45,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=926766.6666666666, ans=10.0 2023-10-02 15:23:48,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:23:48,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 15:23:51,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:23:52,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 15:23:53,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 15:23:54,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:23:54,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:23:56,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:23:56,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:24:02,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=926833.3333333334, ans=0.0 2023-10-02 15:24:04,241 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.19 vs. limit=15.0 2023-10-02 15:24:05,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:05,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:24:06,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:24:07,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:24:09,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=926900.0, ans=0.125 2023-10-02 15:24:13,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 15:24:13,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=926900.0, ans=0.125 2023-10-02 15:24:14,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:24:14,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=926900.0, ans=0.1 2023-10-02 15:24:17,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=926966.6666666666, ans=0.2 2023-10-02 15:24:19,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:24:21,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:24:21,446 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 15:24:22,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 15:24:24,177 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.889e+02 2.142e+02 2.586e+02 4.212e+02, threshold=4.284e+02, percent-clipped=1.0 2023-10-02 15:24:27,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:24:27,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:24:27,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:24:34,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:34,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:24:36,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 15:24:36,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:24:39,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 15:24:39,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=927033.3333333334, ans=0.0 2023-10-02 15:24:40,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:24:40,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:43,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:24:43,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:24:46,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 15:24:47,557 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 15:24:47,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=927100.0, ans=0.125 2023-10-02 15:24:49,476 INFO [train.py:1046] (3/4) Epoch 27, batch 950, loss[loss=0.1589, simple_loss=0.2384, pruned_loss=0.03974, over 24575.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2461, pruned_loss=0.04535, over 4696447.41 frames. ], batch size: 60, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:24:49,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 15:24:49,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 15:24:51,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:24:54,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 15:24:59,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:25:00,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:01,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=927100.0, ans=0.0 2023-10-02 15:25:02,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:02,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 15:25:04,997 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 15:25:06,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=927166.6666666666, ans=0.2 2023-10-02 15:25:09,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:10,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:25:10,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:25:10,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:25:10,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 15:25:11,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 15:25:13,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:14,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 15:25:15,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:25:19,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:19,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:25:19,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:25:20,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 15:25:23,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 15:25:25,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:25:25,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:25:31,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:25:31,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:25:34,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 15:25:36,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 15:25:36,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:25:37,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:25:38,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:38,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:25:41,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 15:25:43,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:25:44,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:25:45,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:25:45,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 15:25:45,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:45,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:25:47,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 15:25:50,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:25:53,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:25:55,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:25:57,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 15:25:57,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 15:26:01,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:26:04,262 INFO [train.py:1046] (3/4) Epoch 27, batch 1000, loss[loss=0.171, simple_loss=0.2356, pruned_loss=0.05319, over 23787.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2455, pruned_loss=0.04531, over 4700696.55 frames. ], batch size: 232, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:26:07,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 15:26:07,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:26:07,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=927433.3333333334, ans=0.025 2023-10-02 15:26:08,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=927433.3333333334, ans=0.05 2023-10-02 15:26:11,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:26:12,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 15:26:12,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 15:26:15,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:26:15,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:26:15,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=927433.3333333334, ans=0.04949747468305833 2023-10-02 15:26:16,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:26:20,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 15:26:24,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 15:26:26,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 15:26:26,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:26:29,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 15:26:29,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 15:26:29,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 15:26:32,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:26:33,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:26:40,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:26:41,612 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.74 vs. limit=15.0 2023-10-02 15:26:42,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:26:43,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:26:44,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:26:44,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 15:26:44,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:26:44,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:26:46,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:26:46,366 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 15:26:49,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 15:26:51,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 15:26:53,526 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.843e+02 1.984e+02 2.193e+02 2.939e+02, threshold=3.969e+02, percent-clipped=0.0 2023-10-02 15:26:54,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 15:26:56,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:27:01,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:02,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:27:02,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:03,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:27:05,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 15:27:05,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:27:06,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 15:27:06,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 15:27:07,352 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.59 vs. limit=12.0 2023-10-02 15:27:08,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=927700.0, ans=0.125 2023-10-02 15:27:09,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:27:09,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:27:12,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:27:13,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:27:15,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:27:18,233 INFO [train.py:1046] (3/4) Epoch 27, batch 1050, loss[loss=0.1564, simple_loss=0.2218, pruned_loss=0.04556, over 22755.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2443, pruned_loss=0.04487, over 4703925.09 frames. ], batch size: 322, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:27:18,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:27:18,568 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:27:18,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=927766.6666666666, ans=0.0 2023-10-02 15:27:19,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:27:21,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:27:21,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:24,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:27:25,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:27:28,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:27:30,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:27:30,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:27:31,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:27:31,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:27:33,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 15:27:33,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:27:33,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 15:27:36,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:27:36,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 15:27:37,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:27:41,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:27:43,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:27:43,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:27:45,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 15:27:45,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 15:27:48,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:27:51,433 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.11 vs. limit=22.5 2023-10-02 15:27:52,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 15:27:53,480 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.17 vs. limit=22.5 2023-10-02 15:27:56,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 15:27:58,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:28:00,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 15:28:02,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 15:28:04,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:28:04,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:28:06,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:28:08,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=927966.6666666666, ans=0.0 2023-10-02 15:28:08,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=927966.6666666666, ans=0.0 2023-10-02 15:28:08,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=927966.6666666666, ans=0.0 2023-10-02 15:28:10,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 15:28:12,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 15:28:12,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 15:28:12,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:28:14,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:28:15,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 15:28:16,319 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.93 vs. limit=15.0 2023-10-02 15:28:18,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:28:19,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:28:19,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:28:19,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:28:19,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:28:24,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:28:24,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 15:28:26,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:28:26,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 15:28:26,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 15:28:28,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:28:29,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=928033.3333333334, ans=0.125 2023-10-02 15:28:31,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:28:31,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=928100.0, ans=0.2 2023-10-02 15:28:32,694 INFO [train.py:1046] (3/4) Epoch 27, batch 1100, loss[loss=0.1885, simple_loss=0.2743, pruned_loss=0.05136, over 24337.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2442, pruned_loss=0.04426, over 4712671.33 frames. ], batch size: 77, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:28:32,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=928100.0, ans=0.1 2023-10-02 15:28:35,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=928100.0, ans=10.0 2023-10-02 15:28:36,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:28:39,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:28:41,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:28:41,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=928100.0, ans=0.125 2023-10-02 15:28:42,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:28:42,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 15:28:43,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:28:45,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 15:28:46,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:28:50,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:28:50,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 15:28:51,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:28:53,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:28:53,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:28:56,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:28:59,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:29:01,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=928233.3333333334, ans=0.125 2023-10-02 15:29:02,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:29:06,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 15:29:06,964 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 15:29:07,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:07,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=928233.3333333334, ans=0.125 2023-10-02 15:29:09,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:09,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:29:09,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:29:11,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 15:29:11,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:29:11,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:29:12,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:29:12,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:12,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 15:29:19,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:29:20,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 15:29:21,250 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.794e+02 1.965e+02 2.259e+02 3.177e+02, threshold=3.930e+02, percent-clipped=0.0 2023-10-02 15:29:21,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:29:27,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:29:31,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 15:29:31,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 15:29:33,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:29:35,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:29:36,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:29:37,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 15:29:39,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:29:39,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:29:40,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 15:29:40,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:29:42,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 15:29:43,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:29:43,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:29:43,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:29:46,244 INFO [train.py:1046] (3/4) Epoch 27, batch 1150, loss[loss=0.1719, simple_loss=0.2581, pruned_loss=0.04287, over 24563.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2454, pruned_loss=0.04467, over 4699071.52 frames. ], batch size: 71, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:29:47,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:29:51,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:29:52,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:29:54,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:29:54,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 15:29:54,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:29:57,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 15:29:58,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:29:58,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:30:04,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 15:30:04,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=928500.0, ans=10.0 2023-10-02 15:30:06,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:30:09,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:30:09,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:30:10,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 15:30:10,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:30:10,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:30:14,530 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.66 vs. limit=15.0 2023-10-02 15:30:15,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 15:30:15,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:30:16,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:30:27,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:30:34,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:30:34,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 15:30:34,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:30:34,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:30:40,514 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 15:30:43,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:30:48,736 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 15:30:51,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:30:53,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:30:53,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:30:53,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:30:57,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:31:01,339 INFO [train.py:1046] (3/4) Epoch 27, batch 1200, loss[loss=0.1482, simple_loss=0.2261, pruned_loss=0.03519, over 24427.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2453, pruned_loss=0.04492, over 4708255.11 frames. ], batch size: 58, lr: 3.81e-03, grad_scale: 32.0 2023-10-02 15:31:02,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:31:02,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:31:03,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=928766.6666666666, ans=0.125 2023-10-02 15:31:04,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:31:04,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:31:04,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:31:07,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=928766.6666666666, ans=0.125 2023-10-02 15:31:08,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:31:08,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=928766.6666666666, ans=0.2 2023-10-02 15:31:10,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:31:11,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:31:11,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:31:11,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=928766.6666666666, ans=0.0 2023-10-02 15:31:13,715 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.22 vs. limit=10.0 2023-10-02 15:31:14,313 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 15:31:17,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 15:31:18,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:31:21,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:31:24,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:31:26,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:31:26,192 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 15:31:27,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:31:35,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=928900.0, ans=0.125 2023-10-02 15:31:36,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:31:36,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:31:36,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 15:31:36,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:31:39,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=928900.0, ans=0.0 2023-10-02 15:31:40,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 15:31:42,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 15:31:42,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:31:42,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=928900.0, ans=0.125 2023-10-02 15:31:43,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:31:45,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:31:46,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:31:48,994 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.904e+02 2.120e+02 2.564e+02 3.915e+02, threshold=4.240e+02, percent-clipped=0.0 2023-10-02 15:31:49,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:31:49,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:31:49,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:31:50,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 15:31:50,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:31:51,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:31:51,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:31:52,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=928966.6666666666, ans=0.1 2023-10-02 15:31:54,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:31:54,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:31:57,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 15:31:59,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:32:02,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 15:32:06,940 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 15:32:08,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:32:09,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:32:11,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:32:11,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:32:11,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=929033.3333333334, ans=0.125 2023-10-02 15:32:13,714 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.81 vs. limit=22.5 2023-10-02 15:32:14,399 INFO [train.py:1046] (3/4) Epoch 27, batch 1250, loss[loss=0.1893, simple_loss=0.2686, pruned_loss=0.05499, over 23995.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2462, pruned_loss=0.0458, over 4709417.16 frames. ], batch size: 80, lr: 3.81e-03, grad_scale: 32.0 2023-10-02 15:32:14,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 15:32:18,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:32:20,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:32:21,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 15:32:21,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=929100.0, ans=0.125 2023-10-02 15:32:22,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:32:25,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:32:29,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 15:32:31,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:32:32,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:32:32,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:32:35,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:32:35,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=929166.6666666666, ans=0.125 2023-10-02 15:32:38,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 15:32:38,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:32:38,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:32:40,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:32:41,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=929166.6666666666, ans=0.04949747468305833 2023-10-02 15:32:42,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:32:43,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:32:45,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:32:45,734 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.88 vs. limit=12.0 2023-10-02 15:32:49,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 15:32:50,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:32:53,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:32:53,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 15:32:55,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:32:55,149 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 15:32:55,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:32:55,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:32:56,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=929300.0, ans=0.0 2023-10-02 15:32:58,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=929300.0, ans=0.2 2023-10-02 15:32:59,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:33:00,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.96 vs. limit=12.0 2023-10-02 15:33:04,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:33:04,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:33:05,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 15:33:06,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 15:33:06,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 15:33:09,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:33:10,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 15:33:10,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:33:13,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 15:33:13,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:33:14,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 15:33:14,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 15:33:15,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:33:15,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 15:33:17,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:33:19,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 15:33:21,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:33:23,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:33:24,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:33:25,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.72 vs. limit=15.0 2023-10-02 15:33:26,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:33:27,276 INFO [train.py:1046] (3/4) Epoch 27, batch 1300, loss[loss=0.2353, simple_loss=0.3003, pruned_loss=0.08513, over 19313.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2467, pruned_loss=0.04575, over 4706482.40 frames. ], batch size: 388, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:33:29,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:33:29,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 15:33:34,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=929433.3333333334, ans=0.0 2023-10-02 15:33:35,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:33:36,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 15:33:38,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:33:39,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:33:41,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:33:42,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 15:33:46,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:33:47,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:33:49,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 15:33:53,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:33:56,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:33:58,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:33:59,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:34:01,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:34:01,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:34:03,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 15:34:03,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 15:34:09,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:34:09,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:34:09,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 15:34:11,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 15:34:12,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:34:15,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:34:16,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 15:34:17,772 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.852e+02 2.104e+02 2.381e+02 3.588e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-02 15:34:17,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:34:17,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 15:34:19,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:34:20,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:34:20,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:34:24,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 15:34:25,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 15:34:26,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 15:34:31,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:34:31,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=929700.0, ans=0.0 2023-10-02 15:34:33,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 15:34:33,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=929700.0, ans=0.2 2023-10-02 15:34:34,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:34:34,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=929700.0, ans=0.025 2023-10-02 15:34:39,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=929700.0, ans=0.07 2023-10-02 15:34:41,790 INFO [train.py:1046] (3/4) Epoch 27, batch 1350, loss[loss=0.1569, simple_loss=0.2407, pruned_loss=0.03659, over 24302.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2462, pruned_loss=0.04588, over 4705194.13 frames. ], batch size: 61, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:34:41,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 15:34:44,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:34:47,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:34:50,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:34:50,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:34:53,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:34:53,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:34:57,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:35:00,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 15:35:02,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:35:02,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:35:04,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=929833.3333333334, ans=0.125 2023-10-02 15:35:05,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 15:35:06,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=929833.3333333334, ans=0.125 2023-10-02 15:35:07,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:35:08,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:35:08,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 15:35:09,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 15:35:11,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 15:35:12,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:35:12,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 15:35:14,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=929900.0, ans=0.1 2023-10-02 15:35:22,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:35:31,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:35:31,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:35:31,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 15:35:32,295 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.06 vs. limit=15.0 2023-10-02 15:35:35,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:35:37,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 15:35:38,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:35:38,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:35:41,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:35:42,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 15:35:45,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:35:49,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 15:35:51,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 15:35:55,183 INFO [train.py:1046] (3/4) Epoch 27, batch 1400, loss[loss=0.1854, simple_loss=0.2707, pruned_loss=0.05005, over 24385.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2453, pruned_loss=0.0457, over 4702034.57 frames. ], batch size: 77, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:35:55,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 15:35:57,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:35:57,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.64 vs. limit=15.0 2023-10-02 15:35:59,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:36:00,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:36:07,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 15:36:07,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 15:36:11,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.46 vs. limit=12.0 2023-10-02 15:36:13,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=930166.6666666666, ans=0.2 2023-10-02 15:36:16,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:36:18,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:36:19,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:36:21,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:36:26,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:36:26,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 15:36:36,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:36:36,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:36:41,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 15:36:41,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:36:42,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:36:44,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:36:44,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:36:45,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:36:45,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:36:45,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=930300.0, ans=0.125 2023-10-02 15:36:46,666 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.831e+02 2.061e+02 2.226e+02 3.360e+02, threshold=4.122e+02, percent-clipped=0.0 2023-10-02 15:36:46,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:36:48,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 15:36:48,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:36:51,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:36:51,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=930300.0, ans=0.125 2023-10-02 15:36:55,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:36:55,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=930366.6666666666, ans=0.0 2023-10-02 15:37:01,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 15:37:02,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 15:37:02,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=930366.6666666666, ans=0.0 2023-10-02 15:37:03,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:37:05,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 15:37:06,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:37:09,955 INFO [train.py:1046] (3/4) Epoch 27, batch 1450, loss[loss=0.1675, simple_loss=0.233, pruned_loss=0.05101, over 22790.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2444, pruned_loss=0.04493, over 4707435.63 frames. ], batch size: 322, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:37:10,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:37:14,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:37:16,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:37:16,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:16,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 15:37:21,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:37:22,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:37:22,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:37:23,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 15:37:25,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:37:26,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 15:37:26,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:26,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:26,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 15:37:28,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:37:28,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:37:29,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 15:37:29,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:30,401 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.28 vs. limit=15.0 2023-10-02 15:37:30,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:37:31,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:34,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:34,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=930500.0, ans=0.0 2023-10-02 15:37:39,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:37:39,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:37:42,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:37:42,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:43,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:37:43,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:37:43,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:37:45,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:37:46,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=930566.6666666666, ans=0.125 2023-10-02 15:37:49,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 15:37:51,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:37:54,753 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 15:37:56,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:37:57,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:37:58,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:00,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 15:38:05,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:38:05,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 15:38:07,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 15:38:09,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:12,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:38:13,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:38:15,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 15:38:18,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 15:38:18,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 15:38:19,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:38:20,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:38:23,503 INFO [train.py:1046] (3/4) Epoch 27, batch 1500, loss[loss=0.1784, simple_loss=0.2505, pruned_loss=0.05315, over 22849.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2446, pruned_loss=0.04481, over 4715149.02 frames. ], batch size: 323, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:38:26,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=930766.6666666666, ans=0.125 2023-10-02 15:38:29,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 15:38:29,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:38:29,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:38:30,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:30,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=930766.6666666666, ans=0.0 2023-10-02 15:38:31,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:38:31,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:38:33,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 15:38:35,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:38:35,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:38:35,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:38:36,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:38:38,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:38:40,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:38:44,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:38:44,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 15:38:45,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:38:47,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:38:47,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:38:51,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 15:38:55,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 15:38:56,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:38:56,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 15:38:58,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 15:38:59,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:39:00,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:39:00,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:39:01,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 15:39:02,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:39:02,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:39:02,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 15:39:04,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:39:08,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=930966.6666666666, ans=0.125 2023-10-02 15:39:11,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:39:11,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 15:39:13,896 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.907e+02 2.123e+02 2.554e+02 3.367e+02, threshold=4.246e+02, percent-clipped=0.0 2023-10-02 15:39:15,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 15:39:17,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:39:19,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=930966.6666666666, ans=0.2 2023-10-02 15:39:22,146 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 15:39:22,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:22,200 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 15:39:23,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:39:24,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:39:26,376 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 15:39:27,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:39:27,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=931033.3333333334, ans=0.5 2023-10-02 15:39:30,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 15:39:33,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:36,353 INFO [train.py:1046] (3/4) Epoch 27, batch 1550, loss[loss=0.156, simple_loss=0.2341, pruned_loss=0.03894, over 24312.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2458, pruned_loss=0.04513, over 4724479.41 frames. ], batch size: 56, lr: 3.81e-03, grad_scale: 16.0 2023-10-02 15:39:36,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:39:36,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:38,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:39:38,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:39:38,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:39:41,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 15:39:41,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 15:39:41,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:39:43,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 15:39:43,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 15:39:44,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:39:46,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:39:46,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:39:46,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:39:46,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=931100.0, ans=0.0 2023-10-02 15:39:48,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:39:49,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:39:50,841 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 15:39:51,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=931166.6666666666, ans=0.0 2023-10-02 15:39:52,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:39:52,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:39:53,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:39:54,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:39:54,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 15:39:56,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:39:56,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 15:39:56,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=931166.6666666666, ans=0.025 2023-10-02 15:39:57,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 15:39:57,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 15:39:59,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:00,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:40:01,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.34 vs. limit=15.0 2023-10-02 15:40:03,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:40:04,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 15:40:04,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 15:40:14,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:40:18,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:40:18,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:40:18,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:40:20,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 15:40:26,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 15:40:28,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:29,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:40:31,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:40:32,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:40:32,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 15:40:33,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:40:35,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:40:35,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:35,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 15:40:35,307 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 15:40:38,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:40:43,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 15:40:45,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=931366.6666666666, ans=0.2 2023-10-02 15:40:46,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=931366.6666666666, ans=0.0 2023-10-02 15:40:49,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:40:50,808 INFO [train.py:1046] (3/4) Epoch 27, batch 1600, loss[loss=0.16, simple_loss=0.2403, pruned_loss=0.03978, over 24453.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2463, pruned_loss=0.04558, over 4730522.55 frames. ], batch size: 63, lr: 3.81e-03, grad_scale: 32.0 2023-10-02 15:40:50,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:40:52,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 15:40:52,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:40:53,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:40:53,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:40:53,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:40:54,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:40:59,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:40:59,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 15:40:59,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=931433.3333333334, ans=0.125 2023-10-02 15:41:00,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 15:41:01,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 15:41:03,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:41:04,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 15:41:05,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:41:09,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:41:09,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=931500.0, ans=0.0 2023-10-02 15:41:14,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:41:18,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 15:41:19,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:41:21,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 15:41:21,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:21,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 15:41:27,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 15:41:34,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:41:34,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=931633.3333333334, ans=0.1 2023-10-02 15:41:35,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 15:41:35,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:41:35,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:41:35,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:41:37,423 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.47 vs. limit=15.0 2023-10-02 15:41:38,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 15:41:41,438 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.847e+02 2.065e+02 2.421e+02 3.334e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-02 15:41:41,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 15:41:43,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:41:43,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:44,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:45,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:41:46,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:41:47,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:41:49,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:41:54,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:41:56,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:41:58,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 15:41:58,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:41:59,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 15:42:00,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=931700.0, ans=0.09899494936611666 2023-10-02 15:42:01,352 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.59 vs. limit=5.0 2023-10-02 15:42:03,087 INFO [train.py:1046] (3/4) Epoch 27, batch 1650, loss[loss=0.1648, simple_loss=0.2321, pruned_loss=0.0487, over 18587.00 frames. ], tot_loss[loss=0.1698, simple_loss=0.2469, pruned_loss=0.0463, over 4711991.23 frames. ], batch size: 40, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:42:03,367 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:42:04,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:42:05,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:42:07,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:42:07,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 15:42:07,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 15:42:07,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 15:42:09,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 15:42:12,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:42:12,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:42:14,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:42:14,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:42:15,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=931766.6666666666, ans=0.125 2023-10-02 15:42:15,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=931766.6666666666, ans=0.0 2023-10-02 15:42:17,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:42:18,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 15:42:19,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:42:19,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:42:19,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:42:20,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:42:21,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 15:42:22,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 15:42:22,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=931833.3333333334, ans=0.125 2023-10-02 15:42:26,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=931833.3333333334, ans=0.0 2023-10-02 15:42:27,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:42:30,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:42:31,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=931900.0, ans=0.0 2023-10-02 15:42:37,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 15:42:37,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:42:37,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=931900.0, ans=0.125 2023-10-02 15:42:41,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 15:42:43,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:42:46,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:42:47,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:42:47,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:42:49,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:42:49,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:42:53,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:42:53,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:42:53,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:42:53,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:42:55,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:42:56,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:42:58,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:42:59,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 15:43:01,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:43:01,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 15:43:01,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=932033.3333333334, ans=0.125 2023-10-02 15:43:02,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 15:43:02,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 15:43:02,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:43:04,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:43:05,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:43:05,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:43:05,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 15:43:10,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:43:10,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=932033.3333333334, ans=0.1 2023-10-02 15:43:11,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:43:12,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:43:15,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 15:43:18,086 INFO [train.py:1046] (3/4) Epoch 27, batch 1700, loss[loss=0.1627, simple_loss=0.2446, pruned_loss=0.04046, over 24479.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2459, pruned_loss=0.04603, over 4718771.77 frames. ], batch size: 63, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:43:18,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:43:18,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:43:18,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 15:43:18,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:43:18,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:43:18,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:43:20,254 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.28 vs. limit=15.0 2023-10-02 15:43:21,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:43:21,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:43:22,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 15:43:25,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:43:25,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=932100.0, ans=0.0 2023-10-02 15:43:33,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:43:36,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:43:43,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:43:43,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:43:45,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:43:45,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:43:45,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=932166.6666666666, ans=0.125 2023-10-02 15:43:47,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 15:43:49,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:43:49,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:43:50,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:43:52,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:43:53,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 15:43:53,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=932233.3333333334, ans=0.125 2023-10-02 15:43:54,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 15:43:56,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:43:56,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=932233.3333333334, ans=0.07 2023-10-02 15:43:59,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 15:43:59,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:44:06,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:44:07,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:09,558 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.921e+02 2.104e+02 2.385e+02 3.447e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-02 15:44:09,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:44:12,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:44:12,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 15:44:12,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:44:13,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=932300.0, ans=0.0 2023-10-02 15:44:14,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:44:14,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 15:44:15,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:44:15,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:44:15,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:44:15,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:44:17,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=932366.6666666666, ans=0.0 2023-10-02 15:44:18,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:44:18,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:44:19,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:19,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:44:19,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:44:25,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:44:25,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 15:44:27,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:44:28,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:44:31,725 INFO [train.py:1046] (3/4) Epoch 27, batch 1750, loss[loss=0.1528, simple_loss=0.2385, pruned_loss=0.03358, over 24494.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2444, pruned_loss=0.04524, over 4720691.78 frames. ], batch size: 66, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:44:31,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 15:44:37,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:37,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:44:38,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 15:44:38,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 15:44:38,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=932433.3333333334, ans=0.0 2023-10-02 15:44:40,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:44:43,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:44:43,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:44:44,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=932433.3333333334, ans=0.125 2023-10-02 15:44:48,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 15:44:49,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:44:51,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 15:44:51,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:44:52,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:44:55,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 15:44:55,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 15:44:58,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:44:58,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 15:45:05,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:45:08,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:45:08,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:45:14,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:14,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:45:15,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=932633.3333333334, ans=0.125 2023-10-02 15:45:16,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:45:16,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=932633.3333333334, ans=0.1 2023-10-02 15:45:17,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:20,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:45:20,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:45:20,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=932633.3333333334, ans=0.125 2023-10-02 15:45:22,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 15:45:22,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=932633.3333333334, ans=0.1 2023-10-02 15:45:22,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=932633.3333333334, ans=0.125 2023-10-02 15:45:23,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:45:24,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 15:45:26,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:45:28,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:45:29,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:45:29,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=932700.0, ans=0.2 2023-10-02 15:45:29,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=932700.0, ans=0.125 2023-10-02 15:45:32,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:45:32,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=932700.0, ans=0.1 2023-10-02 15:45:33,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 15:45:33,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:35,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:45:38,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:45:39,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:45:41,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:45:43,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 15:45:43,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:45:44,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:45:44,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:45:44,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 15:45:46,223 INFO [train.py:1046] (3/4) Epoch 27, batch 1800, loss[loss=0.1681, simple_loss=0.2485, pruned_loss=0.04382, over 23332.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2431, pruned_loss=0.04484, over 4707371.55 frames. ], batch size: 93, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:45:46,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:45:46,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:45:50,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:45:51,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:45:53,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:45:55,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:45:59,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 15:45:59,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:46:02,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:04,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:46:04,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:46:04,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:46:07,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:46:07,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 15:46:08,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:12,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:12,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=932833.3333333334, ans=0.0 2023-10-02 15:46:14,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=932900.0, ans=0.125 2023-10-02 15:46:15,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 15:46:18,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 15:46:18,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 15:46:20,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:21,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:46:21,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:46:21,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:46:21,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=932900.0, ans=0.0 2023-10-02 15:46:24,869 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.98 vs. limit=15.0 2023-10-02 15:46:28,280 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 15:46:29,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:46:31,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:33,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 15:46:34,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 15:46:34,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:46:34,789 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 15:46:35,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:46:36,533 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.81 vs. limit=12.0 2023-10-02 15:46:37,034 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.417e+02 1.793e+02 1.976e+02 2.310e+02 4.121e+02, threshold=3.953e+02, percent-clipped=0.0 2023-10-02 15:46:37,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:46:41,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 15:46:48,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:46:49,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 15:46:50,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:46:50,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:50,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:46:50,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 15:46:53,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:46:53,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:46:56,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 15:46:56,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:46:57,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:46:57,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:46:57,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:46:58,963 INFO [train.py:1046] (3/4) Epoch 27, batch 1850, loss[loss=0.1975, simple_loss=0.2545, pruned_loss=0.07025, over 19175.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2438, pruned_loss=0.04483, over 4719860.76 frames. ], batch size: 388, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:47:00,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:47:00,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:47:02,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=933100.0, ans=0.125 2023-10-02 15:47:03,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:47:03,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:47:05,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:47:06,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:47:12,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:47:12,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 15:47:12,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=933166.6666666666, ans=0.125 2023-10-02 15:47:17,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 15:47:19,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 15:47:22,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:47:22,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 15:47:22,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 15:47:32,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:47:33,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 15:47:34,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=933233.3333333334, ans=0.125 2023-10-02 15:47:35,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:47:35,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:47:39,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 15:47:40,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:47:41,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 15:47:41,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:47:44,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=933300.0, ans=0.125 2023-10-02 15:47:45,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:47:46,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:47:49,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:47:53,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:47:54,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 15:47:54,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:47:56,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:47:57,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:47:59,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 15:47:59,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:48:03,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:48:03,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:48:03,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 15:48:03,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 15:48:04,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=933366.6666666666, ans=0.125 2023-10-02 15:48:06,013 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 15:48:06,091 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 15:48:08,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:48:09,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:48:09,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:48:09,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:09,498 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 15:48:09,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:48:10,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:12,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:48:13,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=933366.6666666666, ans=0.0 2023-10-02 15:48:14,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:48:15,791 INFO [train.py:1046] (3/4) Epoch 27, batch 1900, loss[loss=0.1925, simple_loss=0.261, pruned_loss=0.06196, over 23842.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2454, pruned_loss=0.04531, over 4722086.56 frames. ], batch size: 212, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:48:15,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:48:15,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 15:48:16,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.29 vs. limit=10.0 2023-10-02 15:48:19,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:19,182 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 15:48:19,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:48:20,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:48:26,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:48:27,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:48:28,826 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 15:48:30,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 15:48:31,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:48:31,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:48:31,661 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 15:48:31,691 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 15:48:35,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 15:48:37,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:48:37,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=933500.0, ans=0.1 2023-10-02 15:48:41,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 15:48:43,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 15:48:54,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 15:48:57,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 15:48:57,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:48:58,486 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 15:48:58,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 15:48:58,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 15:49:00,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 15:49:00,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:03,338 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.59 vs. limit=10.0 2023-10-02 15:49:05,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 15:49:06,699 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.841e+02 1.943e+02 2.107e+02 2.840e+02, threshold=3.886e+02, percent-clipped=0.0 2023-10-02 15:49:06,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:49:10,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:49:10,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 15:49:12,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:49:13,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.69 vs. limit=22.5 2023-10-02 15:49:15,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 15:49:15,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:49:22,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:49:22,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:49:22,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:49:23,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:49:25,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 15:49:25,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 15:49:26,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:49:29,314 INFO [train.py:1046] (3/4) Epoch 27, batch 1950, loss[loss=0.1732, simple_loss=0.2511, pruned_loss=0.04767, over 23611.00 frames. ], tot_loss[loss=0.169, simple_loss=0.2463, pruned_loss=0.04583, over 4719656.27 frames. ], batch size: 93, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:49:30,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:49:30,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:49:33,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:49:33,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:49:33,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 15:49:34,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:49:38,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:49:39,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:49:40,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:49:40,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:49:41,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 15:49:43,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:49:43,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:49:43,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=933833.3333333334, ans=0.125 2023-10-02 15:49:44,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:49:46,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:49:46,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:49:46,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:48,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:49:51,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:49:51,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:49:51,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 15:49:51,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=933833.3333333334, ans=0.125 2023-10-02 15:49:53,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:56,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:49:57,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:49:57,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:49:57,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 15:49:57,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 15:49:59,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 15:49:59,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:49:59,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=933900.0, ans=0.125 2023-10-02 15:50:00,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:50:03,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:50:04,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:50:08,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:50:10,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=933900.0, ans=0.125 2023-10-02 15:50:11,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:50:11,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:50:12,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 15:50:12,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:50:16,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:50:17,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:50:17,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:50:26,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:28,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:29,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:31,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:50:34,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:50:36,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:50:36,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 15:50:36,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 15:50:36,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:50:37,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 15:50:39,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:50:41,906 INFO [train.py:1046] (3/4) Epoch 27, batch 2000, loss[loss=0.1655, simple_loss=0.2506, pruned_loss=0.04016, over 24444.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2469, pruned_loss=0.04617, over 4711128.05 frames. ], batch size: 66, lr: 3.80e-03, grad_scale: 32.0 2023-10-02 15:50:42,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:50:44,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:50:44,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:50:45,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:50:46,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:50:49,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 15:50:49,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 15:50:52,361 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.14 vs. limit=15.0 2023-10-02 15:50:53,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:50:55,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 15:50:57,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 15:50:57,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:51:00,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:51:01,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 15:51:03,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:04,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:04,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:05,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 15:51:05,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 15:51:07,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 15:51:07,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:51:11,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:51:11,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 15:51:13,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:13,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:51:14,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:51:15,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 15:51:17,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 15:51:17,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:51:17,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:24,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:51:25,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:51:25,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:51:25,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=934300.0, ans=0.0 2023-10-02 15:51:26,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:51:28,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:51:28,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:51:28,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:51:28,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:51:28,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=934300.0, ans=0.125 2023-10-02 15:51:30,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:32,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=934300.0, ans=0.125 2023-10-02 15:51:33,411 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.846e+02 2.049e+02 2.214e+02 2.926e+02, threshold=4.097e+02, percent-clipped=0.0 2023-10-02 15:51:33,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:51:33,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 15:51:37,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:51:38,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:44,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:44,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:51:46,260 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.34 vs. limit=10.0 2023-10-02 15:51:48,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:51,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:51:51,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:52,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:51:52,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:51:53,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:51:55,653 INFO [train.py:1046] (3/4) Epoch 27, batch 2050, loss[loss=0.1611, simple_loss=0.2493, pruned_loss=0.03643, over 24582.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2471, pruned_loss=0.04594, over 4714939.86 frames. ], batch size: 71, lr: 3.80e-03, grad_scale: 32.0 2023-10-02 15:51:55,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:51:59,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:52:00,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:52:00,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=934433.3333333334, ans=0.0 2023-10-02 15:52:04,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:52:06,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=934433.3333333334, ans=0.1 2023-10-02 15:52:07,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:52:07,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:52:07,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:52:08,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 15:52:08,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:52:10,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:52:10,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:52:20,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:52:20,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:52:21,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 15:52:23,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:52:23,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=934566.6666666666, ans=0.5 2023-10-02 15:52:25,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 15:52:26,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=934566.6666666666, ans=0.0 2023-10-02 15:52:26,283 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=22.5 2023-10-02 15:52:27,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:52:28,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:52:30,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:52:31,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:52:32,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:52:33,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:52:33,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=934566.6666666666, ans=6.0 2023-10-02 15:52:34,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:52:34,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:52:37,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:52:40,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 15:52:41,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 15:52:43,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:52:43,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=934633.3333333334, ans=0.125 2023-10-02 15:52:47,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:52:52,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:52:53,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 15:52:58,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:52:58,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:53:02,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:53:04,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 15:53:07,152 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 15:53:07,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:08,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:53:08,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=934766.6666666666, ans=0.125 2023-10-02 15:53:09,820 INFO [train.py:1046] (3/4) Epoch 27, batch 2100, loss[loss=0.1724, simple_loss=0.2425, pruned_loss=0.05119, over 23471.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2453, pruned_loss=0.04558, over 4714273.94 frames. ], batch size: 134, lr: 3.80e-03, grad_scale: 32.0 2023-10-02 15:53:09,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:53:11,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:53:11,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 15:53:11,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 15:53:14,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 15:53:17,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:53:19,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:53:19,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=934766.6666666666, ans=0.1 2023-10-02 15:53:20,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:20,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:53:20,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 15:53:20,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 15:53:22,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 15:53:22,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 15:53:24,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:53:26,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:53:26,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 15:53:26,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 15:53:28,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=934833.3333333334, ans=0.0 2023-10-02 15:53:31,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 15:53:31,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 15:53:35,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:53:35,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:53:37,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=934833.3333333334, ans=0.0 2023-10-02 15:53:39,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:53:39,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 15:53:40,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:53:40,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 15:53:42,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 15:53:43,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:43,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 15:53:43,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 15:53:44,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 15:53:48,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:53:49,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:53:51,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:53:52,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 15:53:54,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:53:54,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:53:54,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 15:53:54,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:53:54,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:53:55,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:53:55,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 15:53:56,371 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=12.0 2023-10-02 15:53:57,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 15:53:57,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 15:54:00,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 15:54:02,751 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.820e+02 2.011e+02 2.406e+02 3.767e+02, threshold=4.021e+02, percent-clipped=0.0 2023-10-02 15:54:02,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:54:02,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 15:54:08,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:54:09,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=935033.3333333334, ans=0.0 2023-10-02 15:54:10,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:54:11,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=935033.3333333334, ans=0.125 2023-10-02 15:54:12,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:54:12,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:54:12,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 15:54:12,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:54:13,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:54:13,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:54:17,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:54:17,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:54:19,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 15:54:20,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 15:54:20,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:54:23,078 INFO [train.py:1046] (3/4) Epoch 27, batch 2150, loss[loss=0.1554, simple_loss=0.2421, pruned_loss=0.03433, over 24628.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2441, pruned_loss=0.04502, over 4708420.71 frames. ], batch size: 68, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:54:23,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:54:23,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:54:23,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:54:24,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:54:28,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 15:54:28,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=935100.0, ans=0.125 2023-10-02 15:54:30,035 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.96 vs. limit=15.0 2023-10-02 15:54:30,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:54:32,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:54:34,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:54:34,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:34,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:54:37,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:54:39,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:54:39,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 15:54:42,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.73 vs. limit=15.0 2023-10-02 15:54:43,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:43,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 15:54:44,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=935166.6666666666, ans=0.125 2023-10-02 15:54:47,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:54:49,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:54:51,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:52,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:54:52,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:54:52,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:54:54,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:54:54,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:54:55,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:54:56,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 15:54:58,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 15:54:58,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:55:00,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:55:00,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:55:02,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:55:04,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:55:04,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:55:07,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:55:07,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 15:55:08,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 15:55:09,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:55:11,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:11,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:55:12,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 15:55:14,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:14,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=935300.0, ans=0.125 2023-10-02 15:55:15,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:15,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 15:55:15,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 15:55:16,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 15:55:16,901 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 15:55:16,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:16,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:55:18,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 15:55:18,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:55:18,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 15:55:18,909 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 15:55:18,909 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 15:55:19,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.93 vs. limit=22.5 2023-10-02 15:55:20,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 15:55:21,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:22,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:55:24,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:55:24,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:26,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 15:55:28,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:28,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:35,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:55:35,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 15:55:36,907 INFO [train.py:1046] (3/4) Epoch 27, batch 2200, loss[loss=0.1788, simple_loss=0.2581, pruned_loss=0.04975, over 23503.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2445, pruned_loss=0.04492, over 4712624.74 frames. ], batch size: 93, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:55:40,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:55:44,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=935433.3333333334, ans=0.2 2023-10-02 15:55:45,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:55:46,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:55:46,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:55:46,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 15:55:49,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:55:50,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:55:50,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 15:55:55,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 15:55:57,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 15:56:01,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 15:56:03,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:05,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:56:05,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 15:56:09,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 15:56:09,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 15:56:13,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 15:56:14,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:16,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 15:56:18,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:56:22,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:56:23,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:56:25,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:56:28,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 15:56:28,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:56:29,898 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.796e+02 1.925e+02 2.148e+02 3.063e+02, threshold=3.850e+02, percent-clipped=0.0 2023-10-02 15:56:30,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 15:56:32,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:56:32,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 15:56:32,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:56:35,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 15:56:35,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:56:35,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:56:35,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:56:36,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 15:56:38,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:56:39,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 15:56:42,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 15:56:43,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:56:45,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:56:45,188 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 15:56:47,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:56:49,219 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 15:56:49,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 15:56:49,354 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 15:56:50,515 INFO [train.py:1046] (3/4) Epoch 27, batch 2250, loss[loss=0.166, simple_loss=0.2568, pruned_loss=0.03753, over 24321.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2452, pruned_loss=0.04506, over 4707815.49 frames. ], batch size: 74, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:56:51,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:53,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 15:56:55,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:56:56,541 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 15:56:59,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.97 vs. limit=15.0 2023-10-02 15:56:59,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:57:01,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:57:03,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=935766.6666666666, ans=0.125 2023-10-02 15:57:06,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 15:57:08,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 15:57:10,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:57:12,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:57:13,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 15:57:14,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 15:57:14,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:57:14,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:57:16,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 15:57:17,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:57:17,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:57:19,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 15:57:19,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=935900.0, ans=0.1 2023-10-02 15:57:23,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:57:24,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 15:57:24,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 15:57:25,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=935900.0, ans=0.125 2023-10-02 15:57:28,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 15:57:28,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:57:29,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 15:57:32,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=935900.0, ans=0.0 2023-10-02 15:57:35,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:57:37,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:57:37,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:57:38,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:57:40,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:57:41,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:57:44,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:57:47,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 15:57:50,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 15:57:50,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 15:57:51,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 15:57:57,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 15:58:00,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 15:58:00,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 15:58:00,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:02,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 15:58:04,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 15:58:05,943 INFO [train.py:1046] (3/4) Epoch 27, batch 2300, loss[loss=0.1802, simple_loss=0.2654, pruned_loss=0.04749, over 24431.00 frames. ], tot_loss[loss=0.1685, simple_loss=0.2461, pruned_loss=0.04538, over 4717851.33 frames. ], batch size: 69, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:58:07,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:58:07,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:12,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:14,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 15:58:15,551 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 15:58:15,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:58:16,490 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.20 vs. limit=22.5 2023-10-02 15:58:22,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:58:22,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 15:58:22,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:58:23,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:58:23,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 15:58:25,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 15:58:28,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:58:30,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 15:58:33,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=936233.3333333334, ans=0.125 2023-10-02 15:58:35,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 15:58:38,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 15:58:41,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:58:44,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 15:58:45,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 15:58:47,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 15:58:51,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:58:51,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=936300.0, ans=0.0 2023-10-02 15:58:53,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 15:58:54,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.97 vs. limit=15.0 2023-10-02 15:58:55,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 15:58:55,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 15:58:55,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 15:58:58,044 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.878e+02 2.081e+02 2.351e+02 3.500e+02, threshold=4.161e+02, percent-clipped=0.0 2023-10-02 15:58:58,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 15:58:58,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:59:00,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:59:00,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 15:59:00,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:59:01,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 15:59:01,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 15:59:03,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 15:59:03,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 15:59:03,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:59:04,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 15:59:09,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 15:59:09,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=936366.6666666666, ans=0.1 2023-10-02 15:59:12,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 15:59:14,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 15:59:16,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 15:59:16,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 15:59:17,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 15:59:17,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:59:19,044 INFO [train.py:1046] (3/4) Epoch 27, batch 2350, loss[loss=0.1671, simple_loss=0.2434, pruned_loss=0.04538, over 23369.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2465, pruned_loss=0.04552, over 4717034.81 frames. ], batch size: 119, lr: 3.80e-03, grad_scale: 16.0 2023-10-02 15:59:20,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 15:59:20,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 15:59:24,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 15:59:24,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 15:59:24,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=936433.3333333334, ans=0.125 2023-10-02 15:59:26,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=936433.3333333334, ans=0.0 2023-10-02 15:59:30,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 15:59:34,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 15:59:37,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:59:37,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 15:59:37,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:59:37,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:59:38,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 15:59:42,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 15:59:42,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=936500.0, ans=0.0 2023-10-02 15:59:48,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 15:59:50,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 15:59:53,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 15:59:53,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 15:59:55,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 15:59:56,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 15:59:57,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 15:59:58,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=936566.6666666666, ans=0.125 2023-10-02 15:59:59,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 15:59:59,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:00:01,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:00:04,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:00:06,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 16:00:07,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:00:09,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:00:10,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:00:12,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 16:00:13,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:00:13,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=936633.3333333334, ans=0.0 2023-10-02 16:00:16,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 16:00:16,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:00:20,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 16:00:23,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 16:00:24,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:00:24,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:00:24,758 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 16:00:24,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 16:00:26,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=936700.0, ans=0.0 2023-10-02 16:00:27,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 16:00:29,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:00:32,040 INFO [train.py:1046] (3/4) Epoch 27, batch 2400, loss[loss=0.1635, simple_loss=0.2418, pruned_loss=0.0426, over 24662.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2456, pruned_loss=0.04513, over 4722734.76 frames. ], batch size: 65, lr: 3.79e-03, grad_scale: 32.0 2023-10-02 16:00:33,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:00:38,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:00:38,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:00:38,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 16:00:38,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 16:00:45,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:00:45,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:00:48,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 16:00:48,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:00:48,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:00:50,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 16:00:50,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=936833.3333333334, ans=0.0 2023-10-02 16:00:55,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:00:57,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 16:01:03,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:01:07,104 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-10-02 16:01:07,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 16:01:11,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:01:13,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:01:17,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:01:17,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 16:01:18,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:01:25,753 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.884e+02 2.076e+02 2.344e+02 3.327e+02, threshold=4.151e+02, percent-clipped=0.0 2023-10-02 16:01:25,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:01:27,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:01:30,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:01:30,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:01:30,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:01:31,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:01:32,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:01:33,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:01:33,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:01:36,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:01:36,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:01:36,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 16:01:38,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 16:01:39,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:01:39,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:01:39,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 16:01:41,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 16:01:41,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 16:01:41,437 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 16:01:42,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 16:01:42,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:01:44,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:01:44,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:01:45,655 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 16:01:46,938 INFO [train.py:1046] (3/4) Epoch 27, batch 2450, loss[loss=0.1647, simple_loss=0.2453, pruned_loss=0.04206, over 24373.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.244, pruned_loss=0.04505, over 4719094.56 frames. ], batch size: 77, lr: 3.79e-03, grad_scale: 32.0 2023-10-02 16:01:47,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:01:47,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:01:48,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=937100.0, ans=0.125 2023-10-02 16:01:49,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:01:51,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:01:53,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:01:53,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:01:55,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 16:01:59,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:01:59,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:02:03,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:02:03,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:02:04,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:02:04,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 16:02:08,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:02:09,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:02:10,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:02:13,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:02:13,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:02:15,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:02:15,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:02:16,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 16:02:18,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:02:22,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=937233.3333333334, ans=0.0 2023-10-02 16:02:26,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:02:27,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:02:28,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:02:28,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:02:28,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:02:30,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:02:31,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 16:02:34,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:02:34,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:02:37,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:02:37,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:02:39,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=937300.0, ans=0.125 2023-10-02 16:02:43,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:02:44,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 16:02:46,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:02:47,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:02:48,178 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.22 vs. limit=15.0 2023-10-02 16:02:48,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 16:02:48,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:02:49,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:02:53,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:02:55,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:02:55,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:02:57,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.52 vs. limit=15.0 2023-10-02 16:02:59,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 16:03:00,703 INFO [train.py:1046] (3/4) Epoch 27, batch 2500, loss[loss=0.1479, simple_loss=0.2206, pruned_loss=0.03763, over 23651.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2433, pruned_loss=0.04528, over 4702686.03 frames. ], batch size: 149, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:03:00,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:03:05,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:03:13,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=937433.3333333334, ans=0.1 2023-10-02 16:03:14,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:03:16,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:03:17,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:03:17,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 16:03:25,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:03:25,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:03:28,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 16:03:28,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:03:28,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 16:03:30,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:03:30,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:03:31,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 16:03:31,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:03:31,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 16:03:33,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:03:36,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:03:37,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:03:39,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:03:40,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 16:03:40,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:03:43,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:03:46,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:03:47,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=937633.3333333334, ans=0.1 2023-10-02 16:03:51,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:03:53,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:03:55,146 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.821e+02 2.040e+02 2.416e+02 3.469e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-02 16:03:58,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:04:01,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 16:04:01,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:04:01,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 16:04:04,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:04:04,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:04:05,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 16:04:05,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 16:04:05,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 16:04:06,694 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.51 vs. limit=15.0 2023-10-02 16:04:09,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:04:10,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 16:04:10,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 16:04:11,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:04:13,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 16:04:14,541 INFO [train.py:1046] (3/4) Epoch 27, batch 2550, loss[loss=0.1808, simple_loss=0.2552, pruned_loss=0.05322, over 23505.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2444, pruned_loss=0.04508, over 4710928.99 frames. ], batch size: 285, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:04:16,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 16:04:18,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=937766.6666666666, ans=0.1 2023-10-02 16:04:19,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:04:21,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:04:21,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:04:22,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:04:24,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 16:04:24,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:04:27,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 16:04:28,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:04:30,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:04:33,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:04:34,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 16:04:34,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:04:35,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:04:35,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:04:38,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:04:38,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 16:04:40,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 16:04:40,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:04:40,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 16:04:51,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:04:57,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:04:57,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:04:57,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:04:58,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:05:03,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:05:04,887 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:05:07,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:05:07,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:05:07,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:05:08,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 16:05:08,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:05:12,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=937966.6666666666, ans=0.125 2023-10-02 16:05:13,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:05:14,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:05:19,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:05:19,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 16:05:19,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:05:20,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:05:20,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 16:05:22,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:05:22,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:05:28,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:05:29,769 INFO [train.py:1046] (3/4) Epoch 27, batch 2600, loss[loss=0.1489, simple_loss=0.2296, pruned_loss=0.03408, over 21627.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2448, pruned_loss=0.04504, over 4694670.70 frames. ], batch size: 47, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:05:31,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:05:33,248 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 16:05:35,907 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 16:05:35,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:05:37,182 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 16:05:37,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 16:05:37,283 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 16:05:38,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=938100.0, ans=0.0 2023-10-02 16:05:40,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:05:40,481 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 16:05:41,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 16:05:41,953 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 16:05:44,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:05:46,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 16:05:47,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 16:05:49,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 16:05:49,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 16:05:51,877 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 16:05:51,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 16:05:59,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:01,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:06:01,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:06:01,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 16:06:02,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:06:07,118 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 16:06:13,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:06:13,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:14,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 16:06:15,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:06:15,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:06:15,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 16:06:18,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:06:19,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:06:21,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:06:24,356 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.903e+02 2.122e+02 2.565e+02 3.470e+02, threshold=4.244e+02, percent-clipped=0.0 2023-10-02 16:06:24,514 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 16:06:24,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:06:25,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:06:29,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:06:30,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:06:30,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 16:06:31,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:06:34,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:06:34,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:06:34,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=938366.6666666666, ans=0.1 2023-10-02 16:06:40,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 16:06:41,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:44,166 INFO [train.py:1046] (3/4) Epoch 27, batch 2650, loss[loss=0.1607, simple_loss=0.2428, pruned_loss=0.03932, over 24457.00 frames. ], tot_loss[loss=0.1688, simple_loss=0.246, pruned_loss=0.04579, over 4689289.93 frames. ], batch size: 63, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:06:44,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:06:47,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 16:06:47,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:48,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:06:49,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.whiten.whitening_limit, batch_count=938433.3333333334, ans=15.0 2023-10-02 16:06:50,337 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 16:06:50,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:06:51,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:06:53,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:06:53,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=938433.3333333334, ans=0.1 2023-10-02 16:06:54,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:06:57,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:06:57,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 16:06:57,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:06:59,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:07:00,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=938500.0, ans=0.0 2023-10-02 16:07:02,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 16:07:03,998 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 16:07:06,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:07:10,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 16:07:10,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:10,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 16:07:14,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:07:14,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:07:14,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:07:14,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:14,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=938566.6666666666, ans=0.125 2023-10-02 16:07:19,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 16:07:19,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 16:07:21,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:07:26,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 16:07:26,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:07:26,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:26,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:07:27,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:07:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:07:30,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:07:33,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:07:34,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:07:35,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:07:35,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:07:38,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:38,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:07:39,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:40,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:07:41,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:07:43,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:44,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:07:46,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:07:46,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 16:07:48,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:07:50,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:52,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:07:53,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:07:53,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:07:53,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:07:53,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=938700.0, ans=0.125 2023-10-02 16:07:56,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:07:56,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 16:07:59,335 INFO [train.py:1046] (3/4) Epoch 27, batch 2700, loss[loss=0.2213, simple_loss=0.2872, pruned_loss=0.07774, over 19786.00 frames. ], tot_loss[loss=0.1692, simple_loss=0.2463, pruned_loss=0.04606, over 4694682.13 frames. ], batch size: 388, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:07:59,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:08:00,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 16:08:01,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=938766.6666666666, ans=0.125 2023-10-02 16:08:03,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:08:03,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:03,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:05,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:08:05,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:08:05,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:08:05,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 16:08:05,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 16:08:06,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:08:09,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:08:11,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:08:11,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:08:14,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:08:17,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 16:08:17,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:08:21,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:08:21,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:08:21,404 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:08:26,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=938833.3333333334, ans=0.1 2023-10-02 16:08:27,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:08:27,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:08:27,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:08:27,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:08:31,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:08:35,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:08:35,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:08:35,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:08:37,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=938900.0, ans=0.2 2023-10-02 16:08:39,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:40,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:08:42,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=938966.6666666666, ans=0.2 2023-10-02 16:08:42,799 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.51 vs. limit=15.0 2023-10-02 16:08:47,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:08:47,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:08:49,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.85 vs. limit=10.0 2023-10-02 16:08:51,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:08:51,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:08:53,642 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.852e+02 1.992e+02 2.201e+02 2.706e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-02 16:08:56,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:08:56,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:08:57,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:08:57,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:08:59,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:09:00,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:09:02,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:09:03,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:09:03,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:09:07,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 16:09:08,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:09:11,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:09:11,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 16:09:13,018 INFO [train.py:1046] (3/4) Epoch 27, batch 2750, loss[loss=0.167, simple_loss=0.2494, pruned_loss=0.04226, over 24386.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2466, pruned_loss=0.04606, over 4707438.35 frames. ], batch size: 77, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:09:14,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 16:09:14,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:09:17,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:18,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:09:21,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:21,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:09:21,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:22,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=939100.0, ans=0.0 2023-10-02 16:09:25,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:09:25,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 16:09:27,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:09:27,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:27,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 16:09:27,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:09:27,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:09:32,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 16:09:34,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:09:34,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:34,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:09:35,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:09:35,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:09:36,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:09:36,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:36,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:37,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=939166.6666666666, ans=0.125 2023-10-02 16:09:42,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:09:42,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:09:43,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:09:43,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=939233.3333333334, ans=0.0 2023-10-02 16:09:45,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:45,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:09:49,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:09:52,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:09:52,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:09:58,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:09:58,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:09:58,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:10:02,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=939300.0, ans=0.0 2023-10-02 16:10:04,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:10:04,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=939300.0, ans=0.2 2023-10-02 16:10:05,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:10:05,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 16:10:08,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:10:12,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 16:10:16,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 16:10:18,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=939366.6666666666, ans=0.125 2023-10-02 16:10:19,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:10:19,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 16:10:20,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:10:22,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:10:23,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 16:10:23,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:10:25,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=939433.3333333334, ans=0.0 2023-10-02 16:10:26,194 INFO [train.py:1046] (3/4) Epoch 27, batch 2800, loss[loss=0.1537, simple_loss=0.2422, pruned_loss=0.03253, over 24470.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.245, pruned_loss=0.04533, over 4710137.65 frames. ], batch size: 63, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:10:26,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 16:10:26,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:10:28,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:10:29,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 16:10:29,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:10:29,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:10:31,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:10:31,052 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 16:10:31,052 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 16:10:35,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:10:35,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:10:35,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:10:38,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:10:40,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 16:10:43,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 16:10:44,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 16:10:46,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:10:47,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:10:47,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:10:47,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=939500.0, ans=0.125 2023-10-02 16:10:50,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:10:50,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=939500.0, ans=0.125 2023-10-02 16:10:51,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:10:51,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 16:10:51,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:10:57,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:10:58,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=939566.6666666666, ans=0.0 2023-10-02 16:10:59,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:11:02,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:02,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:11:03,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:11:08,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:11:08,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 16:11:09,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:11:09,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:11:09,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:11:16,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:11:16,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:17,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=939633.3333333334, ans=0.0 2023-10-02 16:11:18,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:11:19,426 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.61 vs. limit=10.0 2023-10-02 16:11:21,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:11:21,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:21,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:11:22,776 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.891e+02 2.118e+02 2.532e+02 5.316e+02, threshold=4.237e+02, percent-clipped=2.0 2023-10-02 16:11:22,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:11:24,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:11:25,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:11:25,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 16:11:25,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:11:26,659 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.26 vs. limit=8.0 2023-10-02 16:11:28,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:11:28,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:11:31,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 16:11:32,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:11:32,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:11:32,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:11:33,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=939700.0, ans=0.0 2023-10-02 16:11:35,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 16:11:40,359 INFO [train.py:1046] (3/4) Epoch 27, batch 2850, loss[loss=0.1698, simple_loss=0.2391, pruned_loss=0.05021, over 23830.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2443, pruned_loss=0.04481, over 4707960.35 frames. ], batch size: 195, lr: 3.79e-03, grad_scale: 16.0 2023-10-02 16:11:41,046 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.44 vs. limit=10.0 2023-10-02 16:11:41,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:11:41,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:11:41,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:11:45,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:11:46,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:11:48,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:11:48,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:11:49,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:11:51,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:11:52,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:11:53,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 16:11:59,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 16:11:59,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:01,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 16:12:01,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:01,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=939833.3333333334, ans=0.04949747468305833 2023-10-02 16:12:04,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 16:12:05,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 16:12:05,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=939833.3333333334, ans=0.1 2023-10-02 16:12:06,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:17,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:12:19,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:12:19,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:12:20,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:12:20,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:12:20,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:12:23,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:12:23,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 16:12:24,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:12:24,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:12:26,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:12:26,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:29,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:12:29,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:12:30,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:32,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:12:35,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:12:35,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:12:35,871 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.20 vs. limit=15.0 2023-10-02 16:12:36,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:38,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:12:43,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:12:44,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 16:12:44,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 16:12:47,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:12:47,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:12:47,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 16:12:49,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:12:49,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:12:50,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:12:50,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:12:50,481 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 16:12:50,526 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 16:12:50,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:12:51,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:12:54,590 INFO [train.py:1046] (3/4) Epoch 27, batch 2900, loss[loss=0.1494, simple_loss=0.2245, pruned_loss=0.03717, over 24499.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.244, pruned_loss=0.04484, over 4705685.23 frames. ], batch size: 58, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:12:56,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 16:12:56,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:12:56,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:12:59,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 16:13:03,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:13:03,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 16:13:04,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 16:13:06,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:13:06,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:13:08,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:13:10,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:13:15,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:13:15,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:13:17,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:13:18,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 16:13:18,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:13:19,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:13:21,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 16:13:22,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 16:13:25,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:13:25,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 16:13:25,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:13:27,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:13:27,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 16:13:31,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:13:31,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:13:36,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:13:39,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:13:43,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 16:13:43,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 16:13:43,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:13:45,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:13:48,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 16:13:49,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:13:50,753 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.42 vs. limit=10.0 2023-10-02 16:13:53,106 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.791e+02 2.001e+02 2.222e+02 3.379e+02, threshold=4.002e+02, percent-clipped=0.0 2023-10-02 16:13:54,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:14:02,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:14:02,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:14:02,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 16:14:03,314 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.27 vs. limit=6.0 2023-10-02 16:14:06,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:06,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 16:14:07,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:14:07,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:14:08,797 INFO [train.py:1046] (3/4) Epoch 27, batch 2950, loss[loss=0.1695, simple_loss=0.2574, pruned_loss=0.04081, over 24463.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2446, pruned_loss=0.04473, over 4702139.85 frames. ], batch size: 69, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:14:12,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.58 vs. limit=15.0 2023-10-02 16:14:13,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:14:16,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 16:14:16,980 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.69 vs. limit=15.0 2023-10-02 16:14:17,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:14:17,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:17,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:14:17,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=940433.3333333334, ans=0.125 2023-10-02 16:14:18,282 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-10-02 16:14:19,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:14:21,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 16:14:21,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 16:14:22,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:14:22,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:14:28,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:14:29,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:14:32,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:14:32,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:14:34,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=940500.0, ans=0.125 2023-10-02 16:14:37,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:14:37,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:14:40,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:40,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=940566.6666666666, ans=0.125 2023-10-02 16:14:41,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:14:41,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:14:42,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 16:14:47,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 16:14:47,530 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 16:14:48,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:14:50,433 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 16:14:52,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 16:14:52,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:14:52,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:14:52,356 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 16:14:53,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:14:56,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 16:14:56,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:14:58,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:15:00,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:15:02,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:15:02,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:15:02,532 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 16:15:03,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:15:03,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 16:15:04,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=940633.3333333334, ans=0.125 2023-10-02 16:15:09,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:15:09,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:15:10,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=940700.0, ans=0.1 2023-10-02 16:15:10,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=940700.0, ans=0.0 2023-10-02 16:15:11,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 16:15:11,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:15:12,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 16:15:15,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:15:17,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:15:17,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:15:18,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:15:20,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 16:15:21,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:15:22,834 INFO [train.py:1046] (3/4) Epoch 27, batch 3000, loss[loss=0.2197, simple_loss=0.2826, pruned_loss=0.07843, over 19487.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2455, pruned_loss=0.04516, over 4701939.86 frames. ], batch size: 388, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:15:22,835 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 16:15:34,589 INFO [train.py:1078] (3/4) Epoch 27, validation: loss=0.3322, simple_loss=0.2706, pruned_loss=0.197, over 1125622.00 frames. 2023-10-02 16:15:34,590 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 16:15:34,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:15:34,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:15:34,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:15:34,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:15:36,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:15:37,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:15:37,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 16:15:40,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:15:41,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:15:43,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:15:46,691 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 16:15:46,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 16:15:48,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:15:49,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:15:49,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 16:15:49,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:15:52,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=940833.3333333334, ans=0.0 2023-10-02 16:15:57,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:16:06,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:16:10,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 16:16:12,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:16:15,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:16:15,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:16:15,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:16:17,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:16:17,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 16:16:20,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 16:16:20,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:16:21,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:16:23,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:16:23,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:16:24,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:24,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:16:29,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:16:29,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:16:29,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:16:30,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:16:31,979 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.934e+02 2.111e+02 2.505e+02 3.384e+02, threshold=4.221e+02, percent-clipped=0.0 2023-10-02 16:16:33,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 16:16:34,335 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.71 vs. limit=12.0 2023-10-02 16:16:34,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:16:34,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:16:34,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:16:38,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:39,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:39,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 16:16:39,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 16:16:39,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:16:40,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=941033.3333333334, ans=0.125 2023-10-02 16:16:41,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 16:16:42,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:16:43,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=941033.3333333334, ans=0.0 2023-10-02 16:16:45,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 16:16:46,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:16:48,193 INFO [train.py:1046] (3/4) Epoch 27, batch 3050, loss[loss=0.1552, simple_loss=0.2361, pruned_loss=0.03715, over 24623.00 frames. ], tot_loss[loss=0.1691, simple_loss=0.2465, pruned_loss=0.04581, over 4697514.52 frames. ], batch size: 60, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:16:48,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:16:49,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 16:16:49,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 16:16:49,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 16:16:51,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:16:52,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:16:52,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:16:52,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:16:52,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:16:53,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 16:16:57,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:17:00,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:17:00,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:17:04,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:06,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 16:17:08,450 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.02 vs. limit=15.0 2023-10-02 16:17:13,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 16:17:14,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 16:17:15,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:16,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:17:19,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:19,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:17:19,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:17:22,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:17:23,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:17:23,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:17:23,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:17:23,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:17:25,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:26,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:28,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:17:28,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=941233.3333333334, ans=0.125 2023-10-02 16:17:30,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 16:17:30,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:17:30,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:17:33,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:17:33,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:17:34,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:17:35,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:17:40,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:17:42,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:17:45,008 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.53 vs. limit=15.0 2023-10-02 16:17:48,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:48,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:17:49,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:17:51,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:17:52,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:17:52,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:17:53,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 16:17:55,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:17:55,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:17:56,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 16:17:58,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:18:00,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.80 vs. limit=22.5 2023-10-02 16:18:02,757 INFO [train.py:1046] (3/4) Epoch 27, batch 3100, loss[loss=0.152, simple_loss=0.2433, pruned_loss=0.03038, over 24517.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2461, pruned_loss=0.04554, over 4701331.88 frames. ], batch size: 71, lr: 3.79e-03, grad_scale: 8.0 2023-10-02 16:18:04,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:18:05,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:18:07,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:18:07,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=941433.3333333334, ans=0.2 2023-10-02 16:18:07,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=941433.3333333334, ans=0.125 2023-10-02 16:18:09,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 16:18:09,502 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.58 vs. limit=15.0 2023-10-02 16:18:13,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 16:18:13,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 16:18:14,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:18:17,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:18:17,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:20,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 16:18:22,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=941500.0, ans=0.125 2023-10-02 16:18:24,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:28,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 16:18:34,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:18:34,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:34,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:18:35,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:18:36,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 16:18:38,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:18:38,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 16:18:38,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:18:38,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:41,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 16:18:41,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:18:44,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:18:45,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 16:18:47,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 16:18:48,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:48,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:18:50,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:18:50,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:51,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:18:54,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:18:54,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:18:54,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:18:55,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:18:55,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:18:55,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:18:59,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:19:00,437 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.900e+02 2.040e+02 2.286e+02 3.067e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-02 16:19:00,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 16:19:03,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:19:03,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 16:19:03,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=941700.0, ans=0.1 2023-10-02 16:19:04,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:04,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:19:04,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 16:19:16,595 INFO [train.py:1046] (3/4) Epoch 27, batch 3150, loss[loss=0.1728, simple_loss=0.2473, pruned_loss=0.04912, over 23429.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2445, pruned_loss=0.04509, over 4681155.86 frames. ], batch size: 93, lr: 3.78e-03, grad_scale: 8.0 2023-10-02 16:19:16,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 16:19:18,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:18,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:19:20,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:19:20,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:19:20,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 16:19:22,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:22,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 16:19:23,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 16:19:24,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:26,838 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 16:19:30,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 16:19:30,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:19:31,483 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 16:19:32,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 16:19:34,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 16:19:36,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 16:19:36,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 16:19:36,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:36,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:19:37,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:19:37,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=941833.3333333334, ans=0.125 2023-10-02 16:19:38,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 16:19:39,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=941833.3333333334, ans=0.5 2023-10-02 16:19:40,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:41,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:19:42,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:19:44,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 16:19:46,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=941900.0, ans=0.125 2023-10-02 16:19:47,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 16:19:47,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:19:50,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:19:50,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:19:51,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 16:19:53,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 16:19:54,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:19:55,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:19:55,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:19:57,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:19:57,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:20:00,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:20:00,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:20:00,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 16:20:02,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:20:02,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:02,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=941966.6666666666, ans=0.1 2023-10-02 16:20:04,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:20:04,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:20:05,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 16:20:05,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:20:08,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 16:20:08,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:09,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 16:20:09,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 16:20:11,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:20:11,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:20:11,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 16:20:13,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 16:20:13,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:20:15,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:20:16,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:16,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:20:22,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:20:23,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:25,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 16:20:30,374 INFO [train.py:1046] (3/4) Epoch 27, batch 3200, loss[loss=0.1675, simple_loss=0.2454, pruned_loss=0.0448, over 24667.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2436, pruned_loss=0.0448, over 4687365.43 frames. ], batch size: 65, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:20:30,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:20:30,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 16:20:33,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:35,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:20:35,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 16:20:36,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:20:40,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:20:44,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:20:53,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:21:02,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 16:21:02,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=942233.3333333334, ans=0.125 2023-10-02 16:21:03,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:21:05,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 16:21:06,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:21:09,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:21:10,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:21:12,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:21:15,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 16:21:16,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 16:21:19,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 16:21:20,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 16:21:25,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:21:25,760 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.28 vs. limit=22.5 2023-10-02 16:21:26,755 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.873e+02 2.063e+02 2.304e+02 3.218e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-02 16:21:30,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:21:31,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:21:31,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:21:31,732 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 16:21:31,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:21:35,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:21:36,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=942366.6666666666, ans=0.125 2023-10-02 16:21:39,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 16:21:39,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 16:21:39,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 16:21:40,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 16:21:43,734 INFO [train.py:1046] (3/4) Epoch 27, batch 3250, loss[loss=0.1744, simple_loss=0.2564, pruned_loss=0.04615, over 24426.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2437, pruned_loss=0.04472, over 4698487.79 frames. ], batch size: 63, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:21:43,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:21:45,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:21:45,342 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 16:21:45,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=942433.3333333334, ans=0.125 2023-10-02 16:21:46,209 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.17 vs. limit=12.0 2023-10-02 16:21:46,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:21:46,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:21:48,107 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 16:21:48,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=942433.3333333334, ans=0.125 2023-10-02 16:21:50,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=942433.3333333334, ans=0.125 2023-10-02 16:21:53,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:21:56,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:21:56,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=942500.0, ans=0.125 2023-10-02 16:22:00,000 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.18 vs. limit=10.0 2023-10-02 16:22:02,735 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:22:03,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:22:03,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 16:22:04,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=942500.0, ans=0.125 2023-10-02 16:22:05,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:22:05,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:22:05,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:22:07,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:22:07,708 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.19 vs. limit=10.0 2023-10-02 16:22:08,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:22:09,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:10,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:22:10,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=942500.0, ans=0.1 2023-10-02 16:22:11,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:22:11,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:11,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:11,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:22:15,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:22:16,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:22:17,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:22:17,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:22:20,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:22:20,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:22:20,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:22:23,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=942566.6666666666, ans=0.125 2023-10-02 16:22:24,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 16:22:26,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:22:26,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:22:26,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:22:27,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:22:33,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:22:39,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:22:41,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:22:41,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 16:22:41,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:22:41,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 16:22:41,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:22:41,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=942700.0, ans=0.0 2023-10-02 16:22:44,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 16:22:44,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 16:22:44,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:22:46,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:22:47,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:22:49,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 16:22:49,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:22:53,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:22:53,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:22:53,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 16:22:54,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:22:55,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:22:55,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 16:22:57,202 INFO [train.py:1046] (3/4) Epoch 27, batch 3300, loss[loss=0.1841, simple_loss=0.2606, pruned_loss=0.05382, over 23318.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2446, pruned_loss=0.04474, over 4712864.94 frames. ], batch size: 105, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:22:57,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:22:58,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 16:23:00,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 16:23:01,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 16:23:01,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:23:06,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:23:06,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:23:06,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:07,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 16:23:07,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:23:10,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:12,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:23:15,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 16:23:16,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:23:16,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=942833.3333333334, ans=0.125 2023-10-02 16:23:17,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:18,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:19,929 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 16:23:20,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:23:21,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:23:22,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:23:22,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:23:22,762 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 16:23:24,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=942833.3333333334, ans=0.125 2023-10-02 16:23:26,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:23:26,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:23:29,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:29,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 16:23:30,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=942900.0, ans=0.125 2023-10-02 16:23:31,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 16:23:31,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:32,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:23:34,459 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 16:23:35,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 16:23:37,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:23:38,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 16:23:40,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=942966.6666666666, ans=0.125 2023-10-02 16:23:41,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:23:44,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 16:23:45,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:23:47,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:23:48,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:48,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:23:48,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:23:50,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:23:50,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:23:51,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:23:51,895 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 16:23:53,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 16:23:53,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=942966.6666666666, ans=0.125 2023-10-02 16:23:54,394 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.925e+02 2.169e+02 2.567e+02 3.728e+02, threshold=4.338e+02, percent-clipped=0.0 2023-10-02 16:23:54,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:23:55,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:23:55,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:23:57,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:23:57,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:23:58,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:23:58,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:23:58,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:23:59,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=943033.3333333334, ans=0.1 2023-10-02 16:24:00,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:24:03,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:24:05,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 16:24:06,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:07,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:08,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:24:08,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:24:10,175 INFO [train.py:1046] (3/4) Epoch 27, batch 3350, loss[loss=0.1758, simple_loss=0.2565, pruned_loss=0.04753, over 23425.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2457, pruned_loss=0.04507, over 4715709.93 frames. ], batch size: 93, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:24:10,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:24:11,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:24:11,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:13,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=943100.0, ans=0.07 2023-10-02 16:24:14,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:24:15,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=943100.0, ans=0.1 2023-10-02 16:24:16,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:18,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:24:21,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:23,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:24:23,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:24:25,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:24:26,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 16:24:27,935 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 16:24:27,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:24:30,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 16:24:30,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 16:24:32,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:24:32,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:24:35,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:24:35,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 16:24:35,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:35,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:24:38,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:40,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:41,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:43,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:24:45,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:24:47,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:49,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:24:52,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:24:52,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:24:55,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:24:56,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:24:57,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:24:58,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=943300.0, ans=0.0 2023-10-02 16:25:00,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 16:25:00,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:25:02,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 16:25:02,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:25:02,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 16:25:03,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:25:04,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:25:11,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:25:11,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 16:25:12,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:25:14,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:25:15,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:25:18,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:25:19,562 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.95 vs. limit=15.0 2023-10-02 16:25:20,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 16:25:21,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:25:21,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:25:23,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:25:23,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 16:25:24,641 INFO [train.py:1046] (3/4) Epoch 27, batch 3400, loss[loss=0.1663, simple_loss=0.2382, pruned_loss=0.04721, over 22547.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2456, pruned_loss=0.04484, over 4721596.30 frames. ], batch size: 322, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:25:24,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:25:24,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 16:25:26,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:25:26,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:25:27,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:25:28,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:25:28,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 16:25:32,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 16:25:32,835 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 16:25:32,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:25:37,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:25:37,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:25:37,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:25:40,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:25:45,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:25:49,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 16:25:52,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:25:55,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:25:55,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:25:56,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 16:25:57,652 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.00 vs. limit=15.0 2023-10-02 16:26:02,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:26:05,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 16:26:12,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:26:14,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:26:14,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 16:26:15,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:26:15,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:26:15,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:26:15,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:26:19,263 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.10 vs. limit=15.0 2023-10-02 16:26:20,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:26:21,336 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.815e+02 1.942e+02 2.172e+02 3.090e+02, threshold=3.885e+02, percent-clipped=0.0 2023-10-02 16:26:23,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:26:23,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:26:27,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:26:30,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 16:26:34,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:26:37,409 INFO [train.py:1046] (3/4) Epoch 27, batch 3450, loss[loss=0.1355, simple_loss=0.2126, pruned_loss=0.02922, over 24312.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2457, pruned_loss=0.0447, over 4729815.51 frames. ], batch size: 56, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:26:39,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 16:26:42,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 16:26:42,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:26:42,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=943766.6666666666, ans=0.0 2023-10-02 16:26:44,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:26:44,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 16:26:45,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:26:50,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:26:52,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=943833.3333333334, ans=0.125 2023-10-02 16:26:55,145 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:26:56,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:26:56,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:26:56,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:26:57,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:26:59,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:27:04,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 16:27:10,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 16:27:10,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:27:10,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=943900.0, ans=0.07 2023-10-02 16:27:11,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:27:12,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:27:18,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 16:27:18,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:27:21,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:27:21,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:27:21,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=943966.6666666666, ans=0.0 2023-10-02 16:27:23,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:27:25,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:27:26,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 16:27:26,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:27:27,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:27:30,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:27:33,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 16:27:36,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:27:41,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:27:42,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:27:46,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:27:50,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:27:50,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:27:50,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:27:51,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:27:53,142 INFO [train.py:1046] (3/4) Epoch 27, batch 3500, loss[loss=0.1626, simple_loss=0.2473, pruned_loss=0.039, over 24474.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2443, pruned_loss=0.04435, over 4732296.75 frames. ], batch size: 66, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:27:53,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:27:56,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:27:57,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 16:27:58,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=944100.0, ans=0.125 2023-10-02 16:27:59,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:28:01,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 16:28:04,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:28:05,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 16:28:09,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:28:11,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:28:12,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:28:12,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:28:13,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:28:13,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:13,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:28:15,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 16:28:16,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=944166.6666666666, ans=0.0 2023-10-02 16:28:18,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:18,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 16:28:18,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.22 vs. limit=22.5 2023-10-02 16:28:19,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:28:21,232 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:28:22,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:22,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 16:28:22,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:28:25,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:28:25,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=944233.3333333334, ans=0.0 2023-10-02 16:28:27,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:28:28,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:29,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:28:29,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:28:32,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 16:28:34,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 16:28:34,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 16:28:35,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:28:37,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:38,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:28:38,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:28:41,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 16:28:42,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:28:44,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=944300.0, ans=0.125 2023-10-02 16:28:46,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:28:48,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 16:28:48,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 16:28:48,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:28:50,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:28:52,057 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.840e+02 2.099e+02 2.420e+02 3.438e+02, threshold=4.198e+02, percent-clipped=0.0 2023-10-02 16:28:52,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:28:53,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:28:55,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 16:28:55,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:28:56,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:28:57,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 16:28:59,080 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.97 vs. limit=22.5 2023-10-02 16:28:59,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 16:29:01,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:29:02,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:29:02,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:02,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:05,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:29:07,064 INFO [train.py:1046] (3/4) Epoch 27, batch 3550, loss[loss=0.1522, simple_loss=0.2305, pruned_loss=0.03692, over 24324.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2436, pruned_loss=0.04427, over 4711595.20 frames. ], batch size: 56, lr: 3.78e-03, grad_scale: 8.0 2023-10-02 16:29:13,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:15,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 16:29:17,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:29:19,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:29:20,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:22,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:29:22,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:29:25,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:29:25,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:29:26,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:26,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 16:29:26,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:29:28,865 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.98 vs. limit=15.0 2023-10-02 16:29:32,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:29:32,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:29:35,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:29:35,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:29:35,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:29:35,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 16:29:35,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:36,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:29:38,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 16:29:42,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:29:42,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:29:43,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=944566.6666666666, ans=0.0 2023-10-02 16:29:44,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:29:44,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=944566.6666666666, ans=0.0 2023-10-02 16:29:46,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 16:29:46,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:29:47,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 16:29:48,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:29:50,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:29:50,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:29:51,402 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.95 vs. limit=15.0 2023-10-02 16:29:54,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 16:29:55,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:29:56,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=944633.3333333334, ans=0.125 2023-10-02 16:30:02,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:30:02,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 16:30:03,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:30:06,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:30:07,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 16:30:12,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=944700.0, ans=0.0 2023-10-02 16:30:14,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 16:30:14,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:30:15,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:30:18,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:30:18,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:30:18,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:30:21,528 INFO [train.py:1046] (3/4) Epoch 27, batch 3600, loss[loss=0.1752, simple_loss=0.2641, pruned_loss=0.04313, over 24008.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2441, pruned_loss=0.04427, over 4697614.58 frames. ], batch size: 80, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:30:22,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:30:25,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:26,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:30:26,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:30:27,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:27,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 16:30:31,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:30:31,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:35,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:30:38,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:30:38,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:30:39,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:30:39,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 16:30:39,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:30:39,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=944833.3333333334, ans=0.015 2023-10-02 16:30:43,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:30:44,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:30:45,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:30:48,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:30:49,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:30:50,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 16:30:54,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=944900.0, ans=0.07 2023-10-02 16:30:56,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:30:57,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:30:58,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 16:31:00,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:31:00,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=944900.0, ans=0.0 2023-10-02 16:31:06,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:31:08,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:31:10,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=944966.6666666666, ans=0.025 2023-10-02 16:31:14,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:31:14,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:31:14,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 16:31:15,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 16:31:17,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 16:31:19,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:31:19,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:31:20,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 16:31:22,203 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.857e+02 2.015e+02 2.444e+02 4.517e+02, threshold=4.030e+02, percent-clipped=2.0 2023-10-02 16:31:22,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:31:22,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:31:22,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:31:23,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 16:31:25,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 16:31:28,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:31:30,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 16:31:36,509 INFO [train.py:1046] (3/4) Epoch 27, batch 3650, loss[loss=0.1621, simple_loss=0.2334, pruned_loss=0.04537, over 23618.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2448, pruned_loss=0.04484, over 4694119.77 frames. ], batch size: 256, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:31:36,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 16:31:36,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:31:39,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 16:31:39,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=945100.0, ans=0.125 2023-10-02 16:31:42,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 16:31:44,989 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.16 vs. limit=22.5 2023-10-02 16:31:46,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:31:46,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:31:46,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:31:49,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 16:31:50,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=945166.6666666666, ans=0.125 2023-10-02 16:31:51,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:31:52,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 16:31:52,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:31:52,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=945166.6666666666, ans=0.1 2023-10-02 16:31:54,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:31:54,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 16:31:56,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:31:56,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:31:56,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:31:57,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:32:00,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 16:32:00,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 16:32:02,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:32:02,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=945166.6666666666, ans=0.07 2023-10-02 16:32:03,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 16:32:06,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:32:06,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:32:11,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:32:13,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:32:13,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:32:15,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:32:16,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:32:18,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:32:21,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:32:22,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:32:22,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:32:24,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=945300.0, ans=0.125 2023-10-02 16:32:25,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:32:25,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:32:25,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:32:33,425 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 16:32:34,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=945366.6666666666, ans=0.125 2023-10-02 16:32:36,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:32:36,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:32:37,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:32:37,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:32:39,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:32:39,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=945366.6666666666, ans=0.125 2023-10-02 16:32:41,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:32:42,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 16:32:42,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:32:45,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:32:48,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:32:49,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:32:50,911 INFO [train.py:1046] (3/4) Epoch 27, batch 3700, loss[loss=0.1591, simple_loss=0.238, pruned_loss=0.04009, over 23492.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2445, pruned_loss=0.04437, over 4712382.44 frames. ], batch size: 119, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:32:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:32:52,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 16:32:52,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:32:52,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=945433.3333333334, ans=0.0 2023-10-02 16:32:53,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:32:54,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=945433.3333333334, ans=0.0 2023-10-02 16:32:55,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:32:58,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:32:58,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=945433.3333333334, ans=0.0 2023-10-02 16:33:01,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:33:01,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:33:03,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:33:03,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:33:04,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:33:04,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:33:06,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 16:33:15,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:33:15,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:33:17,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:33:19,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 16:33:19,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:33:22,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:33:23,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 16:33:24,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:33:26,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:33:26,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=945566.6666666666, ans=0.2 2023-10-02 16:33:28,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:33:28,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:33:29,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 16:33:35,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:33:35,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 16:33:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:33:37,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 16:33:40,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:33:41,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:33:42,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:33:44,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 16:33:44,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=945633.3333333334, ans=15.0 2023-10-02 16:33:45,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:33:45,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:33:45,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:33:47,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:33:50,529 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.788e+02 1.952e+02 2.175e+02 3.581e+02, threshold=3.904e+02, percent-clipped=0.0 2023-10-02 16:33:50,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:33:50,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 16:33:52,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 16:33:53,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:33:53,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:33:56,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:33:57,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:33:58,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:34:00,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:34:02,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:34:03,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 16:34:04,687 INFO [train.py:1046] (3/4) Epoch 27, batch 3750, loss[loss=0.1732, simple_loss=0.2671, pruned_loss=0.03968, over 24308.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2453, pruned_loss=0.04419, over 4727413.54 frames. ], batch size: 74, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:34:04,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 16:34:06,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=945766.6666666666, ans=0.2 2023-10-02 16:34:08,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 16:34:08,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=945766.6666666666, ans=0.1 2023-10-02 16:34:09,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 16:34:10,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:34:10,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=945766.6666666666, ans=0.2 2023-10-02 16:34:12,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:34:13,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:34:14,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:34:17,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:34:22,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:34:22,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:34:25,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:34:25,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=945833.3333333334, ans=0.0 2023-10-02 16:34:28,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:34:29,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 16:34:29,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:34:31,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:34:31,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:34:33,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=945900.0, ans=0.025 2023-10-02 16:34:35,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 16:34:38,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 16:34:39,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:34:41,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:34:44,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:34:49,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:34:50,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 16:34:53,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 16:34:56,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:34:59,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:34:59,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:35:02,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:35:02,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=946033.3333333334, ans=0.025 2023-10-02 16:35:06,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 16:35:08,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:35:10,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:35:12,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:35:14,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:35:15,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=946033.3333333334, ans=0.0 2023-10-02 16:35:18,931 INFO [train.py:1046] (3/4) Epoch 27, batch 3800, loss[loss=0.1738, simple_loss=0.2357, pruned_loss=0.05601, over 23785.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2449, pruned_loss=0.04401, over 4737904.28 frames. ], batch size: 179, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:35:20,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:35:24,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:35:26,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 16:35:26,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 16:35:28,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:35:30,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:35:31,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 16:35:31,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 16:35:31,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:35:33,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:35:34,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:35:36,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:35:36,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:35:36,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 16:35:39,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 16:35:40,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:35:43,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:35:46,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:35:46,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:35:49,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:35:49,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:35:50,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:35:52,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:35:54,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=946233.3333333334, ans=0.0 2023-10-02 16:35:55,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:35:55,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 16:35:57,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:36:02,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:36:04,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=946300.0, ans=0.125 2023-10-02 16:36:10,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:36:11,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 16:36:13,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 16:36:15,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:36:16,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:36:17,637 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.874e+02 2.041e+02 2.512e+02 4.276e+02, threshold=4.082e+02, percent-clipped=2.0 2023-10-02 16:36:17,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:19,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 16:36:23,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 16:36:23,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 16:36:23,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:25,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:36:29,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:36:29,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:36:30,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=946366.6666666666, ans=0.125 2023-10-02 16:36:31,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=946433.3333333334, ans=0.125 2023-10-02 16:36:32,742 INFO [train.py:1046] (3/4) Epoch 27, batch 3850, loss[loss=0.1629, simple_loss=0.2186, pruned_loss=0.05361, over 22620.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2435, pruned_loss=0.04385, over 4739101.98 frames. ], batch size: 322, lr: 3.78e-03, grad_scale: 16.0 2023-10-02 16:36:32,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:36:34,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 16:36:35,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:36:35,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=946433.3333333334, ans=0.0 2023-10-02 16:36:37,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:39,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=946433.3333333334, ans=0.2 2023-10-02 16:36:40,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:36:43,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:36:45,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 16:36:46,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 16:36:51,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:36:54,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:36:55,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:36:55,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:36:58,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:36:58,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:37:00,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:00,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:37:01,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:01,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=946566.6666666666, ans=0.0 2023-10-02 16:37:04,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:05,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:05,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:37:06,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 16:37:06,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 16:37:07,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:37:08,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:10,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:11,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:11,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 16:37:15,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 16:37:15,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:17,968 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.38 vs. limit=15.0 2023-10-02 16:37:18,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 16:37:20,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 16:37:25,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:25,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:37:30,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:31,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 16:37:31,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=946700.0, ans=0.125 2023-10-02 16:37:32,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 16:37:35,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:35,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:39,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:37:39,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:37:40,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:41,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:41,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:37:41,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 16:37:43,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:37:43,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 16:37:45,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:45,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:47,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:37:47,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:48,817 INFO [train.py:1046] (3/4) Epoch 27, batch 3900, loss[loss=0.1443, simple_loss=0.2259, pruned_loss=0.03132, over 24334.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2429, pruned_loss=0.04361, over 4733780.12 frames. ], batch size: 61, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:37:48,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:37:48,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:37:48,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:37:49,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:37:50,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 16:37:50,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:54,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:37:54,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:37:54,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:37:56,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:37:57,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:37:57,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:37:58,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:38:00,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 16:38:00,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:38:01,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 16:38:01,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:38:03,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 16:38:04,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 16:38:09,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:38:09,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:38:10,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:38:10,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:38:15,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:38:18,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:38:20,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:38:21,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:38:22,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:38:26,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=946900.0, ans=0.125 2023-10-02 16:38:28,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:38:28,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:38:32,268 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.35 vs. limit=15.0 2023-10-02 16:38:34,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=946966.6666666666, ans=0.1 2023-10-02 16:38:37,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 16:38:38,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:38:47,389 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.883e+02 2.029e+02 2.294e+02 3.792e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 16:38:47,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:38:49,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=947033.3333333334, ans=0.0 2023-10-02 16:38:50,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:38:50,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=947033.3333333334, ans=0.125 2023-10-02 16:38:52,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 16:38:52,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 16:38:52,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:38:54,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 16:38:56,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:38:58,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 16:39:02,575 INFO [train.py:1046] (3/4) Epoch 27, batch 3950, loss[loss=0.1692, simple_loss=0.2581, pruned_loss=0.0401, over 24566.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2435, pruned_loss=0.04322, over 4750193.00 frames. ], batch size: 71, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:39:02,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:39:03,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 16:39:04,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:39:05,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=947100.0, ans=0.05 2023-10-02 16:39:07,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:39:09,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:39:11,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=947100.0, ans=0.0 2023-10-02 16:39:13,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1.whitening_limit, batch_count=947100.0, ans=10.0 2023-10-02 16:39:15,301 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 16:39:15,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:39:15,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 16:39:16,717 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 16:39:16,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:39:19,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:39:21,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:39:21,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:39:21,999 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.72 vs. limit=15.0 2023-10-02 16:39:24,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 16:39:28,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:39:28,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:39:29,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:39:31,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:39:31,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 16:39:31,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=947233.3333333334, ans=0.125 2023-10-02 16:39:32,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=947233.3333333334, ans=0.0 2023-10-02 16:39:40,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:39:40,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:39:45,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 16:39:45,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=947300.0, ans=0.125 2023-10-02 16:39:51,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 16:39:51,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 16:39:51,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:39:53,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:40:02,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:40:02,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:40:03,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:40:03,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:40:03,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 16:40:09,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:40:09,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:40:13,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 16:40:16,474 INFO [train.py:1046] (3/4) Epoch 27, batch 4000, loss[loss=0.1749, simple_loss=0.2495, pruned_loss=0.05016, over 23659.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2442, pruned_loss=0.04329, over 4757988.97 frames. ], batch size: 232, lr: 3.77e-03, grad_scale: 32.0 2023-10-02 16:40:18,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=947433.3333333334, ans=0.125 2023-10-02 16:40:23,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:40:31,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:40:37,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:40:37,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:40:38,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:40:38,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 16:40:40,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:40:40,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 16:40:40,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:40:40,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 16:40:41,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:40:43,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=947500.0, ans=0.125 2023-10-02 16:40:43,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=947500.0, ans=10.0 2023-10-02 16:40:44,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:40:44,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:40:44,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:40:45,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:40:45,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 16:40:46,491 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.00 vs. limit=12.0 2023-10-02 16:40:47,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:40:48,932 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 16:40:49,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:40:50,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:40:51,916 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 16:40:53,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:40:53,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:40:54,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=947566.6666666666, ans=0.0 2023-10-02 16:40:56,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=947566.6666666666, ans=0.125 2023-10-02 16:40:58,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=947566.6666666666, ans=0.125 2023-10-02 16:40:58,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=947566.6666666666, ans=0.125 2023-10-02 16:41:00,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 16:41:00,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:41:03,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:41:03,976 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 16:41:06,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:41:07,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 16:41:07,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:41:08,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:41:08,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:41:11,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:41:11,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:41:11,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:41:14,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 16:41:14,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:41:15,414 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.843e+02 2.082e+02 2.345e+02 3.565e+02, threshold=4.163e+02, percent-clipped=0.0 2023-10-02 16:41:15,551 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 16:41:18,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=947700.0, ans=0.05 2023-10-02 16:41:19,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:41:19,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=947700.0, ans=0.04949747468305833 2023-10-02 16:41:22,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 16:41:24,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:41:24,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:41:25,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:41:27,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:41:30,557 INFO [train.py:1046] (3/4) Epoch 27, batch 4050, loss[loss=0.1717, simple_loss=0.253, pruned_loss=0.04525, over 24474.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2446, pruned_loss=0.0433, over 4758155.63 frames. ], batch size: 63, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:41:32,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:41:35,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 16:41:35,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 16:41:36,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:41:36,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:41:38,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:41:39,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:41:41,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:41:44,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:41:46,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:41:46,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 16:41:49,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:41:49,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:41:52,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:41:55,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:41:58,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 16:42:01,088 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.78 vs. limit=10.0 2023-10-02 16:42:01,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 16:42:01,649 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 16:42:03,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:42:10,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 16:42:12,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:42:14,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:42:15,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=947966.6666666666, ans=0.0 2023-10-02 16:42:16,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=947966.6666666666, ans=0.0 2023-10-02 16:42:17,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:42:17,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:42:17,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:42:17,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=947966.6666666666, ans=0.125 2023-10-02 16:42:21,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:42:26,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 16:42:26,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:42:26,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:42:29,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 16:42:31,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=948033.3333333334, ans=0.1 2023-10-02 16:42:33,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:42:39,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 16:42:42,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:42:42,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:42:42,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 16:42:42,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 16:42:42,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:42:45,020 INFO [train.py:1046] (3/4) Epoch 27, batch 4100, loss[loss=0.1576, simple_loss=0.233, pruned_loss=0.04111, over 23505.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2448, pruned_loss=0.04385, over 4757498.53 frames. ], batch size: 120, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:42:45,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:42:45,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:42:45,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:42:51,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 16:42:54,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 16:42:56,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 16:42:57,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 16:42:57,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:42:58,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:42:59,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:42:59,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:42:59,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=948166.6666666666, ans=15.0 2023-10-02 16:43:00,761 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 16:43:04,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:43:05,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:43:05,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:43:06,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:43:07,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=948166.6666666666, ans=0.2 2023-10-02 16:43:10,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:43:11,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:43:11,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:43:11,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 16:43:13,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:43:13,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:43:13,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:43:13,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:43:13,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 16:43:17,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:43:18,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 16:43:18,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:43:21,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:43:21,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 16:43:22,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:43:22,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:43:22,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:43:24,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 16:43:25,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:43:27,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:43:28,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 16:43:28,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:43:29,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:43:33,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:43:39,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:43:42,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:43:42,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:43:44,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=948366.6666666666, ans=0.0 2023-10-02 16:43:45,429 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.836e+02 1.976e+02 2.197e+02 2.879e+02, threshold=3.952e+02, percent-clipped=0.0 2023-10-02 16:43:45,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=948366.6666666666, ans=0.2 2023-10-02 16:43:48,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:43:48,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:43:51,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:43:53,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:43:56,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:43:58,455 INFO [train.py:1046] (3/4) Epoch 27, batch 4150, loss[loss=0.1675, simple_loss=0.2354, pruned_loss=0.04976, over 23521.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2445, pruned_loss=0.04428, over 4743799.74 frames. ], batch size: 134, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:43:58,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:43:59,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:43:59,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:44:03,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 16:44:03,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:44:04,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 16:44:04,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 16:44:04,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 16:44:07,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:44:11,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=948433.3333333334, ans=0.125 2023-10-02 16:44:13,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:44:13,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:44:17,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:44:19,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:44:19,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:44:21,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 16:44:21,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:44:23,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 16:44:26,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:44:27,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=948566.6666666666, ans=0.125 2023-10-02 16:44:29,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:44:31,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 16:44:34,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 16:44:34,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:44:36,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 16:44:36,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:44:36,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:44:37,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:44:37,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:44:43,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 16:44:46,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:44:47,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:44:49,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 16:44:49,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:44:50,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 16:44:53,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:44:54,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:44:56,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:44:57,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 16:44:57,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:44:57,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 16:44:58,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.99 vs. limit=15.0 2023-10-02 16:44:58,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 16:45:00,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 16:45:00,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:45:01,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:45:01,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:45:01,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 16:45:01,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:45:03,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 16:45:05,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:45:06,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:45:06,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 16:45:06,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 16:45:11,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:45:12,425 INFO [train.py:1046] (3/4) Epoch 27, batch 4200, loss[loss=0.1725, simple_loss=0.2388, pruned_loss=0.05315, over 23754.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2435, pruned_loss=0.04421, over 4735242.05 frames. ], batch size: 212, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:45:12,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 16:45:14,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:45:15,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:45:18,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:45:19,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:45:19,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:45:19,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=948766.6666666666, ans=0.125 2023-10-02 16:45:20,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 16:45:23,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 16:45:23,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:25,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:45:26,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:45:29,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 16:45:31,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=948833.3333333334, ans=0.125 2023-10-02 16:45:33,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:45:33,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:33,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 16:45:33,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:45:34,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:36,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:45:36,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:45:37,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:45:39,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 16:45:39,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:45:45,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 16:45:45,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:45:48,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:45:49,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:45:50,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:45:50,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 16:45:51,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:45:52,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:45:55,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 16:45:55,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=948966.6666666666, ans=0.0 2023-10-02 16:45:57,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:46:04,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:46:08,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 16:46:10,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:46:11,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=949033.3333333334, ans=0.0 2023-10-02 16:46:12,985 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.962e+02 2.274e+02 2.740e+02 4.088e+02, threshold=4.548e+02, percent-clipped=1.0 2023-10-02 16:46:14,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:46:14,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:46:15,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 16:46:22,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 16:46:25,307 INFO [train.py:1046] (3/4) Epoch 27, batch 4250, loss[loss=0.1666, simple_loss=0.2396, pruned_loss=0.04679, over 24477.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2428, pruned_loss=0.04393, over 4723181.33 frames. ], batch size: 58, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:46:26,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 16:46:26,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 16:46:28,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=949100.0, ans=0.125 2023-10-02 16:46:28,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=949100.0, ans=0.09899494936611666 2023-10-02 16:46:29,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:46:35,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 16:46:35,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 16:46:36,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:46:39,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:46:42,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:46:45,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:46:45,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:46:46,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:46:48,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:46:49,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:46:50,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:46:52,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:46:52,444 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:46:54,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:46:56,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:46:56,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 16:47:02,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 16:47:02,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:47:02,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:47:02,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:47:03,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:47:03,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:47:05,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:47:08,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 16:47:09,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 16:47:10,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=949300.0, ans=0.0 2023-10-02 16:47:14,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:47:15,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.94 vs. limit=22.5 2023-10-02 16:47:15,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:47:15,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=949300.0, ans=0.0 2023-10-02 16:47:17,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 16:47:17,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:47:17,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 16:47:18,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:47:20,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:47:20,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.34 vs. limit=15.0 2023-10-02 16:47:21,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:47:21,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:47:24,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 16:47:25,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 16:47:27,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:47:31,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:47:33,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:47:35,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:47:35,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:47:35,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=949366.6666666666, ans=0.2 2023-10-02 16:47:38,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:47:39,727 INFO [train.py:1046] (3/4) Epoch 27, batch 4300, loss[loss=0.1788, simple_loss=0.2464, pruned_loss=0.05559, over 23731.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2433, pruned_loss=0.0438, over 4730870.60 frames. ], batch size: 179, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:47:39,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:47:41,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:47:41,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 16:47:41,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:47:46,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:47:46,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:47:47,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=949433.3333333334, ans=0.0 2023-10-02 16:47:50,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:47:55,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.61 vs. limit=10.0 2023-10-02 16:47:58,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:47:58,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 16:47:59,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:48:01,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:48:01,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 16:48:02,683 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 16:48:06,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 16:48:08,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:48:11,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 16:48:11,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:48:11,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 16:48:13,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:48:16,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=949566.6666666666, ans=0.04949747468305833 2023-10-02 16:48:17,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:48:17,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=949566.6666666666, ans=0.2 2023-10-02 16:48:18,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:48:18,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:48:20,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:48:21,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:48:21,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=949566.6666666666, ans=0.125 2023-10-02 16:48:23,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:48:23,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 16:48:23,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=949633.3333333334, ans=0.2 2023-10-02 16:48:24,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 16:48:25,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=949633.3333333334, ans=0.0 2023-10-02 16:48:27,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:48:28,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:28,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 16:48:28,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:28,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:48:28,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 16:48:28,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 16:48:29,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 16:48:30,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:48:31,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 16:48:31,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 16:48:34,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:48:34,784 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 16:48:36,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:48:38,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:48:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:48:40,906 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.825e+02 1.972e+02 2.208e+02 2.993e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-02 16:48:42,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 16:48:43,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 16:48:44,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:44,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:48:44,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:48:45,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:48:47,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:48:49,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:48:51,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:48:51,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:48:54,093 INFO [train.py:1046] (3/4) Epoch 27, batch 4350, loss[loss=0.1614, simple_loss=0.2454, pruned_loss=0.03874, over 24655.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2443, pruned_loss=0.04425, over 4728568.87 frames. ], batch size: 65, lr: 3.77e-03, grad_scale: 16.0 2023-10-02 16:48:56,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 16:48:57,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 16:48:57,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=949766.6666666666, ans=0.0 2023-10-02 16:49:01,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:49:02,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:49:05,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 16:49:05,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:49:12,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:49:17,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:49:19,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:49:19,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:49:23,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:49:25,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:49:25,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:49:28,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=949900.0, ans=0.2 2023-10-02 16:49:29,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=949900.0, ans=0.1 2023-10-02 16:49:30,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 16:49:30,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:49:32,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:35,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:37,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 16:49:42,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:49:44,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:49:45,149 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:49:47,004 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 16:49:48,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:49:48,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:49:49,721 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 16:49:49,788 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 16:49:49,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:49:49,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:49:49,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:49:50,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.92 vs. limit=10.0 2023-10-02 16:49:51,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:49:52,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:49:52,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:49:55,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 16:49:55,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:55,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:49:55,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:49:56,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 16:49:58,088 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 16:49:58,093 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 16:49:58,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 16:50:00,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:50:02,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:50:02,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:02,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:50:04,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 16:50:06,881 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 16:50:06,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:08,106 INFO [train.py:1046] (3/4) Epoch 27, batch 4400, loss[loss=0.1728, simple_loss=0.2516, pruned_loss=0.04701, over 23408.00 frames. ], tot_loss[loss=0.1677, simple_loss=0.2457, pruned_loss=0.0448, over 4717997.33 frames. ], batch size: 106, lr: 3.77e-03, grad_scale: 32.0 2023-10-02 16:50:11,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:50:11,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:13,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:50:15,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 16:50:15,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 16:50:16,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 16:50:16,642 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 16:50:18,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 16:50:18,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:50:20,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 16:50:22,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:22,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:22,299 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 16:50:22,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=950166.6666666666, ans=0.125 2023-10-02 16:50:24,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:24,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 16:50:25,005 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 16:50:28,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 16:50:28,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 16:50:29,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 16:50:30,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:30,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:50:31,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:50:31,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:50:34,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 16:50:34,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 16:50:36,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:38,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 16:50:38,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:50:40,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:42,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:50:42,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 16:50:42,348 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 16:50:46,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=950233.3333333334, ans=0.2 2023-10-02 16:50:47,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:50:48,399 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.08 vs. limit=15.0 2023-10-02 16:50:51,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:50:54,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 16:50:54,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=950300.0, ans=0.1 2023-10-02 16:50:56,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=950300.0, ans=0.125 2023-10-02 16:50:58,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:50:58,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=950300.0, ans=0.125 2023-10-02 16:51:00,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=950300.0, ans=0.0 2023-10-02 16:51:01,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:51:02,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:51:03,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=950300.0, ans=0.125 2023-10-02 16:51:04,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 16:51:04,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:51:04,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:51:04,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 16:51:05,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:51:08,772 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.837e+02 2.009e+02 2.278e+02 3.254e+02, threshold=4.017e+02, percent-clipped=0.0 2023-10-02 16:51:10,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 16:51:11,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 16:51:13,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 16:51:14,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:51:14,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 16:51:14,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:51:19,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:51:20,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 16:51:22,141 INFO [train.py:1046] (3/4) Epoch 27, batch 4450, loss[loss=0.1909, simple_loss=0.272, pruned_loss=0.05486, over 24037.00 frames. ], tot_loss[loss=0.1689, simple_loss=0.2467, pruned_loss=0.04554, over 4714409.16 frames. ], batch size: 80, lr: 3.77e-03, grad_scale: 32.0 2023-10-02 16:51:23,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:51:26,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:51:26,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 16:51:27,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=950433.3333333334, ans=6.0 2023-10-02 16:51:32,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:51:32,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:51:34,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:51:36,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:51:39,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:51:41,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:51:41,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 16:51:41,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:51:42,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:51:42,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:51:42,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 16:51:47,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 16:51:52,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=950566.6666666666, ans=0.125 2023-10-02 16:51:53,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:51:53,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:51:54,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:51:54,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=950566.6666666666, ans=0.125 2023-10-02 16:51:55,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:51:57,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:52:01,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 16:52:03,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 16:52:03,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 16:52:03,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:52:05,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:52:06,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 16:52:10,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 16:52:10,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=950633.3333333334, ans=0.125 2023-10-02 16:52:13,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:52:13,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 16:52:15,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:52:15,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:52:15,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:52:15,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:52:17,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:52:21,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 16:52:21,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 16:52:23,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 16:52:24,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:52:25,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:52:28,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:52:28,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:52:31,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 16:52:34,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 16:52:35,451 INFO [train.py:1046] (3/4) Epoch 27, batch 4500, loss[loss=0.1564, simple_loss=0.239, pruned_loss=0.03692, over 23659.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2466, pruned_loss=0.04535, over 4715266.11 frames. ], batch size: 106, lr: 3.77e-03, grad_scale: 8.0 2023-10-02 16:52:35,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:52:38,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:52:39,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 16:52:39,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 16:52:41,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:52:45,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:52:46,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:52:46,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 16:52:48,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:52:48,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:52:49,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:53:00,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:53:01,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:53:01,629 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 16:53:04,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:53:05,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 16:53:05,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:53:11,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:53:14,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:53:19,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:53:22,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 16:53:23,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 16:53:23,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:53:25,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:53:25,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=950966.6666666666, ans=0.125 2023-10-02 16:53:28,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:53:28,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:53:30,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:53:30,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 16:53:30,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 16:53:30,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:53:33,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=951033.3333333334, ans=0.0 2023-10-02 16:53:34,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:53:34,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 16:53:37,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:53:38,922 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.864e+02 2.019e+02 2.246e+02 3.268e+02, threshold=4.037e+02, percent-clipped=0.0 2023-10-02 16:53:39,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 16:53:39,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:53:42,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 16:53:45,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 16:53:45,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 16:53:45,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=951033.3333333334, ans=0.0 2023-10-02 16:53:49,800 INFO [train.py:1046] (3/4) Epoch 27, batch 4550, loss[loss=0.1435, simple_loss=0.2262, pruned_loss=0.03045, over 24292.00 frames. ], tot_loss[loss=0.1679, simple_loss=0.2457, pruned_loss=0.04508, over 4720066.01 frames. ], batch size: 61, lr: 3.77e-03, grad_scale: 8.0 2023-10-02 16:53:49,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 16:53:51,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 16:53:53,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:53:54,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:53:56,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:53:57,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=951100.0, ans=0.2 2023-10-02 16:53:58,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:54:01,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:54:03,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:54:05,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:54:05,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:54:05,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:08,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:54:08,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:54:09,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=951166.6666666666, ans=0.0 2023-10-02 16:54:13,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:54:16,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 16:54:17,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 16:54:18,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 16:54:19,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 16:54:20,696 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.83 vs. limit=15.0 2023-10-02 16:54:24,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 16:54:24,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:54:26,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 16:54:28,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:54:31,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:31,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:32,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 16:54:33,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 16:54:36,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:54:39,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:39,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:54:40,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:54:42,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 16:54:42,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 16:54:42,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:54:42,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=951300.0, ans=0.125 2023-10-02 16:54:43,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 16:54:45,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 16:54:45,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:54:46,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:54:46,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:54:48,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:48,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:54:50,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 16:54:52,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 16:54:53,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:54:54,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 16:54:54,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 16:54:54,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 16:54:54,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 16:54:57,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 16:54:57,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:54:58,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:54:58,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:54:59,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 16:55:01,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:55:02,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 16:55:03,698 INFO [train.py:1046] (3/4) Epoch 27, batch 4600, loss[loss=0.1788, simple_loss=0.2633, pruned_loss=0.04718, over 24374.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2443, pruned_loss=0.04467, over 4715058.42 frames. ], batch size: 77, lr: 3.77e-03, grad_scale: 8.0 2023-10-02 16:55:05,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:06,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:55:09,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 16:55:09,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 16:55:10,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:55:11,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 16:55:13,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 16:55:17,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:55:18,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:55:20,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:28,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 16:55:29,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:29,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=951500.0, ans=0.0 2023-10-02 16:55:31,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:33,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:55:34,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:55:34,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=951566.6666666666, ans=0.0 2023-10-02 16:55:39,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 16:55:39,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 16:55:39,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:55:39,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=951566.6666666666, ans=0.1 2023-10-02 16:55:44,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:55:46,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:55:47,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 16:55:50,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 16:55:52,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 16:55:56,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:55:56,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:55:59,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:55:59,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 16:56:00,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:56:00,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 16:56:00,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:00,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:56:03,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:03,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:56:05,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:56:05,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 16:56:06,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 16:56:06,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 16:56:06,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:56:07,497 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.372e+02 1.867e+02 2.097e+02 2.357e+02 3.231e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-02 16:56:07,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:56:07,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:56:10,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:56:18,269 INFO [train.py:1046] (3/4) Epoch 27, batch 4650, loss[loss=0.1823, simple_loss=0.2557, pruned_loss=0.05451, over 23439.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2436, pruned_loss=0.04437, over 4713636.58 frames. ], batch size: 93, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 16:56:20,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 16:56:24,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:56:24,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:56:25,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:56:25,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:56:26,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:56:27,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:56:28,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=951766.6666666666, ans=0.125 2023-10-02 16:56:29,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 16:56:34,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:56:35,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 16:56:35,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:56:37,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 16:56:37,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 16:56:37,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 16:56:38,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 16:56:38,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:38,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 16:56:38,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=951833.3333333334, ans=0.0 2023-10-02 16:56:41,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 16:56:41,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:56:43,364 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 16:56:45,288 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.85 vs. limit=12.0 2023-10-02 16:56:45,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:56:47,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 16:56:49,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:56:49,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 16:56:51,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 16:56:52,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:56:58,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 16:57:01,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:05,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:57:07,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:57:07,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:57:08,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 16:57:11,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 16:57:11,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=951966.6666666666, ans=0.125 2023-10-02 16:57:12,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 16:57:14,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 16:57:14,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 16:57:16,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:57:23,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:57:23,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:57:23,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 16:57:23,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:23,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:57:25,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 16:57:27,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 16:57:28,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 16:57:28,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:57:28,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:57:29,229 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.83 vs. limit=15.0 2023-10-02 16:57:31,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:57:31,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:57:31,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 16:57:31,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=952100.0, ans=0.125 2023-10-02 16:57:32,743 INFO [train.py:1046] (3/4) Epoch 27, batch 4700, loss[loss=0.1495, simple_loss=0.2263, pruned_loss=0.03638, over 24315.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.244, pruned_loss=0.04439, over 4722158.40 frames. ], batch size: 56, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 16:57:32,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 16:57:34,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 16:57:36,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 16:57:43,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:44,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:57:44,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:57:46,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:57:46,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.21 vs. limit=15.0 2023-10-02 16:57:47,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 16:57:52,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 16:57:52,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 16:57:54,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:57:54,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=952166.6666666666, ans=0.125 2023-10-02 16:57:55,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:57:55,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:57:59,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:58:06,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:58:07,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 16:58:09,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:58:14,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 16:58:16,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 16:58:18,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:18,885 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.64 vs. limit=12.0 2023-10-02 16:58:21,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 16:58:24,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:58:28,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:58:28,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 16:58:31,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:31,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:58:33,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:58:34,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 16:58:34,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 16:58:36,115 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.791e+02 1.934e+02 2.224e+02 4.121e+02, threshold=3.867e+02, percent-clipped=0.0 2023-10-02 16:58:36,206 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 16:58:37,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:58:41,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:41,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:41,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 16:58:41,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 16:58:45,877 INFO [train.py:1046] (3/4) Epoch 27, batch 4750, loss[loss=0.1566, simple_loss=0.2446, pruned_loss=0.03434, over 24482.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.244, pruned_loss=0.04419, over 4721828.00 frames. ], batch size: 63, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 16:58:45,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 16:58:47,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 16:58:48,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:58:53,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:58:53,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 16:58:56,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 16:58:56,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:58:57,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=952433.3333333334, ans=0.1 2023-10-02 16:58:58,140 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.05 vs. limit=22.5 2023-10-02 16:59:01,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 16:59:02,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 16:59:02,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:59:03,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:59:06,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.94 vs. limit=15.0 2023-10-02 16:59:08,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 16:59:12,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 16:59:14,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 16:59:15,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:59:18,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:59:18,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 16:59:18,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:59:19,822 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 16:59:19,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 16:59:21,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=952566.6666666666, ans=0.125 2023-10-02 16:59:23,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=952566.6666666666, ans=0.1 2023-10-02 16:59:25,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 16:59:28,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:59:29,502 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.00 vs. limit=22.5 2023-10-02 16:59:30,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:59:30,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=952633.3333333334, ans=0.0 2023-10-02 16:59:33,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 16:59:33,116 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 16:59:33,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:59:35,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 16:59:37,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 16:59:40,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 16:59:40,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 16:59:40,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 16:59:40,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 16:59:40,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 16:59:40,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=952633.3333333334, ans=0.125 2023-10-02 16:59:42,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 16:59:42,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 16:59:44,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 16:59:48,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 16:59:50,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 16:59:50,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 16:59:51,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 16:59:53,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 16:59:54,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 16:59:56,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 16:59:56,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 16:59:59,417 INFO [train.py:1046] (3/4) Epoch 27, batch 4800, loss[loss=0.1759, simple_loss=0.2657, pruned_loss=0.04308, over 24453.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2455, pruned_loss=0.04474, over 4724786.70 frames. ], batch size: 69, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 16:59:59,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:00:00,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 17:00:00,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 17:00:01,480 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.68 vs. limit=15.0 2023-10-02 17:00:03,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 17:00:04,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:00:04,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:00:05,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=952766.6666666666, ans=0.125 2023-10-02 17:00:06,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 17:00:09,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=952766.6666666666, ans=0.125 2023-10-02 17:00:10,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:10,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:12,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=952833.3333333334, ans=0.0 2023-10-02 17:00:16,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:00:17,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:00:17,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:18,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 17:00:19,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:00:19,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:00:22,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:00:25,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:00:26,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:00:27,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=952900.0, ans=0.0 2023-10-02 17:00:28,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:00:28,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:00:30,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 17:00:30,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:32,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:00:33,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:00:36,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:36,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:00:36,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:00:37,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 17:00:39,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:39,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=952900.0, ans=0.125 2023-10-02 17:00:42,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 17:00:42,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 17:00:44,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:00:44,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:00:44,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:00:44,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:00:44,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:00:46,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:00:46,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:00:51,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:00:52,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:00:55,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:00:58,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 17:00:58,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:00:58,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:00,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:01:00,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:01:02,772 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.815e+02 2.074e+02 2.431e+02 3.640e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-02 17:01:04,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:01:04,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:01:05,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:05,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:01:06,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:01:06,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:01:10,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:01:12,202 INFO [train.py:1046] (3/4) Epoch 27, batch 4850, loss[loss=0.17, simple_loss=0.2531, pruned_loss=0.04343, over 24477.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2459, pruned_loss=0.04532, over 4722706.33 frames. ], batch size: 66, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 17:01:12,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:12,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:01:14,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 17:01:15,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 17:01:15,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:01:15,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:01:16,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:01:16,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:17,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=953100.0, ans=0.0 2023-10-02 17:01:19,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:01:21,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=953100.0, ans=0.125 2023-10-02 17:01:23,812 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.65 vs. limit=6.0 2023-10-02 17:01:24,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 17:01:27,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:01:32,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:01:32,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:01:33,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:01:35,413 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:01:36,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:01:37,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:01:38,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:01:39,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 17:01:43,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:01:45,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:01:46,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:01:46,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:01:46,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 17:01:49,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:01:49,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:01:52,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:01:52,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 17:01:52,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 17:01:53,675 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.61 vs. limit=22.5 2023-10-02 17:01:54,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:02:02,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:02:03,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 17:02:04,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:02:05,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:02:06,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:02:07,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 17:02:07,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:02:07,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 17:02:08,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=953300.0, ans=0.125 2023-10-02 17:02:09,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:02:09,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:02:09,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 17:02:13,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=953366.6666666666, ans=0.1 2023-10-02 17:02:18,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:02:24,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:02:24,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:02:26,971 INFO [train.py:1046] (3/4) Epoch 27, batch 4900, loss[loss=0.16, simple_loss=0.2469, pruned_loss=0.03659, over 24700.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2451, pruned_loss=0.04525, over 4705691.72 frames. ], batch size: 73, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 17:02:29,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 17:02:29,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:02:35,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:02:36,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:02:36,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:02:36,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=953433.3333333334, ans=0.125 2023-10-02 17:02:38,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 17:02:42,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=953500.0, ans=0.125 2023-10-02 17:02:43,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 17:02:46,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 17:02:48,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 17:02:48,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:02:49,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:02:49,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:02:49,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:02:49,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:02:49,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 17:02:54,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 17:02:56,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:02:56,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=953566.6666666666, ans=0.0 2023-10-02 17:02:57,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:02:57,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:02:58,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:03:00,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:00,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:00,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 17:03:03,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:03:04,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:03:04,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 17:03:04,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 17:03:08,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 17:03:10,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:03:10,468 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:03:12,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:03:12,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:03:12,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:13,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=953633.3333333334, ans=0.125 2023-10-02 17:03:14,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 17:03:14,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:03:14,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 17:03:14,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=953633.3333333334, ans=0.125 2023-10-02 17:03:17,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:18,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:03:20,507 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:03:21,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:03:24,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 17:03:24,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:03:26,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 17:03:26,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 17:03:30,817 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.829e+02 2.030e+02 2.368e+02 3.684e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-02 17:03:34,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:03:34,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:03:35,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 17:03:36,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 17:03:36,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:03:38,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:41,070 INFO [train.py:1046] (3/4) Epoch 27, batch 4950, loss[loss=0.1674, simple_loss=0.254, pruned_loss=0.04042, over 24489.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2424, pruned_loss=0.04461, over 4696171.46 frames. ], batch size: 69, lr: 3.76e-03, grad_scale: 16.0 2023-10-02 17:03:41,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:03:41,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:03:41,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:03:42,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 17:03:42,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:03:45,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:03:45,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 17:03:48,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 17:03:48,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 17:03:49,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:03:49,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 17:03:50,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:50,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:03:50,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:03:50,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:03:53,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:03:53,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:03:55,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:03:56,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:03:57,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:03:57,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:04:00,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:04:06,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:04:07,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:04:09,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:04:10,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:12,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:04:12,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 17:04:13,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 17:04:16,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:17,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:04:17,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:04:17,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:04:17,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:04:19,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:04:23,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:04:24,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:04:25,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:04:27,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.57 vs. limit=10.0 2023-10-02 17:04:28,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:04:28,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:28,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=953966.6666666666, ans=0.125 2023-10-02 17:04:29,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 17:04:29,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:04:31,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:04:34,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:04:37,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:04:37,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:04:38,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:38,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:04:38,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:04:41,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:04:41,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:04:41,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:04:43,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 17:04:47,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:04:47,880 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.63 vs. limit=15.0 2023-10-02 17:04:52,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 17:04:52,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 17:04:54,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=954100.0, ans=0.07 2023-10-02 17:04:55,560 INFO [train.py:1046] (3/4) Epoch 27, batch 5000, loss[loss=0.1524, simple_loss=0.2289, pruned_loss=0.03795, over 24432.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2424, pruned_loss=0.04443, over 4706744.08 frames. ], batch size: 58, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:04:58,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:04:58,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:05:00,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 17:05:01,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 17:05:04,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:05:05,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 17:05:05,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:05:05,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:05:07,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 17:05:08,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:05:09,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:05:09,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 17:05:09,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:05:11,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:05:12,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 17:05:13,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 17:05:14,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:05:15,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 17:05:15,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:05:16,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:16,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:05:16,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 17:05:16,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 17:05:18,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 17:05:18,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:05:19,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:21,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 17:05:21,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:05:22,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:24,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:05:25,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=954233.3333333334, ans=0.125 2023-10-02 17:05:26,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 17:05:27,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 17:05:27,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:05:29,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:05:29,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=954233.3333333334, ans=0.0 2023-10-02 17:05:29,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=954233.3333333334, ans=0.125 2023-10-02 17:05:31,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=954233.3333333334, ans=0.125 2023-10-02 17:05:33,796 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 17:05:37,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:05:38,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:05:38,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:05:41,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 17:05:41,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:05:42,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:05:42,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:05:44,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 17:05:44,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:05:48,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:05:49,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:05:56,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 17:05:57,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=954366.6666666666, ans=0.125 2023-10-02 17:05:59,918 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.877e+02 2.124e+02 2.613e+02 3.895e+02, threshold=4.248e+02, percent-clipped=0.0 2023-10-02 17:06:00,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:05,582 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.92 vs. limit=12.0 2023-10-02 17:06:09,334 INFO [train.py:1046] (3/4) Epoch 27, batch 5050, loss[loss=0.1891, simple_loss=0.2569, pruned_loss=0.06066, over 23802.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2433, pruned_loss=0.04433, over 4714663.98 frames. ], batch size: 195, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:06:09,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:06:10,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:10,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:06:10,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:06:10,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:06:12,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:06:12,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:15,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=954433.3333333334, ans=0.125 2023-10-02 17:06:16,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:06:16,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 17:06:16,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:06:19,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:06:20,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:06:21,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 17:06:23,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:06:23,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:06:25,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:06:26,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:06:26,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:06:35,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 17:06:37,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:06:37,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:06:38,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 17:06:38,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:06:40,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:06:40,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:06:41,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:06:41,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 17:06:41,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 17:06:43,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:06:45,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:06:49,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:06:49,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 17:06:51,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:06:52,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.89 vs. limit=10.0 2023-10-02 17:06:53,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=954633.3333333334, ans=0.2 2023-10-02 17:06:54,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 17:06:55,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:06:55,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:06:57,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:06:57,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:06:59,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:07:00,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:07:02,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:03,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:07:03,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:07:03,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 17:07:05,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:07:05,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:07:08,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:07:08,334 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 17:07:08,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:07:11,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:07:11,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:11,142 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 17:07:14,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:07:14,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 17:07:14,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:17,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:07:17,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=954700.0, ans=0.0 2023-10-02 17:07:18,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:18,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 17:07:20,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 17:07:23,383 INFO [train.py:1046] (3/4) Epoch 27, batch 5100, loss[loss=0.1572, simple_loss=0.2428, pruned_loss=0.03585, over 24636.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2447, pruned_loss=0.04474, over 4712582.43 frames. ], batch size: 68, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:07:23,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:07:23,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:07:23,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:07:26,268 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 17:07:29,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:07:32,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 17:07:32,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 17:07:33,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:07:34,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:07:37,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:07:37,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 17:07:38,560 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.21 vs. limit=15.0 2023-10-02 17:07:39,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 17:07:41,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=954833.3333333334, ans=0.0 2023-10-02 17:07:42,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:07:42,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:07:45,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:07:47,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 17:07:48,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:07:49,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:07:49,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 17:07:53,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:07:54,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:07:54,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 17:07:56,646 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 17:07:56,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=954900.0, ans=0.025 2023-10-02 17:07:57,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:07:59,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 17:07:59,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 17:08:03,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:08:11,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:08:13,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 17:08:13,511 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 17:08:15,286 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 17:08:16,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 17:08:16,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:08:18,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 17:08:22,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 17:08:23,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 17:08:25,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:08:28,127 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.816e+02 1.951e+02 2.163e+02 3.190e+02, threshold=3.902e+02, percent-clipped=0.0 2023-10-02 17:08:28,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 17:08:31,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:08:31,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 17:08:36,954 INFO [train.py:1046] (3/4) Epoch 27, batch 5150, loss[loss=0.1763, simple_loss=0.2504, pruned_loss=0.05116, over 23762.00 frames. ], tot_loss[loss=0.1681, simple_loss=0.2459, pruned_loss=0.04521, over 4716544.57 frames. ], batch size: 195, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:08:37,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:08:37,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:08:37,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:08:38,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:08:38,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:08:40,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:08:40,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 17:08:40,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 17:08:41,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 17:08:41,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:08:41,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 17:08:43,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:08:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 17:08:44,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:08:47,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:08:49,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=955100.0, ans=0.125 2023-10-02 17:08:50,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 17:08:52,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 17:08:53,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:08:53,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:08:56,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:08:56,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:08:56,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:08:56,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:08:56,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:08:58,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 17:08:58,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=955166.6666666666, ans=0.0 2023-10-02 17:08:58,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=955166.6666666666, ans=0.1 2023-10-02 17:08:59,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:09:01,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:09:01,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:09:02,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 17:09:04,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:09:08,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=955233.3333333334, ans=0.07 2023-10-02 17:09:09,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:09:11,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 17:09:14,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:09:19,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:09:21,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:09:24,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:09:26,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:09:26,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=955300.0, ans=0.0 2023-10-02 17:09:26,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=955300.0, ans=0.09899494936611666 2023-10-02 17:09:28,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 17:09:30,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=955300.0, ans=0.125 2023-10-02 17:09:32,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:09:32,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:09:32,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:09:36,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:09:38,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:09:38,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 17:09:41,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:09:44,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:09:45,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:09:45,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:09:47,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:09:47,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:09:47,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:09:49,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:09:51,795 INFO [train.py:1046] (3/4) Epoch 27, batch 5200, loss[loss=0.1636, simple_loss=0.2538, pruned_loss=0.03668, over 24664.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2458, pruned_loss=0.04535, over 4720123.14 frames. ], batch size: 68, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:09:51,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:09:53,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:09:55,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:10:01,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=955433.3333333334, ans=0.2 2023-10-02 17:10:02,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 17:10:02,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:10:02,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:10:05,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:10:06,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:10:06,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:10:08,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 17:10:10,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:10:10,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:10:14,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 17:10:14,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=955500.0, ans=0.125 2023-10-02 17:10:16,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:10:16,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=955500.0, ans=0.0 2023-10-02 17:10:18,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:10:18,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 17:10:18,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 17:10:20,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=955566.6666666666, ans=0.125 2023-10-02 17:10:21,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 17:10:22,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:10:22,943 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 17:10:22,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:10:25,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:10:25,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:10:26,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 17:10:27,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:10:29,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:10:29,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=955566.6666666666, ans=0.2 2023-10-02 17:10:33,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 17:10:34,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 17:10:34,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 17:10:39,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=955633.3333333334, ans=0.125 2023-10-02 17:10:40,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 17:10:40,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:10:43,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:10:43,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:10:44,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 17:10:44,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:10:46,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:10:46,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:10:47,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:10:50,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:10:53,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:10:56,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:10:57,478 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.823e+02 2.038e+02 2.325e+02 3.987e+02, threshold=4.077e+02, percent-clipped=1.0 2023-10-02 17:10:57,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:10:57,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:10:57,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=955700.0, ans=0.125 2023-10-02 17:11:01,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:11:01,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 17:11:03,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:11:05,099 INFO [train.py:1046] (3/4) Epoch 27, batch 5250, loss[loss=0.1622, simple_loss=0.2453, pruned_loss=0.03956, over 23401.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2453, pruned_loss=0.04531, over 4721168.16 frames. ], batch size: 93, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:11:05,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:11:05,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:11:06,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:11:06,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:11:09,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:11:12,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:11:13,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:11:13,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:11:18,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:11:20,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:11:21,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:11:24,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:11:25,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 17:11:26,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:11:26,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:11:28,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=955833.3333333334, ans=0.0 2023-10-02 17:12:08,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=956033.3333333334, ans=0.2 2023-10-02 17:12:13,978 INFO [train.py:1046] (3/4) Epoch 27, batch 5300, loss[loss=0.1793, simple_loss=0.2491, pruned_loss=0.05473, over 23328.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2429, pruned_loss=0.04495, over 4696926.88 frames. ], batch size: 93, lr: 3.76e-03, grad_scale: 8.0 2023-10-02 17:12:14,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=956100.0, ans=0.07 2023-10-02 17:12:25,423 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.70 vs. limit=15.0 2023-10-02 17:12:28,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:12:28,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 17:12:28,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 17:12:28,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:12:28,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:28,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:28,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:29,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:12:29,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:12:29,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:12:29,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:12:29,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:12:29,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 17:12:29,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 17:12:29,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 17:12:29,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:12:29,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 17:12:29,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 17:12:29,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:30,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:12:30,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:12:30,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:12:30,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:12:30,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:12:30,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:12:30,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:31,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:12:31,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:12:31,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:12:31,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:31,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:12:31,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 17:12:31,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:12:32,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:12:32,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 17:12:32,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 17:12:32,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:12:32,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:12:32,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 17:12:32,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 17:12:32,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:12:32,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:12:33,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:12:33,111 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 17:12:33,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 17:12:33,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:12:33,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:12:33,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 17:12:33,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 17:12:33,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 17:12:34,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:12:40,113 INFO [train.py:1046] (3/4) Epoch 28, batch 0, loss[loss=0.174, simple_loss=0.2449, pruned_loss=0.05153, over 23828.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2449, pruned_loss=0.05153, over 23828.00 frames. ], batch size: 164, lr: 3.69e-03, grad_scale: 16.0 2023-10-02 17:12:40,114 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 17:12:52,154 INFO [train.py:1078] (3/4) Epoch 28, validation: loss=0.3134, simple_loss=0.267, pruned_loss=0.1799, over 1125622.00 frames. 2023-10-02 17:12:52,155 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 17:12:54,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 17:12:56,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:12:57,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:13:03,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:03,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:13:03,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:05,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 17:13:07,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 17:13:08,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:09,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:12,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:13:12,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:13,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:13:13,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:13:15,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 17:13:16,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:13:24,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:13:24,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:26,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 17:13:27,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=956313.3333333334, ans=0.09899494936611666 2023-10-02 17:13:30,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:13:30,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:13:33,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:13:36,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:13:39,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:13:41,017 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.877e+02 2.089e+02 2.395e+02 3.641e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-02 17:13:45,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 17:13:46,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 17:13:48,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:13:48,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:13:49,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:13:49,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:13:51,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 17:13:54,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:13:56,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:14:00,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:14:03,395 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 17:14:04,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:14:06,218 INFO [train.py:1046] (3/4) Epoch 28, batch 50, loss[loss=0.1882, simple_loss=0.2601, pruned_loss=0.0581, over 22847.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2419, pruned_loss=0.04489, over 1062676.22 frames. ], batch size: 322, lr: 3.69e-03, grad_scale: 16.0 2023-10-02 17:14:06,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:14:09,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:14:09,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 17:14:09,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:14:09,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:14:11,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=956513.3333333334, ans=0.0 2023-10-02 17:14:12,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:14:13,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:14:16,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:14:19,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 17:14:19,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:14:25,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:14:27,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 17:14:30,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 17:14:31,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:14:33,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:14:33,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:14:34,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:14:35,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:14:37,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:14:37,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:14:38,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=956646.6666666666, ans=0.125 2023-10-02 17:14:46,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:14:46,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:14:46,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:14:46,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 17:14:49,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:14:49,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:14:49,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 17:14:50,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:14:52,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 17:14:58,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=956713.3333333334, ans=0.07 2023-10-02 17:14:59,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:15:00,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:15:01,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:15:03,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:15:03,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:15:04,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 17:15:04,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 17:15:04,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=956780.0, ans=0.125 2023-10-02 17:15:06,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:15:07,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:15:08,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:15:08,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:15:08,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 17:15:09,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 17:15:10,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 17:15:12,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:12,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:15:13,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 17:15:13,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 17:15:15,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:15,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:15:17,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:15:17,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:15:18,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=956780.0, ans=0.2 2023-10-02 17:15:20,556 INFO [train.py:1046] (3/4) Epoch 28, batch 100, loss[loss=0.1716, simple_loss=0.2452, pruned_loss=0.04899, over 23678.00 frames. ], tot_loss[loss=0.1693, simple_loss=0.2464, pruned_loss=0.04606, over 1866875.15 frames. ], batch size: 256, lr: 3.69e-03, grad_scale: 16.0 2023-10-02 17:15:20,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:15:24,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:15:26,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:15:30,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 17:15:30,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:15:32,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=956846.6666666666, ans=0.125 2023-10-02 17:15:33,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:15:34,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:15:34,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:15:34,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:15:34,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:15:34,937 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:15:36,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 17:15:36,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:15:37,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:37,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:15:37,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:15:41,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 17:15:41,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=956913.3333333334, ans=0.0 2023-10-02 17:15:42,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:44,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:15:45,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:15:47,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:15:50,264 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 17:15:50,287 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 17:15:51,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:15:51,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:15:55,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:15:57,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:15:59,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:04,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:05,620 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 17:16:08,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 17:16:10,932 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.862e+02 2.053e+02 2.349e+02 3.571e+02, threshold=4.105e+02, percent-clipped=0.0 2023-10-02 17:16:12,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:16:12,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:16:13,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:17,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=957046.6666666666, ans=0.125 2023-10-02 17:16:18,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:21,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:16:22,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:16:25,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:25,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:16:27,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:27,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:16:28,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:29,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 17:16:29,987 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 17:16:30,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:31,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:16:33,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:33,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:16:33,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 17:16:33,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 17:16:33,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:16:33,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:34,549 INFO [train.py:1046] (3/4) Epoch 28, batch 150, loss[loss=0.1767, simple_loss=0.2662, pruned_loss=0.04359, over 24450.00 frames. ], tot_loss[loss=0.1695, simple_loss=0.2475, pruned_loss=0.04578, over 2510029.61 frames. ], batch size: 69, lr: 3.69e-03, grad_scale: 8.0 2023-10-02 17:16:34,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:16:36,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:16:36,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:16:36,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:16:40,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:16:42,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:16:42,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:16:43,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:44,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:16:45,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:47,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:16:47,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:53,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 17:16:53,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 17:16:53,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 17:16:56,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:16:56,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:16:56,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:16:57,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:16:57,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:16:59,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:16:59,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:17:01,183 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 17:17:02,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:17:09,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:17:11,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:17:11,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 17:17:15,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:17:15,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:17:15,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:17:17,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:17:18,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:17:18,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:17:20,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:17:20,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 17:17:23,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=957380.0, ans=0.1 2023-10-02 17:17:25,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:17:27,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:17:27,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:17:27,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:17:30,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:17:31,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 17:17:33,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:17:36,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:17:39,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:17:42,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:17:42,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 17:17:42,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:17:42,158 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 17:17:43,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=957446.6666666666, ans=0.2 2023-10-02 17:17:46,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:17:47,778 INFO [train.py:1046] (3/4) Epoch 28, batch 200, loss[loss=0.1603, simple_loss=0.2501, pruned_loss=0.0352, over 24424.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.249, pruned_loss=0.04638, over 2995932.94 frames. ], batch size: 69, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:17:47,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:17:47,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:17:52,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 17:17:52,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:17:53,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:17:56,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 17:17:57,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:17:59,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:18:00,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:18:02,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:18:02,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:18:02,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:18:14,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=957580.0, ans=0.0 2023-10-02 17:18:24,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:18:24,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:18:26,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:18:27,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:18:29,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 17:18:29,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:18:31,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:18:33,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:18:33,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:18:33,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:18:34,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 17:18:36,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 17:18:36,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:18:37,242 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.956e+02 2.202e+02 2.604e+02 4.152e+02, threshold=4.404e+02, percent-clipped=1.0 2023-10-02 17:18:40,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:18:44,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:18:51,684 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.02 vs. limit=22.5 2023-10-02 17:18:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:18:52,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:18:57,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:00,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 17:19:00,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:19:00,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:19:00,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:19:01,477 INFO [train.py:1046] (3/4) Epoch 28, batch 250, loss[loss=0.2095, simple_loss=0.2645, pruned_loss=0.07726, over 19085.00 frames. ], tot_loss[loss=0.169, simple_loss=0.247, pruned_loss=0.04553, over 3380279.65 frames. ], batch size: 388, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:19:01,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:19:02,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 17:19:04,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:19:04,321 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 17:19:06,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:07,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:19:07,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:09,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:19:11,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:19:11,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:19:12,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:19:13,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=957846.6666666666, ans=0.125 2023-10-02 17:19:13,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=957846.6666666666, ans=0.2 2023-10-02 17:19:17,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:19:20,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=957913.3333333334, ans=0.2 2023-10-02 17:19:26,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:19:28,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:19:28,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:19:32,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=957980.0, ans=0.125 2023-10-02 17:19:38,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:19:38,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:19:39,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:19:39,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:19:41,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:19:41,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:19:41,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:19:44,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:19:46,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 17:19:46,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:19:47,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:19:48,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:19:48,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:19:48,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:19:50,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:19:50,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:19:51,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:19:52,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:19:52,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:19:57,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:20:01,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:20:04,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:20:10,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:20:12,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:20:14,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 17:20:14,956 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.10 vs. limit=15.0 2023-10-02 17:20:15,520 INFO [train.py:1046] (3/4) Epoch 28, batch 300, loss[loss=0.1714, simple_loss=0.2586, pruned_loss=0.04213, over 24355.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2451, pruned_loss=0.04489, over 3682948.28 frames. ], batch size: 77, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:20:15,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:20:17,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:20:18,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 17:20:18,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:20:20,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:20:20,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 17:20:21,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=958180.0, ans=0.125 2023-10-02 17:20:25,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:20:25,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:20:28,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:20:28,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 17:20:30,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:20:32,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:20:32,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 17:20:32,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:20:35,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:20:41,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:20:43,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 17:20:46,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 17:20:46,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:20:47,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:20:49,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:20:49,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 17:20:49,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:20:50,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:20:53,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:20:53,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:20:56,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 17:20:56,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 17:20:58,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:21:01,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:02,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 17:21:03,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:21:05,024 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.790e+02 1.923e+02 2.144e+02 2.937e+02, threshold=3.846e+02, percent-clipped=0.0 2023-10-02 17:21:09,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:21:11,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:21:11,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 17:21:12,195 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.95 vs. limit=15.0 2023-10-02 17:21:15,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:15,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:21:18,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:18,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:21:20,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 17:21:21,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:21:21,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:21:22,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 17:21:22,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=958446.6666666666, ans=0.0 2023-10-02 17:21:24,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:21:24,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:25,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:21:25,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:21:27,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:29,227 INFO [train.py:1046] (3/4) Epoch 28, batch 350, loss[loss=0.1718, simple_loss=0.2532, pruned_loss=0.04516, over 24660.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2434, pruned_loss=0.04464, over 3910029.54 frames. ], batch size: 65, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:21:30,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=958513.3333333334, ans=0.0 2023-10-02 17:21:31,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:21:31,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 17:21:34,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:40,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:21:42,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:21:43,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:45,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 17:21:46,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:21:46,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 17:21:49,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:21:49,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 17:21:50,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:21:51,723 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.59 vs. limit=22.5 2023-10-02 17:21:54,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 17:21:56,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:21:57,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:21:58,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:22:00,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:00,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:00,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:22:01,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:22:01,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:22:03,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:22:03,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:22:09,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:22:10,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:22:12,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:22:12,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:22:15,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 17:22:15,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:22:15,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=958713.3333333334, ans=0.0 2023-10-02 17:22:20,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:22:20,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:22:22,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:22:22,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 17:22:25,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:26,848 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 17:22:26,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 17:22:26,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:31,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:22:31,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 17:22:33,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:34,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:22:37,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:37,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:37,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:22:39,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=958780.0, ans=0.1 2023-10-02 17:22:40,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:22:43,232 INFO [train.py:1046] (3/4) Epoch 28, batch 400, loss[loss=0.1792, simple_loss=0.2498, pruned_loss=0.05432, over 23777.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2432, pruned_loss=0.04486, over 4070976.15 frames. ], batch size: 179, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:22:43,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:22:45,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:22:46,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 17:22:46,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:22:47,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:22:49,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:22:49,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:22:49,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=958846.6666666666, ans=0.125 2023-10-02 17:22:52,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:22:53,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:22:56,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 17:22:58,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 17:22:58,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:22:59,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=958913.3333333334, ans=0.1 2023-10-02 17:23:01,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 17:23:01,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:23:03,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:23:03,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:23:04,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 17:23:05,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:23:05,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:23:05,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:23:05,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:23:08,583 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 17:23:08,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 17:23:11,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=958980.0, ans=0.125 2023-10-02 17:23:13,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:23:13,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:23:14,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 17:23:16,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 17:23:18,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:23:21,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:23:26,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=959046.6666666666, ans=0.125 2023-10-02 17:23:27,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 17:23:30,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:23:32,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 17:23:34,770 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.847e+02 2.073e+02 2.548e+02 3.934e+02, threshold=4.147e+02, percent-clipped=1.0 2023-10-02 17:23:34,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:23:34,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:23:34,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 17:23:39,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:23:43,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:23:43,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:23:43,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=959113.3333333334, ans=0.1 2023-10-02 17:23:44,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=959113.3333333334, ans=0.125 2023-10-02 17:23:46,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:23:47,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 17:23:48,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:23:48,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 17:23:50,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:23:50,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:23:50,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=959113.3333333334, ans=0.2 2023-10-02 17:23:51,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 17:23:53,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:23:54,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:23:54,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:23:57,429 INFO [train.py:1046] (3/4) Epoch 28, batch 450, loss[loss=0.1594, simple_loss=0.2375, pruned_loss=0.04065, over 23367.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2436, pruned_loss=0.04489, over 4219459.74 frames. ], batch size: 105, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:23:57,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 17:23:57,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:23:57,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:23:59,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:23:59,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 17:24:00,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:24:00,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:24:03,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:24:11,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:24:12,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:24:15,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=959246.6666666666, ans=6.0 2023-10-02 17:24:15,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 17:24:15,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 17:24:19,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:24:22,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:24:23,149 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.61 vs. limit=12.0 2023-10-02 17:24:23,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:24:28,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:24:29,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:24:31,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 17:24:31,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 17:24:34,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 17:24:34,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:24:34,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:24:36,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:24:37,880 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 17:24:37,888 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 17:24:37,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:24:38,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=959313.3333333334, ans=0.0 2023-10-02 17:24:39,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:24:40,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 17:24:40,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=959380.0, ans=0.125 2023-10-02 17:24:45,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:24:45,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:24:46,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 17:24:46,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 17:24:49,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:24:52,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:24:52,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:24:53,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 17:24:56,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:24:56,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 17:24:57,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 17:24:59,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:25:01,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=959446.6666666666, ans=0.125 2023-10-02 17:25:02,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:25:04,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:25:06,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:25:06,430 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 17:25:10,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:25:10,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=959513.3333333334, ans=0.125 2023-10-02 17:25:11,644 INFO [train.py:1046] (3/4) Epoch 28, batch 500, loss[loss=0.1634, simple_loss=0.2384, pruned_loss=0.04417, over 23732.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2444, pruned_loss=0.04514, over 4325566.19 frames. ], batch size: 232, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:25:11,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:25:11,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:25:11,742 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 17:25:13,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 17:25:13,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:25:16,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:25:19,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 17:25:20,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:25:22,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:25:22,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:25:23,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:28,467 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.11 vs. limit=15.0 2023-10-02 17:25:34,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:25:34,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:25:35,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:25:35,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:25:37,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 17:25:37,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:25:38,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:25:40,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:25:40,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:25:41,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:25:41,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 17:25:41,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=959646.6666666666, ans=0.125 2023-10-02 17:25:44,894 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.58 vs. limit=22.5 2023-10-02 17:25:47,702 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 17:25:49,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:25:50,086 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.22 vs. limit=15.0 2023-10-02 17:25:50,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:51,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:52,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:25:52,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:25:54,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 17:25:56,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:25:57,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:25:57,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=959713.3333333334, ans=0.1 2023-10-02 17:26:00,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=959713.3333333334, ans=0.0 2023-10-02 17:26:03,512 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.887e+02 2.063e+02 2.289e+02 3.276e+02, threshold=4.127e+02, percent-clipped=0.0 2023-10-02 17:26:03,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:26:06,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:26:06,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=959713.3333333334, ans=0.2 2023-10-02 17:26:10,935 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.36 vs. limit=15.0 2023-10-02 17:26:11,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:26:15,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 17:26:15,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:15,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:26:18,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 17:26:20,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:26:22,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:25,371 INFO [train.py:1046] (3/4) Epoch 28, batch 550, loss[loss=0.1643, simple_loss=0.2419, pruned_loss=0.04335, over 23265.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2444, pruned_loss=0.04476, over 4422929.56 frames. ], batch size: 119, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:26:25,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 17:26:26,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 17:26:28,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:26:28,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 17:26:29,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:26:29,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:26:29,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:31,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:31,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:26:32,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:26:34,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:26:35,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 17:26:35,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:26:40,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:26:40,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:41,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=959913.3333333334, ans=0.125 2023-10-02 17:26:43,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:26:45,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:26:47,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 17:26:49,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 17:26:51,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:26:51,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=959913.3333333334, ans=0.2 2023-10-02 17:26:55,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:26:55,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:26:56,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:26:57,662 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.88 vs. limit=12.0 2023-10-02 17:27:02,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:03,351 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 17:27:04,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:27:04,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 17:27:08,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:27:09,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:27:09,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:27:10,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:12,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 17:27:12,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 17:27:13,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:27:13,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:27:13,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:27:13,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:27:16,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:27:18,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:27:21,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:27:21,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:23,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 17:27:24,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:27:25,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=960113.3333333334, ans=0.0 2023-10-02 17:27:27,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:27:27,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:27:28,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:27:29,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:27:29,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 17:27:36,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 17:27:38,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 17:27:40,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:27:41,851 INFO [train.py:1046] (3/4) Epoch 28, batch 600, loss[loss=0.2029, simple_loss=0.2606, pruned_loss=0.07257, over 19657.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2446, pruned_loss=0.04481, over 4482570.65 frames. ], batch size: 389, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:27:41,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:27:41,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:27:44,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.80 vs. limit=15.0 2023-10-02 17:27:49,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:27:50,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:27:52,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 17:27:54,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:27:55,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:27:58,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:28:00,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 17:28:01,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:28:05,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 17:28:05,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=960246.6666666666, ans=0.125 2023-10-02 17:28:09,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:28:09,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:28:09,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:28:16,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:28:16,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:28:16,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:28:18,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=960313.3333333334, ans=0.2 2023-10-02 17:28:18,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=960313.3333333334, ans=0.2 2023-10-02 17:28:23,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:28:28,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=960380.0, ans=0.0 2023-10-02 17:28:29,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:28:29,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:28:29,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:28:33,184 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.831e+02 2.042e+02 2.282e+02 3.339e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-02 17:28:36,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 17:28:40,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:28:40,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:28:45,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 17:28:47,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:28:48,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 17:28:49,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:28:49,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:28:55,752 INFO [train.py:1046] (3/4) Epoch 28, batch 650, loss[loss=0.1779, simple_loss=0.2674, pruned_loss=0.04418, over 24409.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2439, pruned_loss=0.04497, over 4534070.24 frames. ], batch size: 69, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:28:55,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 17:28:56,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=960513.3333333334, ans=0.0 2023-10-02 17:28:57,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:28:57,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:28:58,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:29:01,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:02,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=960513.3333333334, ans=0.2 2023-10-02 17:29:03,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 17:29:05,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:29:10,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:29:10,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:29:14,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:18,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 17:29:19,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:29:19,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:29:23,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:29:23,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 17:29:26,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:26,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:27,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:29:27,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:28,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:29:30,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=960646.6666666666, ans=0.125 2023-10-02 17:29:32,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:29:32,966 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 17:29:32,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:32,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:29:35,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:35,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:29:37,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:29:37,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:29:38,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 17:29:38,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:29:40,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:29:41,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:29:41,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:29:43,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:29:44,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 17:29:46,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 17:29:46,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:48,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:29:48,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:29:48,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:29:51,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:29:55,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:29:55,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:29:57,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:29:59,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:29:59,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:30:01,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:30:08,010 INFO [train.py:1046] (3/4) Epoch 28, batch 700, loss[loss=0.1769, simple_loss=0.2455, pruned_loss=0.05414, over 23463.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2441, pruned_loss=0.0447, over 4578083.40 frames. ], batch size: 285, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:30:08,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:30:08,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:30:08,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:30:09,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:30:15,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 17:30:15,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 17:30:18,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 17:30:18,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:30:21,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:30:24,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 17:30:27,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:30:29,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:30:30,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:30:31,343 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.23 vs. limit=15.0 2023-10-02 17:30:32,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:30:32,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:30:34,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:30:36,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 17:30:36,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=960980.0, ans=0.0 2023-10-02 17:30:37,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:30:37,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 17:30:40,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 17:30:42,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=960980.0, ans=0.125 2023-10-02 17:30:45,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:30:45,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:30:47,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:30:47,224 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:30:52,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:30:52,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 17:30:56,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:30:56,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:30:56,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 17:30:59,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=961046.6666666666, ans=0.125 2023-10-02 17:31:00,699 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.837e+02 1.990e+02 2.293e+02 3.578e+02, threshold=3.980e+02, percent-clipped=0.0 2023-10-02 17:31:00,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:31:02,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:31:04,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:31:09,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:31:09,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 17:31:13,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 17:31:14,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 17:31:18,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:31:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:31:20,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:31:22,859 INFO [train.py:1046] (3/4) Epoch 28, batch 750, loss[loss=0.1695, simple_loss=0.2435, pruned_loss=0.04776, over 23285.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2429, pruned_loss=0.04445, over 4598657.03 frames. ], batch size: 119, lr: 3.68e-03, grad_scale: 8.0 2023-10-02 17:31:22,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:31:22,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 17:31:26,990 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.42 vs. limit=15.0 2023-10-02 17:31:27,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 17:31:27,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 17:31:27,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 17:31:29,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 17:31:29,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 17:31:30,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:31:31,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 17:31:31,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:31:31,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:31:33,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:31:36,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:31:36,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:31:36,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:31:38,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:31:40,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:31:40,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:31:41,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:31:41,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=961246.6666666666, ans=0.2 2023-10-02 17:31:43,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:31:43,508 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.96 vs. limit=22.5 2023-10-02 17:31:45,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 17:31:45,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=961246.6666666666, ans=0.1 2023-10-02 17:31:46,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:31:48,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:31:48,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:31:49,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:31:49,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 17:31:49,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:31:52,655 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.97 vs. limit=15.0 2023-10-02 17:31:53,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 17:31:53,089 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 17:31:54,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 17:31:54,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:31:54,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 17:31:56,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:32:03,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:32:04,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:04,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:32:06,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:32:06,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=961380.0, ans=0.0 2023-10-02 17:32:07,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:08,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=961380.0, ans=0.1 2023-10-02 17:32:09,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 17:32:09,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:32:11,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 17:32:11,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:32:15,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:32:15,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 17:32:16,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:16,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=961380.0, ans=0.0 2023-10-02 17:32:20,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=961446.6666666666, ans=0.125 2023-10-02 17:32:22,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:32:23,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:32:25,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:32:28,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:32:29,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 17:32:29,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:32:29,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:32:33,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:32:34,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:32:34,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=961446.6666666666, ans=0.0 2023-10-02 17:32:37,111 INFO [train.py:1046] (3/4) Epoch 28, batch 800, loss[loss=0.1727, simple_loss=0.2519, pruned_loss=0.04673, over 23297.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2436, pruned_loss=0.04479, over 4615411.26 frames. ], batch size: 93, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:32:37,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:37,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:32:44,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:32:44,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:45,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:32:46,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:32:47,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:47,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:32:48,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:32:52,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=961580.0, ans=0.5 2023-10-02 17:32:53,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:32:54,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:32:57,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 17:32:57,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:32:59,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:32:59,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:32:59,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:32:59,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 17:32:59,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:33:00,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 17:33:03,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:33:06,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:33:09,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:33:09,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:33:12,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:33:12,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:33:15,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:33:15,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:33:17,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 17:33:18,560 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 17:33:19,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 17:33:19,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:33:19,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:33:21,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:33:21,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:33:27,255 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 17:33:27,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 17:33:28,511 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.943e+02 2.202e+02 2.642e+02 5.405e+02, threshold=4.403e+02, percent-clipped=5.0 2023-10-02 17:33:29,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:33:30,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:33:32,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=961713.3333333334, ans=0.2 2023-10-02 17:33:33,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:33:37,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:33:39,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 17:33:39,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:33:40,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=961780.0, ans=0.1 2023-10-02 17:33:41,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 17:33:41,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=961780.0, ans=0.125 2023-10-02 17:33:45,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:33:49,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:33:49,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 17:33:50,499 INFO [train.py:1046] (3/4) Epoch 28, batch 850, loss[loss=0.215, simple_loss=0.27, pruned_loss=0.07998, over 19367.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2443, pruned_loss=0.04487, over 4640336.88 frames. ], batch size: 388, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:33:50,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:33:50,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:33:52,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 17:33:52,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:33:54,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:33:55,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:33:56,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:33:58,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:33:59,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 17:33:59,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 17:33:59,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 17:34:01,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:34:01,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:34:04,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:04,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:34:05,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:34:09,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:34:09,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:34:09,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 17:34:13,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 17:34:16,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:34:17,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 17:34:22,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=961980.0, ans=0.125 2023-10-02 17:34:23,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 17:34:23,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 17:34:26,403 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 17:34:26,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:34:26,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:34:26,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 17:34:29,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:30,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:30,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 17:34:33,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:34:35,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:34:35,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:34:36,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:34:37,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:34:39,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 17:34:40,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 17:34:43,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:34:43,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:34:44,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:34:44,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:34:46,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:34:46,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=962046.6666666666, ans=0.125 2023-10-02 17:34:49,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=962113.3333333334, ans=0.125 2023-10-02 17:34:50,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:34:52,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:34:52,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:34:54,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:34:54,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:34:59,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:35:01,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:35:01,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 17:35:01,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:35:01,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:35:03,777 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.30 vs. limit=15.0 2023-10-02 17:35:04,356 INFO [train.py:1046] (3/4) Epoch 28, batch 900, loss[loss=0.1813, simple_loss=0.2545, pruned_loss=0.05404, over 23789.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.245, pruned_loss=0.04486, over 4659545.08 frames. ], batch size: 212, lr: 3.68e-03, grad_scale: 16.0 2023-10-02 17:35:04,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 17:35:11,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:35:12,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:35:13,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 17:35:15,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:35:15,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 17:35:16,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 17:35:19,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:35:19,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:35:20,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:35:20,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:35:21,125 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.59 vs. limit=15.0 2023-10-02 17:35:29,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:35:29,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:35:30,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:35:33,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:35:38,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 17:35:40,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:35:42,975 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.81 vs. limit=15.0 2023-10-02 17:35:46,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:35:46,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:35:46,356 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 17:35:47,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 17:35:52,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:35:52,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:35:53,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:35:57,654 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.851e+02 2.081e+02 2.371e+02 3.484e+02, threshold=4.162e+02, percent-clipped=0.0 2023-10-02 17:35:59,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:35:59,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:36:01,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 17:36:01,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:36:01,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 17:36:03,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:36:04,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:36:06,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:36:06,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:36:10,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 17:36:12,110 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 17:36:12,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=962446.6666666666, ans=0.125 2023-10-02 17:36:13,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 17:36:13,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 17:36:14,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:36:18,174 INFO [train.py:1046] (3/4) Epoch 28, batch 950, loss[loss=0.1563, simple_loss=0.2403, pruned_loss=0.03612, over 24452.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2457, pruned_loss=0.04535, over 4662602.28 frames. ], batch size: 66, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:36:18,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 17:36:19,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=962513.3333333334, ans=0.1 2023-10-02 17:36:24,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:36:26,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:36:27,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:36:27,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:36:30,339 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 17:36:33,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:36:34,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:36:35,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:36:35,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:36:35,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 17:36:37,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:36:39,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:36:40,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 17:36:42,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:36:44,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:36:44,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:36:45,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:36:46,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 17:36:48,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=962646.6666666666, ans=0.125 2023-10-02 17:36:49,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 17:36:49,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:36:51,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:36:53,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=962646.6666666666, ans=0.0 2023-10-02 17:36:57,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:36:57,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:37:00,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=962646.6666666666, ans=0.1 2023-10-02 17:37:01,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 17:37:03,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 17:37:03,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:37:04,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:37:04,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=962713.3333333334, ans=0.125 2023-10-02 17:37:05,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:37:05,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:37:06,329 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.71 vs. limit=15.0 2023-10-02 17:37:07,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=962713.3333333334, ans=0.125 2023-10-02 17:37:07,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=962713.3333333334, ans=0.04949747468305833 2023-10-02 17:37:10,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 17:37:12,823 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.14 vs. limit=15.0 2023-10-02 17:37:13,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:37:14,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:37:14,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:37:14,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 17:37:14,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:37:14,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:37:15,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 17:37:18,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:37:19,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:37:25,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:37:27,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 17:37:28,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 17:37:32,593 INFO [train.py:1046] (3/4) Epoch 28, batch 1000, loss[loss=0.1809, simple_loss=0.2654, pruned_loss=0.04823, over 24005.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.245, pruned_loss=0.04514, over 4678201.24 frames. ], batch size: 86, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:37:32,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:37:32,935 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:37:36,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 17:37:36,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:37:43,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:37:43,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=962846.6666666666, ans=0.1 2023-10-02 17:37:44,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 17:37:44,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 17:37:47,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.89 vs. limit=10.0 2023-10-02 17:37:48,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:37:48,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:37:49,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:37:52,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 17:37:52,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=962913.3333333334, ans=0.125 2023-10-02 17:37:55,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=962913.3333333334, ans=0.0 2023-10-02 17:37:55,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=962913.3333333334, ans=0.125 2023-10-02 17:37:55,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=962913.3333333334, ans=0.125 2023-10-02 17:37:56,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 17:37:59,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 17:37:59,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:38:01,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 17:38:01,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.34 vs. limit=10.0 2023-10-02 17:38:03,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 17:38:04,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 17:38:05,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:38:06,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:13,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:38:13,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:38:15,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:17,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:38:17,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 17:38:17,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:38:18,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:38:18,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:38:19,924 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 17:38:23,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 17:38:24,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 17:38:25,896 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.974e+02 2.112e+02 2.590e+02 4.842e+02, threshold=4.225e+02, percent-clipped=1.0 2023-10-02 17:38:25,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 17:38:26,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=963046.6666666666, ans=0.125 2023-10-02 17:38:27,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:38:34,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:34,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:38:34,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:35,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:38:38,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 17:38:38,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=963113.3333333334, ans=0.0 2023-10-02 17:38:39,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:38:39,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 17:38:39,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 17:38:42,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:38:42,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:38:43,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:38:44,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=963113.3333333334, ans=0.125 2023-10-02 17:38:46,986 INFO [train.py:1046] (3/4) Epoch 28, batch 1050, loss[loss=0.1676, simple_loss=0.2371, pruned_loss=0.04904, over 23887.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2422, pruned_loss=0.04459, over 4667694.78 frames. ], batch size: 195, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:38:47,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:38:48,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=963180.0, ans=0.09899494936611666 2023-10-02 17:38:49,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:38:51,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:38:53,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:38:54,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:38:54,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:38:57,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:39:00,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:39:01,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:39:03,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:39:03,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:39:03,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:39:05,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:39:06,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 17:39:07,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:39:08,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=963246.6666666666, ans=0.125 2023-10-02 17:39:09,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 17:39:10,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:39:10,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 17:39:10,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:39:15,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:39:16,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:39:16,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:39:19,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 17:39:19,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 17:39:19,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:39:24,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 17:39:25,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=963313.3333333334, ans=0.125 2023-10-02 17:39:26,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 17:39:28,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:39:30,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=963380.0, ans=0.125 2023-10-02 17:39:31,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 17:39:33,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 17:39:35,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:39:35,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:39:39,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:39:41,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 17:39:43,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 17:39:44,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 17:39:44,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:39:44,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:39:44,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=963446.6666666666, ans=0.0 2023-10-02 17:39:46,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 17:39:49,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:39:51,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:39:51,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:39:52,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:39:52,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:39:57,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:39:57,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 17:39:58,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 17:39:58,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 17:39:58,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 17:40:00,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:40:01,638 INFO [train.py:1046] (3/4) Epoch 28, batch 1100, loss[loss=0.1669, simple_loss=0.2445, pruned_loss=0.04464, over 23378.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2423, pruned_loss=0.0442, over 4679937.75 frames. ], batch size: 105, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:40:01,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=963513.3333333334, ans=0.125 2023-10-02 17:40:03,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:40:09,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:40:13,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:40:15,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:40:15,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:40:16,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 17:40:16,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:40:18,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 17:40:21,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:40:22,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:40:23,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 17:40:24,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 17:40:25,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:40:25,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:40:28,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:40:30,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 17:40:33,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:40:38,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 17:40:39,474 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 17:40:39,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:40:40,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:40:42,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:40:42,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:40:43,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 17:40:45,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:40:45,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:40:45,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:40:46,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:40:46,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 17:40:51,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:40:51,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 17:40:52,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=963713.3333333334, ans=0.1 2023-10-02 17:40:54,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:40:56,022 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.777e+02 1.908e+02 2.106e+02 3.203e+02, threshold=3.817e+02, percent-clipped=0.0 2023-10-02 17:40:59,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:41:02,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 17:41:02,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 17:41:03,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:41:06,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:41:06,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:41:08,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 17:41:08,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:41:08,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:41:09,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 17:41:09,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:41:10,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 17:41:12,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:41:13,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:41:13,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=963780.0, ans=0.0 2023-10-02 17:41:14,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:41:16,127 INFO [train.py:1046] (3/4) Epoch 28, batch 1150, loss[loss=0.1713, simple_loss=0.2537, pruned_loss=0.04448, over 23854.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2434, pruned_loss=0.04442, over 4683312.29 frames. ], batch size: 86, lr: 3.67e-03, grad_scale: 8.0 2023-10-02 17:41:17,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:41:20,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:41:22,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:41:22,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:41:22,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 17:41:22,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:41:25,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 17:41:27,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:41:27,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:41:30,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 17:41:31,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:41:33,559 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.16 vs. limit=15.0 2023-10-02 17:41:34,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:41:36,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:41:36,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 17:41:36,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:41:36,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:41:40,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 17:41:41,036 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.94 vs. limit=22.5 2023-10-02 17:41:41,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:41:43,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:41:52,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:41:58,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:41:58,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 17:42:00,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:00,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:08,713 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 17:42:08,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:15,731 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 17:42:15,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=964113.3333333334, ans=0.0 2023-10-02 17:42:21,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:42:22,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:42:22,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:42:22,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:42:27,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:42:30,693 INFO [train.py:1046] (3/4) Epoch 28, batch 1200, loss[loss=0.165, simple_loss=0.2331, pruned_loss=0.04841, over 23753.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2441, pruned_loss=0.04422, over 4695003.42 frames. ], batch size: 164, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:42:32,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:42:32,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:42:34,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:42:34,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:42:34,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:42:37,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:42:39,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:42:40,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:42:40,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:42:43,523 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 17:42:44,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 17:42:47,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:42:50,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:42:52,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:42:53,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:42:53,834 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 17:42:55,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:43:01,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 17:43:01,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:43:01,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 17:43:03,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:43:08,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 17:43:14,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 17:43:14,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:43:15,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:43:17,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:43:17,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:43:18,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:43:18,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:43:18,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:43:20,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 17:43:20,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:43:21,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:43:21,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:43:22,958 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.818e+02 2.090e+02 2.388e+02 3.203e+02, threshold=4.180e+02, percent-clipped=0.0 2023-10-02 17:43:23,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:43:23,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:43:27,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:43:29,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=964446.6666666666, ans=0.1 2023-10-02 17:43:30,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:43:33,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 17:43:36,604 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 17:43:39,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:43:42,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:43:43,965 INFO [train.py:1046] (3/4) Epoch 28, batch 1250, loss[loss=0.178, simple_loss=0.2493, pruned_loss=0.05335, over 23730.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2452, pruned_loss=0.04456, over 4712323.53 frames. ], batch size: 179, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:43:44,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:43:45,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:43:45,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=964513.3333333334, ans=0.2 2023-10-02 17:43:46,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 17:43:47,355 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.96 vs. limit=22.5 2023-10-02 17:43:51,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:43:52,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:43:53,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 17:43:57,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:43:57,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:44:01,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 17:44:01,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:44:03,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:44:03,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:44:03,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=964580.0, ans=0.0 2023-10-02 17:44:04,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:44:10,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 17:44:10,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:44:10,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:44:11,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:44:12,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:14,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:44:16,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:44:21,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 17:44:21,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:44:22,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:44:23,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 17:44:23,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:44:23,951 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 17:44:23,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:23,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:26,668 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.52 vs. limit=15.0 2023-10-02 17:44:28,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:44:33,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:44:33,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:44:35,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 17:44:35,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 17:44:35,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 17:44:38,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:44:39,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 17:44:39,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:44:42,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 17:44:42,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:44:44,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.36 vs. limit=15.0 2023-10-02 17:44:44,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 17:44:44,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 17:44:46,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:44:46,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 17:44:48,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:44:48,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 17:44:52,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:44:53,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:44:54,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:44:55,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=964780.0, ans=0.1 2023-10-02 17:44:56,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 17:44:57,482 INFO [train.py:1046] (3/4) Epoch 28, batch 1300, loss[loss=0.1556, simple_loss=0.2371, pruned_loss=0.03706, over 24472.00 frames. ], tot_loss[loss=0.1686, simple_loss=0.2464, pruned_loss=0.0454, over 4717078.13 frames. ], batch size: 63, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:44:59,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:44:59,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 17:45:02,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:45:04,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 17:45:05,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:45:07,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:45:08,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:45:10,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 17:45:10,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=964846.6666666666, ans=0.2 2023-10-02 17:45:16,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:45:17,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:45:18,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 17:45:21,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:45:26,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:45:27,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:45:28,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:45:30,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:45:30,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:45:32,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 17:45:32,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 17:45:32,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=964980.0, ans=0.125 2023-10-02 17:45:38,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:45:38,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 17:45:39,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 17:45:39,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 17:45:41,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:45:42,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:45:43,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 17:45:45,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:45:45,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 17:45:47,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:45:50,982 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.887e+02 2.082e+02 2.300e+02 3.182e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-02 17:45:51,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:45:51,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:45:54,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 17:45:55,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 17:45:55,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 17:45:59,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:46:03,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 17:46:04,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:46:09,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=965113.3333333334, ans=0.125 2023-10-02 17:46:12,135 INFO [train.py:1046] (3/4) Epoch 28, batch 1350, loss[loss=0.1676, simple_loss=0.2542, pruned_loss=0.04054, over 24447.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2448, pruned_loss=0.04501, over 4709765.92 frames. ], batch size: 69, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:46:13,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 17:46:16,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:46:19,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:46:22,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:46:22,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:46:25,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:46:25,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:46:27,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:46:29,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 17:46:31,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:46:31,489 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:46:32,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:46:32,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=965246.6666666666, ans=0.05 2023-10-02 17:46:34,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 17:46:34,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:46:36,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:46:36,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 17:46:37,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=965246.6666666666, ans=0.125 2023-10-02 17:46:38,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 17:46:40,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 17:46:42,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:46:42,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 17:46:48,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=965313.3333333334, ans=0.125 2023-10-02 17:46:55,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:46:55,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=965380.0, ans=0.0 2023-10-02 17:47:03,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:47:03,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:47:04,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 17:47:07,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:47:07,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 17:47:07,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 17:47:08,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:47:11,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:47:13,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 17:47:14,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:47:19,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 17:47:19,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=965446.6666666666, ans=0.125 2023-10-02 17:47:20,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 17:47:25,901 INFO [train.py:1046] (3/4) Epoch 28, batch 1400, loss[loss=0.1647, simple_loss=0.2274, pruned_loss=0.05098, over 23590.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2439, pruned_loss=0.04453, over 4712597.72 frames. ], batch size: 256, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:47:26,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 17:47:27,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:47:30,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:47:31,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:47:32,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=965513.3333333334, ans=0.0 2023-10-02 17:47:35,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=965513.3333333334, ans=0.125 2023-10-02 17:47:36,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 17:47:37,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 17:47:39,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=965580.0, ans=0.125 2023-10-02 17:47:46,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:47:49,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:47:51,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:47:51,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 17:47:54,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:47:55,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 17:48:00,265 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.51 vs. limit=12.0 2023-10-02 17:48:06,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:07,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:11,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 17:48:11,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=965713.3333333334, ans=0.125 2023-10-02 17:48:12,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:48:12,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:48:12,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:48:13,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:48:15,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:48:15,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:48:15,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:48:15,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 17:48:16,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:48:16,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=965713.3333333334, ans=0.0 2023-10-02 17:48:19,942 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.826e+02 2.070e+02 2.525e+02 5.054e+02, threshold=4.140e+02, percent-clipped=1.0 2023-10-02 17:48:20,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:24,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:48:31,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 17:48:33,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 17:48:33,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:48:36,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 17:48:36,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:48:39,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:48:40,603 INFO [train.py:1046] (3/4) Epoch 28, batch 1450, loss[loss=0.1632, simple_loss=0.2563, pruned_loss=0.03504, over 24398.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2438, pruned_loss=0.0446, over 4717069.25 frames. ], batch size: 69, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:48:43,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:48:45,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:48:45,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:45,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 17:48:46,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.70 vs. limit=15.0 2023-10-02 17:48:51,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:48:51,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 17:48:52,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:48:54,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 17:48:54,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 17:48:55,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 17:48:57,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:48:58,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:48:58,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 17:48:58,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=965913.3333333334, ans=0.125 2023-10-02 17:48:59,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:48:59,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:48:59,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 17:48:59,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:49:01,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:49:03,449 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.71 vs. limit=6.0 2023-10-02 17:49:04,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:49:06,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:49:07,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=965913.3333333334, ans=0.125 2023-10-02 17:49:10,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:49:10,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:49:12,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:49:13,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:49:14,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:49:14,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:49:14,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:49:15,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=965980.0, ans=0.2 2023-10-02 17:49:16,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:49:20,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 17:49:23,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:49:27,812 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 17:49:27,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:49:29,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:49:30,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:49:30,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 17:49:34,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:49:34,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 17:49:36,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 17:49:36,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=966046.6666666666, ans=0.2 2023-10-02 17:49:38,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:49:42,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:49:42,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:49:44,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 17:49:46,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 17:49:46,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 17:49:48,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:49:48,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 17:49:54,980 INFO [train.py:1046] (3/4) Epoch 28, batch 1500, loss[loss=0.1855, simple_loss=0.2548, pruned_loss=0.0581, over 23600.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.244, pruned_loss=0.04473, over 4713023.81 frames. ], batch size: 285, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:50:00,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 17:50:00,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:50:00,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:50:00,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:50:01,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:50:01,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 17:50:03,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 17:50:05,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 17:50:05,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:50:05,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:50:06,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:50:09,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:50:09,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:50:12,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=966246.6666666666, ans=0.0 2023-10-02 17:50:13,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:50:13,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 17:50:15,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:50:15,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:50:17,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:50:19,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=966246.6666666666, ans=0.1 2023-10-02 17:50:20,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 17:50:24,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 17:50:26,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:50:26,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 17:50:30,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 17:50:31,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:50:31,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:50:31,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:50:33,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 17:50:35,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:50:35,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:50:36,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 17:50:36,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:50:42,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:50:42,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 17:50:46,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 17:50:47,939 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.911e+02 2.171e+02 2.439e+02 3.664e+02, threshold=4.341e+02, percent-clipped=0.0 2023-10-02 17:50:48,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:50:51,398 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 17:50:52,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:50:52,784 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 17:50:54,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:50:55,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:50:56,022 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 17:50:57,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 17:50:58,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 17:51:00,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:51:01,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:51:01,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:51:03,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:51:03,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:51:03,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=966446.6666666666, ans=0.125 2023-10-02 17:51:04,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 17:51:06,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 17:51:07,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 17:51:07,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:51:07,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 17:51:08,841 INFO [train.py:1046] (3/4) Epoch 28, batch 1550, loss[loss=0.1667, simple_loss=0.2463, pruned_loss=0.04355, over 24654.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2449, pruned_loss=0.04512, over 4716440.83 frames. ], batch size: 65, lr: 3.67e-03, grad_scale: 16.0 2023-10-02 17:51:10,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 17:51:11,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:51:13,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:51:13,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:51:14,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:51:15,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:51:17,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:51:21,131 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 17:51:21,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:51:21,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:51:22,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 17:51:23,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 17:51:23,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 17:51:25,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:51:26,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 17:51:27,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 17:51:27,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 17:51:27,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:51:30,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:51:34,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:51:36,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 17:51:36,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 17:51:43,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:51:44,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=966646.6666666666, ans=0.0 2023-10-02 17:51:46,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:51:46,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 17:51:46,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:51:47,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 17:51:52,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 17:51:55,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:51:58,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:52:01,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:52:01,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:52:02,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 17:52:02,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:52:04,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:52:04,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:52:05,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 17:52:05,497 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 17:52:08,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:52:10,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.84 vs. limit=15.0 2023-10-02 17:52:14,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 17:52:19,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:52:20,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:52:20,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 17:52:22,670 INFO [train.py:1046] (3/4) Epoch 28, batch 1600, loss[loss=0.1692, simple_loss=0.255, pruned_loss=0.0417, over 24650.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2448, pruned_loss=0.04456, over 4723604.58 frames. ], batch size: 68, lr: 3.67e-03, grad_scale: 32.0 2023-10-02 17:52:22,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=966846.6666666666, ans=10.0 2023-10-02 17:52:24,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:52:25,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:52:25,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:52:25,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:52:26,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:52:30,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:52:32,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 17:52:32,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 17:52:33,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 17:52:34,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:52:36,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 17:52:36,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:52:40,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:52:44,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:52:46,883 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.70 vs. limit=10.0 2023-10-02 17:52:47,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 17:52:48,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:52:50,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 17:52:50,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:52:51,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 17:52:55,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=966980.0, ans=0.0 2023-10-02 17:52:56,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 17:53:05,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:53:05,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 17:53:06,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:53:06,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:53:06,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:53:07,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=967046.6666666666, ans=0.125 2023-10-02 17:53:09,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 17:53:09,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=967046.6666666666, ans=0.1 2023-10-02 17:53:12,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 17:53:14,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:53:15,488 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.365e+02 1.816e+02 2.018e+02 2.318e+02 3.311e+02, threshold=4.036e+02, percent-clipped=0.0 2023-10-02 17:53:15,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:16,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:16,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 17:53:18,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:53:19,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 17:53:20,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 17:53:28,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:28,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:53:32,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 17:53:32,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:53:32,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 17:53:36,104 INFO [train.py:1046] (3/4) Epoch 28, batch 1650, loss[loss=0.1734, simple_loss=0.2523, pruned_loss=0.04724, over 23429.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2453, pruned_loss=0.04469, over 4717053.38 frames. ], batch size: 93, lr: 3.67e-03, grad_scale: 32.0 2023-10-02 17:53:37,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:53:38,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:53:39,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:53:39,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 17:53:40,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 17:53:40,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 17:53:40,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 17:53:43,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:53:43,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:53:43,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:53:43,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=967180.0, ans=0.125 2023-10-02 17:53:45,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:53:47,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:53:49,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 17:53:51,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:53:51,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:53:51,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:53:51,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:53:53,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 17:53:53,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 17:53:54,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=967246.6666666666, ans=0.125 2023-10-02 17:54:00,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 17:54:03,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 17:54:09,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 17:54:11,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:13,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 17:54:16,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:54:19,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:54:19,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:54:19,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:54:20,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:54:20,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:22,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=967380.0, ans=0.05 2023-10-02 17:54:23,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:54:25,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:25,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:54:26,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:54:28,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:54:28,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 17:54:31,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 17:54:33,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 17:54:34,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:54:34,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 17:54:36,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 17:54:37,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 17:54:37,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:54:37,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:54:38,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:54:38,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:54:38,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 17:54:42,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:54:44,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:54:44,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:54:46,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 17:54:48,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=967513.3333333334, ans=0.125 2023-10-02 17:54:49,267 INFO [train.py:1046] (3/4) Epoch 28, batch 1700, loss[loss=0.1406, simple_loss=0.1989, pruned_loss=0.04114, over 23488.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2447, pruned_loss=0.04462, over 4691963.64 frames. ], batch size: 285, lr: 3.67e-03, grad_scale: 32.0 2023-10-02 17:54:50,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:54:50,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:54:50,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 17:54:50,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:54:52,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:54:52,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:54:53,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 17:54:53,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:54:53,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 17:54:55,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=967513.3333333334, ans=0.05 2023-10-02 17:54:57,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 17:55:04,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:55:07,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 17:55:11,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=15.0 2023-10-02 17:55:13,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:55:13,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:55:14,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:55:14,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:55:17,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 17:55:20,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:55:20,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:21,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 17:55:25,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 17:55:25,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 17:55:26,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 17:55:28,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:29,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 17:55:31,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:55:31,494 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:55:34,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=967713.3333333334, ans=0.04949747468305833 2023-10-02 17:55:38,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:55:38,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:55:39,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=967713.3333333334, ans=0.125 2023-10-02 17:55:40,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:55:41,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 17:55:41,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 17:55:42,882 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.838e+02 2.056e+02 2.244e+02 3.312e+02, threshold=4.111e+02, percent-clipped=0.0 2023-10-02 17:55:42,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:55:44,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:44,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 17:55:46,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:55:46,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:55:46,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:55:46,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:55:47,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:55:47,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 17:55:49,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:55:50,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:55:50,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:55:54,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:55:56,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 17:55:58,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:55:59,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:55:59,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=967780.0, ans=0.2 2023-10-02 17:56:01,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 17:56:03,353 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 17:56:03,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.77 vs. limit=15.0 2023-10-02 17:56:04,365 INFO [train.py:1046] (3/4) Epoch 28, batch 1750, loss[loss=0.1751, simple_loss=0.2595, pruned_loss=0.04535, over 24078.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2438, pruned_loss=0.0443, over 4694825.69 frames. ], batch size: 80, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:56:05,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:09,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:56:09,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 17:56:09,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 17:56:09,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:56:14,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:56:14,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:17,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 17:56:18,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=967913.3333333334, ans=0.0 2023-10-02 17:56:19,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:56:21,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 17:56:21,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:56:23,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:56:24,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=967913.3333333334, ans=0.125 2023-10-02 17:56:25,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 17:56:27,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 17:56:29,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:56:29,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 17:56:38,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:56:39,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:56:39,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:56:42,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:42,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:56:45,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:56:48,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:56:49,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:56:49,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:56:51,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 17:56:53,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 17:56:54,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=968046.6666666666, ans=0.2 2023-10-02 17:56:56,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 17:56:57,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:56:59,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:56:59,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 17:56:59,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=968046.6666666666, ans=0.0 2023-10-02 17:56:59,484 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.544e-03 2023-10-02 17:57:03,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:57:04,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 17:57:06,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:57:07,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:57:11,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:57:14,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:57:16,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 17:57:16,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 17:57:16,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:57:16,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=968180.0, ans=0.125 2023-10-02 17:57:17,428 INFO [train.py:1046] (3/4) Epoch 28, batch 1800, loss[loss=0.1712, simple_loss=0.2582, pruned_loss=0.04211, over 24116.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2441, pruned_loss=0.04396, over 4711851.79 frames. ], batch size: 86, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:57:17,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 17:57:17,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:57:17,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 17:57:17,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 17:57:19,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 17:57:22,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 17:57:22,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=968180.0, ans=0.125 2023-10-02 17:57:24,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:57:25,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 17:57:28,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:57:31,639 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.29 vs. limit=15.0 2023-10-02 17:57:33,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 17:57:33,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:57:36,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:57:39,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:57:39,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:57:41,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 17:57:42,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 17:57:42,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 17:57:43,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:57:46,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:57:49,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 17:57:52,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 17:57:52,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 17:57:54,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:57:54,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:57:54,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:57:55,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 17:58:02,339 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 17:58:05,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:58:06,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:58:09,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 17:58:09,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 17:58:09,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 17:58:10,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 17:58:11,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 17:58:12,305 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.835e+02 1.992e+02 2.220e+02 3.215e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-02 17:58:14,405 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.84 vs. limit=12.0 2023-10-02 17:58:16,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 17:58:21,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:58:21,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 17:58:22,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:58:22,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:58:23,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 17:58:23,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 17:58:27,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 17:58:27,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:58:30,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 17:58:30,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:58:32,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:58:33,430 INFO [train.py:1046] (3/4) Epoch 28, batch 1850, loss[loss=0.1611, simple_loss=0.2369, pruned_loss=0.04265, over 23561.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2442, pruned_loss=0.04343, over 4721109.24 frames. ], batch size: 149, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:58:33,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 17:58:33,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:58:33,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=968513.3333333334, ans=0.0 2023-10-02 17:58:34,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:58:34,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 17:58:36,943 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.85 vs. limit=12.0 2023-10-02 17:58:37,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 17:58:37,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:58:40,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 17:58:42,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:58:46,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 17:58:46,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 17:58:51,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 17:58:53,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 17:58:58,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:58:58,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 17:58:58,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 17:59:05,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=968646.6666666666, ans=0.125 2023-10-02 17:59:08,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 17:59:08,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 17:59:12,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:59:12,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:59:15,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=968713.3333333334, ans=0.125 2023-10-02 17:59:16,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 17:59:16,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:17,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=968713.3333333334, ans=0.1 2023-10-02 17:59:18,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 17:59:19,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 17:59:21,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 17:59:23,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 17:59:26,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 17:59:27,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:27,882 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.40 vs. limit=12.0 2023-10-02 17:59:28,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 17:59:28,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:59:30,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:59:31,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 17:59:36,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 17:59:36,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 17:59:39,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 17:59:40,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 17:59:40,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 17:59:40,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 17:59:42,296 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 17:59:42,371 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 17:59:45,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 17:59:45,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 17:59:45,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 17:59:45,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:45,180 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 17:59:46,913 INFO [train.py:1046] (3/4) Epoch 28, batch 1900, loss[loss=0.1552, simple_loss=0.2304, pruned_loss=0.04003, over 24434.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2446, pruned_loss=0.0438, over 4721082.90 frames. ], batch size: 58, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 17:59:46,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 17:59:47,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:48,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 17:59:49,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 17:59:51,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 17:59:51,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 17:59:51,531 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.28 vs. limit=12.0 2023-10-02 17:59:53,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 17:59:53,717 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 17:59:53,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 17:59:55,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:59:59,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 17:59:59,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=968913.3333333334, ans=0.0 2023-10-02 18:00:01,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:00:02,900 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 18:00:04,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 18:00:06,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:00:06,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:00:06,155 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 18:00:07,533 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 18:00:11,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 18:00:13,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:00:18,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 18:00:19,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 18:00:21,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=968980.0, ans=0.0 2023-10-02 18:00:22,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=968980.0, ans=0.0 2023-10-02 18:00:22,664 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.54 vs. limit=15.0 2023-10-02 18:00:26,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 18:00:26,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=968980.0, ans=0.5 2023-10-02 18:00:29,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 18:00:29,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:00:29,365 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 18:00:29,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 18:00:31,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 18:00:31,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 18:00:31,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:00:35,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 18:00:37,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:00:40,419 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.863e+02 2.033e+02 2.281e+02 3.695e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-02 18:00:41,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:00:41,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 18:00:44,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:00:47,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 18:00:47,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:00:52,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:00:52,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:00:52,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:00:53,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:00:54,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:00:54,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:00:56,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:00:58,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:00:58,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:01:00,734 INFO [train.py:1046] (3/4) Epoch 28, batch 1950, loss[loss=0.1693, simple_loss=0.2466, pruned_loss=0.04599, over 23471.00 frames. ], tot_loss[loss=0.1673, simple_loss=0.2454, pruned_loss=0.04458, over 4715844.57 frames. ], batch size: 134, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 18:01:00,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:01:00,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:01:02,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:01:03,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:01:05,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:01:05,865 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.40 vs. limit=15.0 2023-10-02 18:01:07,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=969180.0, ans=0.2 2023-10-02 18:01:08,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:01:08,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:08,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:01:11,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 18:01:11,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 18:01:11,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:12,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:14,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=969246.6666666666, ans=0.1 2023-10-02 18:01:15,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:01:15,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:01:15,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:17,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:01:20,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:01:20,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:01:20,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:01:21,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:25,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:27,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=969246.6666666666, ans=0.0 2023-10-02 18:01:28,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:01:28,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:01:30,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:01:30,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 18:01:32,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:01:32,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:01:32,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:01:36,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:01:38,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:01:43,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:01:47,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:01:47,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:01:49,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 18:01:49,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:01:53,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:01:53,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:01:54,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:02:01,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:03,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:05,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:06,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:02:09,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:02:09,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:02:09,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 18:02:11,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:02:12,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:02:12,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 18:02:15,243 INFO [train.py:1046] (3/4) Epoch 28, batch 2000, loss[loss=0.1715, simple_loss=0.2509, pruned_loss=0.04606, over 23405.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2464, pruned_loss=0.04509, over 4706323.02 frames. ], batch size: 93, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 18:02:15,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:02:18,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:02:19,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:02:19,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:02:21,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=969513.3333333334, ans=0.0 2023-10-02 18:02:22,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:02:23,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:25,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 18:02:25,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:02:29,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:02:30,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=969580.0, ans=0.1 2023-10-02 18:02:31,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 18:02:32,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:02:32,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:02:37,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:02:37,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 18:02:38,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:40,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:40,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:40,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 18:02:40,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:02:43,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 18:02:44,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:02:46,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=969646.6666666666, ans=0.125 2023-10-02 18:02:47,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:02:48,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 18:02:48,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:02:48,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:02:51,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:02:52,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 18:02:52,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=969646.6666666666, ans=0.07 2023-10-02 18:02:53,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 18:02:53,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:02:53,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:02:54,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=969646.6666666666, ans=0.2 2023-10-02 18:02:57,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:02:59,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:02:59,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:03:00,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:03:01,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:03:03,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:03:04,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:03:04,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:03:04,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=969713.3333333334, ans=0.05 2023-10-02 18:03:05,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:07,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:03:08,513 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.888e+02 2.044e+02 2.349e+02 3.109e+02, threshold=4.088e+02, percent-clipped=0.0 2023-10-02 18:03:08,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 18:03:12,062 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=7.40 vs. limit=12.0 2023-10-02 18:03:13,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:03:14,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:18,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:18,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:03:23,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:23,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=969780.0, ans=0.1 2023-10-02 18:03:26,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:03:26,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:26,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:03:26,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:03:28,970 INFO [train.py:1046] (3/4) Epoch 28, batch 2050, loss[loss=0.1442, simple_loss=0.2049, pruned_loss=0.04168, over 22633.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2448, pruned_loss=0.04477, over 4708122.85 frames. ], batch size: 322, lr: 3.66e-03, grad_scale: 32.0 2023-10-02 18:03:30,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:30,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:35,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:03:35,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:39,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:03:41,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:03:41,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:03:42,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:03:44,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 18:03:44,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:03:47,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:03:47,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:03:56,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:03:56,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:03:56,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=969913.3333333334, ans=0.1 2023-10-02 18:03:57,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 18:04:00,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:04:01,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 18:04:02,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:04:04,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:04:05,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=969980.0, ans=0.0 2023-10-02 18:04:07,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:04:08,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.85 vs. limit=15.0 2023-10-02 18:04:09,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:04:09,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:04:10,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:04:11,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:04:11,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:04:12,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=970046.6666666666, ans=0.125 2023-10-02 18:04:15,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:04:16,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.07 vs. limit=15.0 2023-10-02 18:04:17,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:04:20,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:04:20,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:04:24,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:04:29,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=970113.3333333334, ans=0.125 2023-10-02 18:04:30,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:04:30,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 18:04:34,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=970113.3333333334, ans=0.2 2023-10-02 18:04:37,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:04:38,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:04:39,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:04:40,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=970113.3333333334, ans=0.125 2023-10-02 18:04:41,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 18:04:42,481 INFO [train.py:1046] (3/4) Epoch 28, batch 2100, loss[loss=0.1487, simple_loss=0.2233, pruned_loss=0.03706, over 18534.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2431, pruned_loss=0.04419, over 4705987.15 frames. ], batch size: 40, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:04:44,402 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 18:04:44,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:04:44,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:04:45,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:04:47,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:04:47,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 18:04:47,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 18:04:48,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:04:51,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:04:52,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:04:54,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:04:55,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:04:55,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 18:04:56,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:04:56,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 18:04:56,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 18:04:58,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:00,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:05:00,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 18:05:00,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 18:05:03,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 18:05:03,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:05:06,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=970246.6666666666, ans=0.0 2023-10-02 18:05:08,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:05:09,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:05:09,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=970246.6666666666, ans=0.125 2023-10-02 18:05:12,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:05:13,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 18:05:13,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:13,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 18:05:15,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 18:05:15,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:15,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 18:05:15,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 18:05:15,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 18:05:18,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:05:20,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=970313.3333333334, ans=0.125 2023-10-02 18:05:21,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:05:22,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:05:22,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=970313.3333333334, ans=0.1 2023-10-02 18:05:24,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:05:24,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=970313.3333333334, ans=0.0 2023-10-02 18:05:25,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:26,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:26,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 18:05:26,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:26,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:29,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:29,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 18:05:30,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 18:05:30,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=970380.0, ans=0.125 2023-10-02 18:05:32,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 18:05:35,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:05:38,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:05:39,457 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.839e+02 2.053e+02 2.400e+02 3.677e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-02 18:05:39,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 18:05:44,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:47,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:05:47,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:05:47,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:05:48,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 18:05:48,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:05:49,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:05:49,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:05:49,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:05:49,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:05:51,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=970446.6666666666, ans=0.1 2023-10-02 18:05:52,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 18:05:54,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 18:05:54,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:05:56,854 INFO [train.py:1046] (3/4) Epoch 28, batch 2150, loss[loss=0.1591, simple_loss=0.2331, pruned_loss=0.04258, over 23308.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2423, pruned_loss=0.044, over 4708626.67 frames. ], batch size: 134, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:05:56,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:05:56,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:05:57,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:05:57,484 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.54 vs. limit=15.0 2023-10-02 18:05:58,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:06:04,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 18:06:06,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:06:08,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:10,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:06:10,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:10,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:06:15,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:16,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:06:16,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:06:19,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:20,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 18:06:23,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:06:25,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:06:25,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:25,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:06:26,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:26,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:06:26,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:06:26,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:06:27,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:06:29,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 18:06:31,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:06:33,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:33,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:06:34,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:06:37,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:06:39,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:06:40,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:06:40,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:06:40,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 18:06:42,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:06:45,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:06:45,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:47,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:06:48,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:06:49,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:06:49,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:49,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 18:06:51,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 18:06:52,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:06:52,466 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 18:06:52,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:06:52,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:06:53,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 18:06:53,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:06:53,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 18:06:54,022 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 18:06:54,023 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 18:06:55,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 18:06:56,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:06:58,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:06:58,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:06:58,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:06:59,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:07:00,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:07:01,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:07,789 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.77 vs. limit=12.0 2023-10-02 18:07:09,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:07:10,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 18:07:11,933 INFO [train.py:1046] (3/4) Epoch 28, batch 2200, loss[loss=0.192, simple_loss=0.264, pruned_loss=0.05998, over 23807.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2429, pruned_loss=0.04389, over 4719696.59 frames. ], batch size: 179, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:07:14,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:07:15,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=970846.6666666666, ans=0.125 2023-10-02 18:07:17,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:18,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:07:18,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=970846.6666666666, ans=0.1 2023-10-02 18:07:20,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:07:20,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:07:24,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:07:24,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:07:24,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 18:07:24,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=970913.3333333334, ans=0.0 2023-10-02 18:07:28,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 18:07:31,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:07:37,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 18:07:39,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:40,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:07:40,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:07:44,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:07:44,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 18:07:47,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:07:47,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=970980.0, ans=0.0 2023-10-02 18:07:48,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:07:49,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 18:07:53,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:07:54,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:07:55,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:07:57,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:07:59,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.03 vs. limit=15.0 2023-10-02 18:08:00,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 18:08:01,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:08:02,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 18:08:06,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:08:06,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:08:06,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:08:07,324 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.865e+02 2.059e+02 2.462e+02 3.335e+02, threshold=4.117e+02, percent-clipped=0.0 2023-10-02 18:08:08,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:08:08,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:08:10,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:08:10,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:08:10,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 18:08:12,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:08:14,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:08:16,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 18:08:17,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:08:20,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:08:20,479 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 18:08:23,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:08:24,654 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 18:08:25,921 INFO [train.py:1046] (3/4) Epoch 28, batch 2250, loss[loss=0.1758, simple_loss=0.2481, pruned_loss=0.05174, over 23456.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2439, pruned_loss=0.04431, over 4722024.42 frames. ], batch size: 285, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:08:25,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:08:26,030 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 18:08:27,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:08:28,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:08:28,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:08:30,412 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 18:08:31,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:08:32,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=971180.0, ans=0.125 2023-10-02 18:08:33,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:08:37,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:08:37,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:08:37,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=971180.0, ans=0.0 2023-10-02 18:08:39,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=971246.6666666666, ans=0.125 2023-10-02 18:08:40,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:08:42,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:08:44,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:08:45,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 18:08:46,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:08:46,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:08:48,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 18:08:49,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:08:49,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:08:52,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:08:52,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=971246.6666666666, ans=0.0 2023-10-02 18:08:54,734 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.15 vs. limit=15.0 2023-10-02 18:08:55,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:08:56,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:08:58,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:08:59,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 18:09:00,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:09:03,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:09:08,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:09:11,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:09:12,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:09:12,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:09:15,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:09:17,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:09:22,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:09:24,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:09:27,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:09:28,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:09:28,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:09:32,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:09:34,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:09:34,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 18:09:34,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:09:35,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:09:35,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=971446.6666666666, ans=0.125 2023-10-02 18:09:38,367 INFO [train.py:1046] (3/4) Epoch 28, batch 2300, loss[loss=0.1566, simple_loss=0.2431, pruned_loss=0.03506, over 24312.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2445, pruned_loss=0.0446, over 4728841.94 frames. ], batch size: 61, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:09:38,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 18:09:38,975 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.50 vs. limit=10.0 2023-10-02 18:09:41,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:09:41,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:09:46,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:09:47,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:09:48,374 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 18:09:51,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:09:55,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=971580.0, ans=0.125 2023-10-02 18:09:58,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:09:58,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 18:09:58,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:09:59,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:09:59,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 18:09:59,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:10:03,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:10:03,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:10:06,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:10:06,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=971646.6666666666, ans=0.1 2023-10-02 18:10:09,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:10:09,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=971646.6666666666, ans=0.95 2023-10-02 18:10:13,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:10:17,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:10:17,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:10:20,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:10:23,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:10:27,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:10:28,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:10:28,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:10:28,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 18:10:31,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=971713.3333333334, ans=0.0 2023-10-02 18:10:32,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:10:32,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:10:32,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:10:32,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:10:32,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:10:34,208 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.868e+02 2.077e+02 2.377e+02 3.384e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 18:10:34,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 18:10:34,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:10:34,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 18:10:34,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:10:34,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:10:35,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 18:10:39,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:10:43,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:10:46,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:10:47,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:10:47,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:10:50,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:10:50,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:10:51,975 INFO [train.py:1046] (3/4) Epoch 28, batch 2350, loss[loss=0.1699, simple_loss=0.246, pruned_loss=0.04687, over 23221.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2455, pruned_loss=0.0451, over 4720488.96 frames. ], batch size: 93, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:10:52,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:10:52,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 18:10:58,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:10:58,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 18:11:03,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 18:11:05,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:11:09,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:11:09,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:11:09,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:11:10,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:11:11,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 18:11:15,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:11:21,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 18:11:22,695 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.79 vs. limit=22.5 2023-10-02 18:11:23,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:11:25,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=971980.0, ans=0.04949747468305833 2023-10-02 18:11:26,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:11:26,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:11:29,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:11:30,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 18:11:30,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:11:31,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:11:31,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:11:31,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:11:37,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:11:37,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 18:11:37,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:11:41,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:11:41,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:11:43,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 18:11:45,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:11:48,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 18:11:48,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:11:52,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 18:11:54,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 18:11:55,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:11:55,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:11:55,647 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 18:11:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 18:11:59,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 18:12:01,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:12:03,859 INFO [train.py:1046] (3/4) Epoch 28, batch 2400, loss[loss=0.1616, simple_loss=0.2379, pruned_loss=0.04268, over 23398.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2455, pruned_loss=0.04471, over 4714367.80 frames. ], batch size: 119, lr: 3.66e-03, grad_scale: 16.0 2023-10-02 18:12:03,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:12:05,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=972180.0, ans=0.0 2023-10-02 18:12:08,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:12:08,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:12:10,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 18:12:10,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 18:12:11,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=972180.0, ans=0.125 2023-10-02 18:12:16,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=972180.0, ans=0.0 2023-10-02 18:12:16,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=972180.0, ans=0.2 2023-10-02 18:12:17,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:12:17,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:12:19,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 18:12:19,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:12:20,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:12:20,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 18:12:26,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:12:27,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 18:12:33,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:12:37,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 18:12:38,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=972313.3333333334, ans=0.2 2023-10-02 18:12:39,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:12:40,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:12:45,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:12:47,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 18:12:47,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:12:55,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:12:56,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:12:58,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:12:59,480 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.861e+02 2.059e+02 2.310e+02 3.271e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 18:12:59,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:12:59,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 18:12:59,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:12:59,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:13:00,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:13:00,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:13:03,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:13:04,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:13:04,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 18:13:05,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 18:13:08,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:13:08,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:13:08,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 18:13:08,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 18:13:08,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 18:13:08,808 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 18:13:08,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=972446.6666666666, ans=0.1 2023-10-02 18:13:09,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.83 vs. limit=15.0 2023-10-02 18:13:10,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 18:13:11,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:13:13,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:13:13,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:13:15,500 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 18:13:17,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:13:17,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:13:18,813 INFO [train.py:1046] (3/4) Epoch 28, batch 2450, loss[loss=0.1675, simple_loss=0.2552, pruned_loss=0.03987, over 24644.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2439, pruned_loss=0.04453, over 4696507.31 frames. ], batch size: 68, lr: 3.66e-03, grad_scale: 16.0 2023-10-02 18:13:21,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:13:21,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:13:27,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:13:27,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:13:27,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 18:13:34,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:13:34,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:13:37,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:13:37,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:13:37,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:13:38,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 18:13:42,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:13:44,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:13:46,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:13:46,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=972646.6666666666, ans=0.125 2023-10-02 18:13:49,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:13:50,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:13:50,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:13:50,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:13:54,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 18:13:55,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:14:02,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:14:03,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=972713.3333333334, ans=0.125 2023-10-02 18:14:05,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:14:05,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:14:05,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:14:05,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:14:06,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:14:07,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 18:14:10,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:14:10,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:14:14,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:14:14,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:14:17,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:14:17,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 18:14:19,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:14:19,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:14:19,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 18:14:20,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:14:21,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:14:25,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:14:28,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:14:29,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:14:32,113 INFO [train.py:1046] (3/4) Epoch 28, batch 2500, loss[loss=0.16, simple_loss=0.242, pruned_loss=0.03896, over 23865.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2424, pruned_loss=0.04409, over 4702646.11 frames. ], batch size: 86, lr: 3.66e-03, grad_scale: 8.0 2023-10-02 18:14:32,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 18:14:32,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:14:37,224 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.59 vs. limit=15.0 2023-10-02 18:14:37,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:14:48,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:14:48,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:14:49,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:14:49,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 18:14:57,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:14:57,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:14:57,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 18:14:57,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:14:58,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 18:14:59,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:01,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:15:01,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 18:15:01,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:02,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 18:15:02,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:15:06,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:15:08,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:15:09,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:15:11,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 18:15:13,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:15:14,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:17,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:15:20,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:15:20,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=973046.6666666666, ans=0.125 2023-10-02 18:15:21,468 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.62 vs. limit=8.0 2023-10-02 18:15:22,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=973046.6666666666, ans=0.1 2023-10-02 18:15:23,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:15:29,178 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.840e+02 2.003e+02 2.187e+02 4.032e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-02 18:15:30,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:15:33,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 18:15:33,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:15:33,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:15:34,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:15:34,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:15:36,401 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 18:15:36,401 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 18:15:36,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 18:15:38,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=973113.3333333334, ans=0.1 2023-10-02 18:15:39,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:15:40,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 18:15:40,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 18:15:40,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=973113.3333333334, ans=0.2 2023-10-02 18:15:41,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:15:41,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 18:15:45,236 INFO [train.py:1046] (3/4) Epoch 28, batch 2550, loss[loss=0.1751, simple_loss=0.2545, pruned_loss=0.0478, over 23369.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2431, pruned_loss=0.04387, over 4723296.94 frames. ], batch size: 93, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:15:45,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 18:15:47,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=973180.0, ans=0.125 2023-10-02 18:15:49,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:15:51,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:15:51,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:15:54,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:15:54,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 18:15:55,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:15:57,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=973180.0, ans=0.125 2023-10-02 18:16:00,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 18:16:00,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:16:03,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:06,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:16:06,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 18:16:06,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:16:06,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:16:06,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:16:08,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:16:08,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 18:16:08,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:16:08,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:09,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=973246.6666666666, ans=0.2 2023-10-02 18:16:10,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 18:16:23,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:16:29,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:16:29,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:29,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:16:30,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:16:34,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.65 vs. limit=15.0 2023-10-02 18:16:35,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=973380.0, ans=0.125 2023-10-02 18:16:37,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:16:40,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:16:40,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:16:40,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:16:40,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 18:16:42,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:16:44,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:16:44,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:45,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=973446.6666666666, ans=0.0 2023-10-02 18:16:51,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:16:51,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 18:16:51,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:16:51,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:16:53,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:16:53,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:16:54,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.60 vs. limit=22.5 2023-10-02 18:16:54,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=973446.6666666666, ans=22.5 2023-10-02 18:16:54,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:16:58,691 INFO [train.py:1046] (3/4) Epoch 28, batch 2600, loss[loss=0.1632, simple_loss=0.2522, pruned_loss=0.03709, over 24422.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2441, pruned_loss=0.04388, over 4723630.64 frames. ], batch size: 69, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:17:00,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:17:02,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:17:02,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=973513.3333333334, ans=0.04949747468305833 2023-10-02 18:17:04,323 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 18:17:07,694 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 18:17:07,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:17:07,734 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 18:17:09,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 18:17:10,491 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 18:17:11,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:17:11,955 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 18:17:13,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 18:17:14,811 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 18:17:16,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:17:16,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 18:17:17,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 18:17:20,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:17:20,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 18:17:22,668 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 18:17:22,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 18:17:31,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:17:31,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:17:31,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:17:31,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 18:17:33,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:17:39,081 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.13 vs. limit=22.5 2023-10-02 18:17:39,870 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 18:17:44,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:17:44,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:17:45,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 18:17:45,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:17:45,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:17:46,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 18:17:50,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:17:50,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:17:53,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:17:56,167 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.832e+02 1.959e+02 2.190e+02 4.032e+02, threshold=3.917e+02, percent-clipped=2.0 2023-10-02 18:17:56,279 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 18:17:56,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:17:56,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:18:01,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:18:01,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:18:01,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 18:18:02,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:18:03,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:18:03,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=973780.0, ans=0.125 2023-10-02 18:18:04,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:18:05,460 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.56 vs. limit=15.0 2023-10-02 18:18:11,960 INFO [train.py:1046] (3/4) Epoch 28, batch 2650, loss[loss=0.1535, simple_loss=0.2288, pruned_loss=0.03911, over 22183.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2449, pruned_loss=0.04421, over 4723852.33 frames. ], batch size: 48, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:18:12,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 18:18:13,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:18:14,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:18:17,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 18:18:17,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:18:20,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:18:20,893 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 18:18:20,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:18:22,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:18:26,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:18:28,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:18:30,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:18:32,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 18:18:33,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:18:33,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:18:35,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 18:18:36,711 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 18:18:39,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:18:41,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 18:18:41,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:18:41,719 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:18:42,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 18:18:43,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=973980.0, ans=10.0 2023-10-02 18:18:43,520 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.12 vs. limit=15.0 2023-10-02 18:18:45,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:18:45,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:18:46,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:18:47,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:18:49,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 18:18:49,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 18:18:53,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:18:56,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 18:18:56,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:18:56,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:18:57,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:18:57,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:18:59,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:19:00,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:19:02,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:19:02,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:19:02,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:19:03,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:19:04,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:04,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:19:06,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:07,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:19:09,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:19:13,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:13,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:19:13,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:14,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 18:19:18,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:19:20,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:20,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=974113.3333333334, ans=0.0 2023-10-02 18:19:21,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:23,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:24,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:19:24,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:26,037 INFO [train.py:1046] (3/4) Epoch 28, batch 2700, loss[loss=0.1703, simple_loss=0.2444, pruned_loss=0.0481, over 23533.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2457, pruned_loss=0.04414, over 4722382.29 frames. ], batch size: 134, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:19:26,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:19:26,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 18:19:29,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:19:31,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 18:19:33,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:19:33,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:33,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:19:36,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:19:36,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:19:36,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:19:36,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:19:36,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 18:19:37,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:19:38,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:19:40,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:19:40,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:19:43,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:19:43,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 18:19:45,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:19:49,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:19:49,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:19:56,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:19:57,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:19:57,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:19:57,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:19:57,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=974313.3333333334, ans=0.07 2023-10-02 18:20:01,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:20:04,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:20:04,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:20:04,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:20:08,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:20:08,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:20:14,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=974380.0, ans=0.125 2023-10-02 18:20:15,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:20:15,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:20:20,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:20:20,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:23,620 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.829e+02 1.997e+02 2.292e+02 3.269e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-02 18:20:23,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:20:25,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:20:26,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:20:28,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:30,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:20:30,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:20:31,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:20:32,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:20:32,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:20:36,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 18:20:36,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:39,692 INFO [train.py:1046] (3/4) Epoch 28, batch 2750, loss[loss=0.1601, simple_loss=0.2549, pruned_loss=0.03261, over 24322.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.245, pruned_loss=0.0436, over 4715899.82 frames. ], batch size: 74, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:20:39,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:20:39,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 18:20:41,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 18:20:41,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:42,786 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:20:45,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:20:45,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:20:47,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:48,383 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.04 vs. limit=15.0 2023-10-02 18:20:49,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:20:49,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:51,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:20:53,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:20:53,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:20:53,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:20:53,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 18:20:53,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:20:53,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:20:55,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=974580.0, ans=0.0 2023-10-02 18:21:00,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 18:21:02,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:21:02,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:21:02,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:21:04,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:21:05,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:21:06,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:21:06,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:21:07,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=974580.0, ans=0.0 2023-10-02 18:21:08,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:21:08,550 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:21:08,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=974646.6666666666, ans=0.04949747468305833 2023-10-02 18:21:11,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:21:12,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:21:12,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:21:12,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:21:14,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:21:20,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:21:22,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:21:22,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:21:25,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:21:25,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:21:26,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:21:31,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:21:33,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:21:33,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 18:21:37,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:21:38,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 18:21:44,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 18:21:46,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:21:46,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 18:21:49,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:21:51,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:21:51,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 18:21:51,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=974780.0, ans=0.0 2023-10-02 18:21:52,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:21:55,385 INFO [train.py:1046] (3/4) Epoch 28, batch 2800, loss[loss=0.1613, simple_loss=0.2324, pruned_loss=0.04506, over 23693.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2434, pruned_loss=0.04323, over 4721461.69 frames. ], batch size: 232, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:21:55,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 18:21:56,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:21:56,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:21:56,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 18:21:56,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:21:58,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:21:59,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:21:59,718 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 18:21:59,719 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 18:22:03,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:22:05,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:22:05,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:22:07,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:22:10,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 18:22:11,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 18:22:11,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 18:22:14,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:22:16,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:22:16,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:22:19,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:22:19,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:22:19,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:22:20,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:22:28,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:22:29,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:22:30,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=974980.0, ans=0.1 2023-10-02 18:22:31,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:22:31,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:22:32,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:22:37,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:22:37,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 18:22:37,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:22:38,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:22:38,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:22:42,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:22:43,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:22:44,057 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.65 vs. limit=12.0 2023-10-02 18:22:46,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:22:47,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:22:48,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:22:48,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:22:49,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:22:49,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:22:49,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=975046.6666666666, ans=0.125 2023-10-02 18:22:50,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:22:50,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 18:22:50,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:22:51,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.56 vs. limit=15.0 2023-10-02 18:22:52,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:22:52,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:22:52,833 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.19 vs. limit=15.0 2023-10-02 18:22:53,778 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.840e+02 2.003e+02 2.153e+02 3.195e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-02 18:22:53,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 18:22:53,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:22:55,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:22:55,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:22:56,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 18:23:02,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:23:02,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:23:02,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:23:05,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:23:09,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:23:09,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:23:09,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:23:10,545 INFO [train.py:1046] (3/4) Epoch 28, batch 2850, loss[loss=0.1743, simple_loss=0.2515, pruned_loss=0.0486, over 23174.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2427, pruned_loss=0.04344, over 4711288.71 frames. ], batch size: 119, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:23:12,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:23:12,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:23:13,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:23:14,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 18:23:20,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 18:23:20,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:23:22,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 18:23:24,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:23:27,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 18:23:28,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 18:23:30,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:23:40,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:23:42,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:23:42,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:23:43,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:23:43,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:23:43,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:23:47,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:23:47,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 18:23:48,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:23:50,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:23:50,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:23:50,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:23:52,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:23:54,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:23:54,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:23:58,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:23:59,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:23:59,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:24:00,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:02,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:24:05,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:24:06,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 18:24:06,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 18:24:09,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:24:09,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:24:10,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 18:24:11,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:24:11,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:24:13,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:24:13,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:24:13,553 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 18:24:13,585 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 18:24:13,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:24:14,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:19,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:24:21,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:24:21,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:24:22,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 18:24:23,881 INFO [train.py:1046] (3/4) Epoch 28, batch 2900, loss[loss=0.1463, simple_loss=0.2377, pruned_loss=0.02744, over 24498.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2428, pruned_loss=0.04374, over 4706843.08 frames. ], batch size: 63, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:24:27,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:24:27,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 18:24:28,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 18:24:30,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:24:30,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:24:32,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:24:33,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:24:37,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:24:37,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:24:40,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:24:40,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 18:24:41,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:24:41,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=975580.0, ans=0.0 2023-10-02 18:24:43,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:45,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 18:24:46,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 18:24:49,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:24:49,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 18:24:50,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:24:51,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:24:51,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 18:24:53,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:24:55,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:24:59,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:25:01,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:02,761 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.93 vs. limit=12.0 2023-10-02 18:25:03,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 18:25:03,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 18:25:03,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:25:04,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=975646.6666666666, ans=0.1 2023-10-02 18:25:06,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:25:07,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 18:25:08,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:25:09,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=975713.3333333334, ans=0.2 2023-10-02 18:25:13,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:25:15,233 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.82 vs. limit=22.5 2023-10-02 18:25:22,909 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.803e+02 1.962e+02 2.123e+02 3.494e+02, threshold=3.924e+02, percent-clipped=0.0 2023-10-02 18:25:22,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:25:23,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:25:24,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 18:25:26,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=975780.0, ans=0.0 2023-10-02 18:25:27,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:27,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 18:25:27,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:25:28,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=975780.0, ans=0.0 2023-10-02 18:25:29,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:25:31,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.07 vs. limit=22.5 2023-10-02 18:25:36,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:25:37,846 INFO [train.py:1046] (3/4) Epoch 28, batch 2950, loss[loss=0.1441, simple_loss=0.2223, pruned_loss=0.03299, over 24355.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2435, pruned_loss=0.04383, over 4703528.74 frames. ], batch size: 56, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:25:37,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 18:25:39,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:25:39,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:39,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=975846.6666666666, ans=0.125 2023-10-02 18:25:40,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:25:43,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:25:44,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 18:25:44,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 18:25:46,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:25:46,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:25:46,871 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.64 vs. limit=22.5 2023-10-02 18:25:48,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=975846.6666666666, ans=0.125 2023-10-02 18:25:49,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:25:52,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:25:54,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:25:54,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:25:57,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:25:57,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:25:59,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:25:59,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:26:00,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:26:02,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 18:26:05,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=975913.3333333334, ans=0.125 2023-10-02 18:26:06,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 18:26:06,453 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 18:26:07,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:26:09,110 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 18:26:09,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 18:26:10,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:26:10,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:26:10,591 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 18:26:10,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:26:13,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 18:26:13,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:26:15,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:26:16,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:26:17,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:26:17,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:26:19,162 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 18:26:19,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:26:19,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 18:26:24,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:26:25,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:26:27,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 18:26:27,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:26:27,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=976046.6666666666, ans=0.125 2023-10-02 18:26:29,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 18:26:33,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:26:33,985 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.75 vs. limit=6.0 2023-10-02 18:26:34,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:26:34,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:26:37,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:26:37,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 18:26:38,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:26:40,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:26:40,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:26:40,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:26:41,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:26:41,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:26:43,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:26:44,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 18:26:44,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:26:47,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:26:47,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:26:51,594 INFO [train.py:1046] (3/4) Epoch 28, batch 3000, loss[loss=0.1485, simple_loss=0.2237, pruned_loss=0.03665, over 23299.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2446, pruned_loss=0.04421, over 4700346.71 frames. ], batch size: 51, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:26:51,594 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 18:27:03,154 INFO [train.py:1078] (3/4) Epoch 28, validation: loss=0.3199, simple_loss=0.2738, pruned_loss=0.183, over 1125622.00 frames. 2023-10-02 18:27:03,155 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 18:27:03,274 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 18:27:04,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 18:27:06,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:27:06,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:27:06,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=976180.0, ans=0.0 2023-10-02 18:27:07,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 18:27:07,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:27:10,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=976180.0, ans=0.125 2023-10-02 18:27:13,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:27:23,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:27:28,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 18:27:30,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:27:34,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:27:34,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:27:34,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:27:37,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:27:37,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 18:27:40,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 18:27:41,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:27:43,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:27:45,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:27:45,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:27:45,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:27:45,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:27:48,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:27:49,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:27:49,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:27:51,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:27:54,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 18:27:54,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:27:55,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:27:56,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:27:58,157 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.19 vs. limit=12.0 2023-10-02 18:28:00,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:28:00,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:28:01,804 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.829e+02 2.050e+02 2.331e+02 3.345e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-02 18:28:01,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 18:28:03,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 18:28:03,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:28:03,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 18:28:03,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:28:05,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=976446.6666666666, ans=0.1 2023-10-02 18:28:05,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=976446.6666666666, ans=0.0 2023-10-02 18:28:06,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 18:28:07,159 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.57 vs. limit=15.0 2023-10-02 18:28:08,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:28:09,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:28:09,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 18:28:10,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 18:28:10,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:28:12,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:28:13,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:28:13,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:28:13,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:14,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:28:16,137 INFO [train.py:1046] (3/4) Epoch 28, batch 3050, loss[loss=0.1551, simple_loss=0.2319, pruned_loss=0.03917, over 24422.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2446, pruned_loss=0.04421, over 4718079.15 frames. ], batch size: 58, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:28:16,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 18:28:19,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:28:20,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:28:20,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:28:24,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:27,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 18:28:27,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=976513.3333333334, ans=0.04949747468305833 2023-10-02 18:28:34,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 18:28:34,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 18:28:34,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:28:37,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:28:40,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:40,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:28:40,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:28:42,923 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.13 vs. limit=15.0 2023-10-02 18:28:43,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:28:43,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:28:44,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:28:44,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:28:44,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:28:46,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:47,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:28:48,341 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.00 vs. limit=15.0 2023-10-02 18:28:49,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=976646.6666666666, ans=0.0 2023-10-02 18:28:52,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:28:52,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 18:28:53,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:28:53,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:28:56,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:28:58,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:28:58,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:28:59,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:04,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:29:04,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:11,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:12,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:29:12,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:29:14,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:29:14,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:29:14,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:29:17,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 18:29:18,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:29:20,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:20,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 18:29:21,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:27,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:29:27,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:29:29,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:29:30,970 INFO [train.py:1046] (3/4) Epoch 28, batch 3100, loss[loss=0.1834, simple_loss=0.2601, pruned_loss=0.05336, over 23353.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2441, pruned_loss=0.04411, over 4723272.11 frames. ], batch size: 93, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:29:31,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 18:29:34,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 18:29:34,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 18:29:35,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:29:39,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=976846.6666666666, ans=0.1 2023-10-02 18:29:41,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:29:41,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:44,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 18:29:45,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=976913.3333333334, ans=0.2 2023-10-02 18:29:47,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:29:51,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 18:29:56,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:29:57,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:29:57,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:29:57,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:29:58,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 18:29:59,721 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.49 vs. limit=15.0 2023-10-02 18:30:00,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:30:00,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 18:30:00,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:30:02,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:30:04,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 18:30:05,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:30:08,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:30:10,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 18:30:10,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 18:30:10,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:11,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:30:14,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:30:15,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:15,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:30:18,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:30:18,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:30:19,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:30:19,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:30:19,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:19,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 18:30:25,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:30:25,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 18:30:27,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:30:28,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 18:30:28,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:30:29,752 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.812e+02 1.984e+02 2.223e+02 4.054e+02, threshold=3.967e+02, percent-clipped=0.0 2023-10-02 18:30:29,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:29,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 18:30:34,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=977113.3333333334, ans=0.0 2023-10-02 18:30:37,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=977113.3333333334, ans=0.0 2023-10-02 18:30:37,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=977113.3333333334, ans=0.0 2023-10-02 18:30:41,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 18:30:44,702 INFO [train.py:1046] (3/4) Epoch 28, batch 3150, loss[loss=0.1829, simple_loss=0.2625, pruned_loss=0.05163, over 23447.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2436, pruned_loss=0.04373, over 4722299.51 frames. ], batch size: 93, lr: 3.65e-03, grad_scale: 8.0 2023-10-02 18:30:44,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:30:44,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:30:47,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:30:47,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:30:48,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 18:30:50,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:30:50,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 18:30:51,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 18:30:54,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:30:57,334 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 18:30:58,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 18:31:00,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:31:00,931 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 18:31:00,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 18:31:03,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 18:31:03,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 18:31:03,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 18:31:03,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:31:03,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:31:06,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:31:08,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 18:31:09,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:31:10,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:31:11,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:31:12,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:31:15,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=977313.3333333334, ans=0.125 2023-10-02 18:31:17,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 18:31:17,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:31:20,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:31:20,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:31:21,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 18:31:24,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 18:31:26,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:31:26,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 18:31:26,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:31:27,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:31:27,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:31:28,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:31:28,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:31:28,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 18:31:28,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:31:28,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:32,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:31:32,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:31:33,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 18:31:33,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:31:35,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 18:31:35,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:36,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 18:31:37,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 18:31:38,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=977380.0, ans=0.1 2023-10-02 18:31:39,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:31:39,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:31:41,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 18:31:43,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 18:31:43,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:31:46,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:31:47,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:47,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:31:51,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:31:53,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:31:54,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 18:31:58,497 INFO [train.py:1046] (3/4) Epoch 28, batch 3200, loss[loss=0.1755, simple_loss=0.2652, pruned_loss=0.04292, over 24703.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2425, pruned_loss=0.04395, over 4710069.95 frames. ], batch size: 73, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:31:58,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:31:58,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 18:32:04,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:32:05,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:32:05,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 18:32:07,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:32:12,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:32:14,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=977580.0, ans=0.125 2023-10-02 18:32:18,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:32:23,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=977580.0, ans=0.0 2023-10-02 18:32:24,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=977580.0, ans=0.125 2023-10-02 18:32:25,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:32:27,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=977646.6666666666, ans=0.125 2023-10-02 18:32:32,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=977646.6666666666, ans=0.1 2023-10-02 18:32:34,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 18:32:35,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:32:38,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 18:32:39,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:32:41,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:32:41,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:32:43,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:32:46,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 18:32:48,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 18:32:49,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 18:32:52,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 18:32:52,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=977713.3333333334, ans=0.2 2023-10-02 18:32:54,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:32:58,088 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.399e+02 1.829e+02 1.986e+02 2.158e+02 2.804e+02, threshold=3.972e+02, percent-clipped=0.0 2023-10-02 18:33:00,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:33:00,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:33:00,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:33:01,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=977780.0, ans=0.125 2023-10-02 18:33:02,298 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 18:33:02,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:33:05,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:33:05,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 18:33:07,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 18:33:07,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 18:33:08,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 18:33:12,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:33:13,365 INFO [train.py:1046] (3/4) Epoch 28, batch 3250, loss[loss=0.1767, simple_loss=0.2398, pruned_loss=0.05675, over 19110.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2425, pruned_loss=0.04368, over 4713455.01 frames. ], batch size: 388, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:33:15,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:33:15,382 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 18:33:15,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:33:15,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:16,817 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 18:33:20,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:33:23,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:33:30,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:33:30,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 18:33:30,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:33:32,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:33:32,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:33:33,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:33:33,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:33:36,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:36,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:33:36,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:33:36,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:36,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:37,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:33:41,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:33:42,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=977980.0, ans=0.125 2023-10-02 18:33:44,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:33:45,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:33:45,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:33:47,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:33:49,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:33:49,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:33:50,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=977980.0, ans=0.0 2023-10-02 18:33:53,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 18:33:54,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:33:54,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:33:54,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:33:56,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:34:01,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:34:03,744 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.00 vs. limit=22.5 2023-10-02 18:34:09,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:34:09,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:09,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 18:34:09,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:34:09,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 18:34:09,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:11,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=978113.3333333334, ans=10.0 2023-10-02 18:34:12,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 18:34:12,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 18:34:14,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:34:15,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:34:15,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:34:15,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 18:34:16,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=978113.3333333334, ans=0.125 2023-10-02 18:34:16,589 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.94 vs. limit=15.0 2023-10-02 18:34:17,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:34:21,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:34:21,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:34:24,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 18:34:24,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:34:27,062 INFO [train.py:1046] (3/4) Epoch 28, batch 3300, loss[loss=0.1796, simple_loss=0.2498, pruned_loss=0.05472, over 23556.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2435, pruned_loss=0.04402, over 4715609.29 frames. ], batch size: 256, lr: 3.65e-03, grad_scale: 16.0 2023-10-02 18:34:27,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:34:27,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 18:34:29,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:34:29,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 18:34:31,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 18:34:32,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 18:34:32,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:34:35,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:34:38,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:34:38,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:41,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 18:34:41,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:34:43,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=978246.6666666666, ans=0.125 2023-10-02 18:34:44,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:34:45,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:34:49,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 18:34:49,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:34:49,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:34:52,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:34:52,390 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 18:34:53,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:34:55,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:34:55,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:34:56,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:34:57,015 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 18:34:59,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:34:59,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:35:02,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:02,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 18:35:04,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 18:35:04,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:04,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:35:05,536 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 18:35:08,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 18:35:09,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:35:11,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 18:35:14,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:35:16,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:35:16,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:35:19,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:35:20,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:35:20,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:35:20,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:35:23,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:35:23,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:23,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:35:24,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=978380.0, ans=0.1 2023-10-02 18:35:25,239 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 18:35:26,522 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.840e+02 2.125e+02 2.554e+02 4.181e+02, threshold=4.250e+02, percent-clipped=1.0 2023-10-02 18:35:26,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 18:35:28,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:35:28,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:35:28,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:35:31,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:35:31,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:35:34,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:35:35,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:35:35,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:35:35,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:35:36,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:35:37,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=978446.6666666666, ans=0.125 2023-10-02 18:35:38,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 18:35:38,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:35:40,907 INFO [train.py:1046] (3/4) Epoch 28, batch 3350, loss[loss=0.1621, simple_loss=0.2319, pruned_loss=0.04614, over 23761.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.244, pruned_loss=0.04431, over 4713501.51 frames. ], batch size: 164, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:35:40,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:35:42,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:35:42,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:35:42,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:35:42,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=978513.3333333334, ans=0.125 2023-10-02 18:35:45,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:35:45,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:35:48,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:35:50,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:35:51,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:35:54,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:35:55,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:35:55,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:35:57,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:35:59,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 18:36:00,559 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 18:36:00,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:36:04,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 18:36:04,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 18:36:04,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:36:05,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:36:07,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:08,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 18:36:08,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:08,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:36:10,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:11,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:13,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:13,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:36:15,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:18,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:18,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:22,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:36:24,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:36:25,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:25,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:26,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:28,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 18:36:28,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:36:28,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 18:36:28,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=978713.3333333334, ans=0.125 2023-10-02 18:36:29,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:36:31,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 18:36:33,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:34,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:36:43,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:43,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 18:36:43,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:36:44,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:36:46,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:36:51,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:36:54,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 18:36:54,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:36:54,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:36:55,462 INFO [train.py:1046] (3/4) Epoch 28, batch 3400, loss[loss=0.146, simple_loss=0.227, pruned_loss=0.03248, over 24293.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2448, pruned_loss=0.04455, over 4718012.43 frames. ], batch size: 61, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:36:55,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:36:55,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 18:36:56,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:36:56,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 18:36:57,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:36:57,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:36:58,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:36:58,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:36:58,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 18:37:01,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 18:37:01,390 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 18:37:03,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:37:07,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:37:07,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:37:07,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:37:08,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:37:12,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:37:13,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 18:37:18,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:37:19,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:37:19,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:37:20,063 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.67 vs. limit=6.0 2023-10-02 18:37:20,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 18:37:25,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:37:29,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 18:37:35,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:37:36,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:37:37,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 18:37:38,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:37:39,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:37:39,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:37:39,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:37:43,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:37:49,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:37:49,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:37:53,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:37:54,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 18:37:55,881 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.378e+02 1.833e+02 2.007e+02 2.238e+02 3.330e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-02 18:38:00,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:38:04,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 18:38:07,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 18:38:07,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:38:09,001 INFO [train.py:1046] (3/4) Epoch 28, batch 3450, loss[loss=0.16, simple_loss=0.2205, pruned_loss=0.04978, over 23775.00 frames. ], tot_loss[loss=0.1674, simple_loss=0.2448, pruned_loss=0.04504, over 4705612.16 frames. ], batch size: 232, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:38:10,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:38:10,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 18:38:11,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:38:16,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=979180.0, ans=0.0 2023-10-02 18:38:17,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:38:20,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:38:21,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:38:21,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:38:21,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:38:22,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:38:25,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=979246.6666666666, ans=0.1 2023-10-02 18:38:27,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=979246.6666666666, ans=0.125 2023-10-02 18:38:28,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 18:38:29,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=979246.6666666666, ans=0.0 2023-10-02 18:38:35,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 18:38:35,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 18:38:37,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:38:38,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:38:42,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=979313.3333333334, ans=0.2 2023-10-02 18:38:44,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 18:38:44,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:38:45,573 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.77 vs. limit=22.5 2023-10-02 18:38:47,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=979313.3333333334, ans=0.1 2023-10-02 18:38:49,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:38:49,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:38:52,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:38:53,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:38:55,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 18:38:55,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:38:56,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:38:57,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:39:00,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 18:39:04,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:39:08,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:39:09,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:11,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=979446.6666666666, ans=0.125 2023-10-02 18:39:12,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:39:14,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=979446.6666666666, ans=0.0 2023-10-02 18:39:19,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:19,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:39:20,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:39:21,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:39:23,142 INFO [train.py:1046] (3/4) Epoch 28, batch 3500, loss[loss=0.1439, simple_loss=0.2224, pruned_loss=0.03267, over 24352.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2431, pruned_loss=0.04485, over 4690361.39 frames. ], batch size: 56, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:39:26,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:39:28,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:39:28,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 18:39:31,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 18:39:35,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 18:39:35,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=979580.0, ans=0.125 2023-10-02 18:39:36,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:39:36,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 18:39:41,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:39:42,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:39:44,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:39:44,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:39:44,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:39:46,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:47,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:39:47,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 18:39:49,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:49,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:39:51,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:39:53,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=979646.6666666666, ans=0.125 2023-10-02 18:39:55,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:39:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 18:39:55,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:39:58,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:39:58,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:39:59,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:01,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:40:01,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:40:01,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 18:40:02,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 18:40:04,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 18:40:04,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:40:06,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:08,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:40:08,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:40:08,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=979713.3333333334, ans=0.125 2023-10-02 18:40:11,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:40:12,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:40:18,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:40:19,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 18:40:19,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 18:40:19,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:40:23,422 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.910e+02 2.082e+02 2.470e+02 3.693e+02, threshold=4.164e+02, percent-clipped=0.0 2023-10-02 18:40:23,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:40:23,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:40:24,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:27,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 18:40:28,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:40:30,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:40:30,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 18:40:31,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 18:40:33,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:40:35,931 INFO [train.py:1046] (3/4) Epoch 28, batch 3550, loss[loss=0.1788, simple_loss=0.2442, pruned_loss=0.05671, over 23862.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.242, pruned_loss=0.04451, over 4690838.26 frames. ], batch size: 195, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:40:35,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:40:35,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:40:36,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:40:38,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:40:46,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:40:50,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 18:40:52,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:40:54,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:40:55,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:40:56,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:40:56,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:40:59,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:40:59,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=979913.3333333334, ans=0.0 2023-10-02 18:41:00,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:41:01,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:41:01,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:41:02,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:41:07,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:41:07,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:41:09,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:41:09,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:41:10,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:41:10,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 18:41:10,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:41:12,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:41:14,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 18:41:16,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=979980.0, ans=0.0 2023-10-02 18:41:20,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:41:20,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:41:20,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:41:24,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 18:41:25,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:41:26,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 18:41:26,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:41:29,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:41:29,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:41:30,191 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.13 vs. limit=15.0 2023-10-02 18:41:32,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 18:41:33,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:41:37,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:41:39,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 18:41:39,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:41:43,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:41:46,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 18:41:49,885 INFO [train.py:1046] (3/4) Epoch 28, batch 3600, loss[loss=0.1578, simple_loss=0.2328, pruned_loss=0.04139, over 23485.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2422, pruned_loss=0.04421, over 4707357.18 frames. ], batch size: 285, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:41:51,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=980180.0, ans=0.2 2023-10-02 18:41:54,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 18:41:54,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:41:55,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:41:55,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=980180.0, ans=0.1 2023-10-02 18:41:58,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:42:00,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:42:00,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:42:03,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:42:04,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:05,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:42:06,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:42:06,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:06,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 18:42:10,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:42:10,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:13,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:42:16,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:42:17,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:42:18,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:42:18,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 18:42:19,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:42:21,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:42:22,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:42:22,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:42:25,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:42:26,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:42:28,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 18:42:33,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:42:34,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:42:34,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 18:42:39,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.43 vs. limit=15.0 2023-10-02 18:42:39,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:42:45,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:42:48,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:42:50,300 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.786e+02 1.946e+02 2.303e+02 3.332e+02, threshold=3.893e+02, percent-clipped=0.0 2023-10-02 18:42:54,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:42:54,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:42:54,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 18:42:55,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 18:42:57,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 18:42:59,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:43:00,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:43:00,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 18:43:02,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:43:02,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:43:02,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:43:03,429 INFO [train.py:1046] (3/4) Epoch 28, batch 3650, loss[loss=0.1588, simple_loss=0.228, pruned_loss=0.04483, over 23658.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.242, pruned_loss=0.04382, over 4716768.10 frames. ], batch size: 256, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:43:03,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 18:43:04,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 18:43:09,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:43:09,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 18:43:11,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 18:43:13,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:43:17,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 18:43:19,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 18:43:22,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=980580.0, ans=0.125 2023-10-02 18:43:23,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:43:23,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:43:23,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:43:26,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 18:43:26,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:43:28,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 18:43:28,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:43:29,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:43:29,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 18:43:29,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:43:31,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:43:31,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:43:32,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:43:35,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 18:43:35,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=980646.6666666666, ans=0.125 2023-10-02 18:43:36,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 18:43:38,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:43:38,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 18:43:40,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:43:40,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:43:45,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:43:47,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:43:47,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:43:49,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:43:49,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:43:51,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:43:53,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:43:53,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:43:53,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=980713.3333333334, ans=0.125 2023-10-02 18:43:54,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:43:56,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:43:58,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:43:59,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:43:59,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=980713.3333333334, ans=0.125 2023-10-02 18:44:05,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=980780.0, ans=0.2 2023-10-02 18:44:06,511 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 18:44:07,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:44:07,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:09,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:44:10,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:44:10,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:44:12,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:44:13,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 18:44:13,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:44:16,240 INFO [train.py:1046] (3/4) Epoch 28, batch 3700, loss[loss=0.1477, simple_loss=0.2324, pruned_loss=0.03154, over 24664.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2425, pruned_loss=0.04388, over 4724803.00 frames. ], batch size: 65, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:44:18,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:44:20,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:44:21,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:44:24,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:44:24,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 18:44:24,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:44:24,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 18:44:25,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 18:44:26,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=980846.6666666666, ans=0.1 2023-10-02 18:44:29,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 18:44:31,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:44:33,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:44:33,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 18:44:34,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:44:34,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=980913.3333333334, ans=0.125 2023-10-02 18:44:35,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:44:38,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:44:40,154 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 18:44:45,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:44:45,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:44:45,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:44:46,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 18:44:46,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:44:50,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:52,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 18:44:54,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:44:55,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:44:58,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:45:00,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:45:00,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=981046.6666666666, ans=0.125 2023-10-02 18:45:01,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 18:45:02,290 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.29 vs. limit=15.0 2023-10-02 18:45:04,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:45:04,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 18:45:05,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:45:05,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 18:45:11,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:45:11,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:45:13,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:45:15,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 18:45:16,603 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.928e+02 2.164e+02 2.452e+02 3.426e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-02 18:45:16,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:45:16,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:45:16,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:45:16,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:45:20,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:45:20,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 18:45:21,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 18:45:21,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:45:23,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:45:25,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:45:26,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:45:29,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:45:31,125 INFO [train.py:1046] (3/4) Epoch 28, batch 3750, loss[loss=0.1599, simple_loss=0.2453, pruned_loss=0.03723, over 24347.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2441, pruned_loss=0.04411, over 4722372.59 frames. ], batch size: 61, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:45:31,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:45:32,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:45:35,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 18:45:35,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 18:45:35,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=981180.0, ans=0.125 2023-10-02 18:45:36,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=981180.0, ans=0.125 2023-10-02 18:45:38,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 18:45:38,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 18:45:39,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:45:40,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:45:40,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:45:42,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:45:46,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:45:49,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 18:45:51,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:45:52,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:45:54,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:45:56,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 18:45:57,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:45:58,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:45:59,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:46:02,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 18:46:05,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 18:46:07,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:46:07,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:46:07,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=981313.3333333334, ans=0.125 2023-10-02 18:46:09,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:46:15,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:46:17,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 18:46:20,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 18:46:24,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:46:26,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:46:28,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:46:30,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:46:34,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 18:46:35,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 18:46:37,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:46:40,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:46:42,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 18:46:43,504 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.84 vs. limit=15.0 2023-10-02 18:46:44,016 INFO [train.py:1046] (3/4) Epoch 28, batch 3800, loss[loss=0.1754, simple_loss=0.2648, pruned_loss=0.04298, over 24373.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2445, pruned_loss=0.04401, over 4727748.69 frames. ], batch size: 77, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:46:50,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:46:54,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:46:55,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 18:46:55,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=981513.3333333334, ans=0.125 2023-10-02 18:46:56,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 18:46:58,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:46:59,309 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:47:00,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:47:01,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:47:03,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 18:47:03,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:03,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:47:05,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:47:05,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:47:05,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:47:06,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 18:47:10,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 18:47:10,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:47:12,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:47:13,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=981646.6666666666, ans=0.125 2023-10-02 18:47:14,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:47:15,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:47:16,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:47:17,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:47:20,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:20,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:47:20,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=981646.6666666666, ans=0.5 2023-10-02 18:47:21,888 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:47:25,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:47:25,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 18:47:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:47:33,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=981713.3333333334, ans=0.125 2023-10-02 18:47:34,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:47:38,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:47:38,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=981713.3333333334, ans=0.125 2023-10-02 18:47:40,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 18:47:42,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 18:47:42,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:47:42,610 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:47:43,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:47:45,043 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.791e+02 1.970e+02 2.183e+02 2.893e+02, threshold=3.939e+02, percent-clipped=0.0 2023-10-02 18:47:45,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:46,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 18:47:47,389 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-10-02 18:47:51,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 18:47:51,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 18:47:52,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:47:52,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:47:57,282 INFO [train.py:1046] (3/4) Epoch 28, batch 3850, loss[loss=0.1792, simple_loss=0.2619, pruned_loss=0.04821, over 23514.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2429, pruned_loss=0.04393, over 4713387.11 frames. ], batch size: 93, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:47:57,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:47:58,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:48:03,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:48:03,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 18:48:04,679 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=9.00 vs. limit=12.0 2023-10-02 18:48:05,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:48:05,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:48:09,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 18:48:11,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=981913.3333333334, ans=0.1 2023-10-02 18:48:12,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:48:12,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 18:48:14,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 18:48:18,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=981913.3333333334, ans=0.2 2023-10-02 18:48:19,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:22,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:48:23,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:48:25,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:48:28,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:28,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:48:29,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:48:29,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:48:31,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:48:33,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:48:34,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:34,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:48:34,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 18:48:34,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 18:48:36,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:48:37,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:37,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=981980.0, ans=0.0 2023-10-02 18:48:40,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:40,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:40,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 18:48:41,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 18:48:41,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=982046.6666666666, ans=0.125 2023-10-02 18:48:44,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:46,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 18:48:47,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 18:48:52,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:52,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:48:52,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=982046.6666666666, ans=0.0 2023-10-02 18:48:56,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:48:58,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 18:48:58,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=982113.3333333334, ans=0.125 2023-10-02 18:49:00,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 18:49:02,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:02,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:05,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:49:05,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 18:49:06,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:06,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:06,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:49:06,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 18:49:08,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:49:09,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 18:49:09,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:09,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:10,990 INFO [train.py:1046] (3/4) Epoch 28, batch 3900, loss[loss=0.181, simple_loss=0.2641, pruned_loss=0.04899, over 23985.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2415, pruned_loss=0.04344, over 4700294.45 frames. ], batch size: 86, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:49:11,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:49:13,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:15,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:49:15,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:49:15,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:49:17,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:49:17,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 18:49:18,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:22,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:49:22,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:49:22,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:49:24,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:49:25,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:49:25,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:28,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:49:30,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 18:49:30,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:49:30,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 18:49:32,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:49:32,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 18:49:34,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 18:49:38,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:49:40,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:49:40,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:49:40,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:49:45,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:49:46,212 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.37 vs. limit=15.0 2023-10-02 18:49:47,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:49:49,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:49:49,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:49:49,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:49:52,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=982313.3333333334, ans=0.0 2023-10-02 18:49:54,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=982380.0, ans=0.2 2023-10-02 18:49:55,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:49:56,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:50:03,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 18:50:04,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:50:13,024 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.882e+02 2.101e+02 2.412e+02 3.470e+02, threshold=4.202e+02, percent-clipped=0.0 2023-10-02 18:50:13,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:50:16,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:50:16,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 18:50:16,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 18:50:16,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 18:50:17,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 18:50:19,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:50:19,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 18:50:24,520 INFO [train.py:1046] (3/4) Epoch 28, batch 3950, loss[loss=0.1759, simple_loss=0.2351, pruned_loss=0.05839, over 19658.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2418, pruned_loss=0.04332, over 4696252.94 frames. ], batch size: 388, lr: 3.64e-03, grad_scale: 8.0 2023-10-02 18:50:25,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:50:26,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 18:50:27,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:50:29,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:50:31,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:50:31,990 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.88 vs. limit=15.0 2023-10-02 18:50:34,418 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 18:50:35,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:50:35,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 18:50:37,080 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 18:50:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:50:38,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:50:39,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:50:39,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:50:41,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 18:50:43,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:50:44,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 18:50:44,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:50:44,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:50:45,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 18:50:56,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:50:56,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:51:02,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 18:51:08,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 18:51:08,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 18:51:09,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:51:12,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:51:18,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:51:18,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 18:51:18,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:51:19,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:51:19,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 18:51:22,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:51:24,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:51:24,811 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.10 vs. limit=15.0 2023-10-02 18:51:26,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=982780.0, ans=0.2 2023-10-02 18:51:29,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 18:51:31,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=982780.0, ans=0.5 2023-10-02 18:51:36,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=982780.0, ans=0.0 2023-10-02 18:51:38,606 INFO [train.py:1046] (3/4) Epoch 28, batch 4000, loss[loss=0.14, simple_loss=0.2172, pruned_loss=0.0314, over 22001.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2425, pruned_loss=0.04353, over 4705809.50 frames. ], batch size: 48, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:51:38,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:51:44,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:51:50,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:51:50,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:51:50,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.00 vs. limit=15.0 2023-10-02 18:51:51,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:51:51,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 18:51:51,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=982913.3333333334, ans=0.125 2023-10-02 18:51:52,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:51:54,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 18:51:54,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:51:54,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 18:51:55,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:51:59,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:51:59,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:51:59,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:51:59,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:51:59,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 18:52:02,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:52:02,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=982913.3333333334, ans=0.125 2023-10-02 18:52:03,812 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 18:52:05,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:52:05,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:52:05,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=982913.3333333334, ans=0.0 2023-10-02 18:52:08,425 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 18:52:09,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:52:09,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:52:17,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 18:52:18,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:52:18,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:52:19,421 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.75 vs. limit=15.0 2023-10-02 18:52:20,162 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 18:52:21,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:52:22,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 18:52:22,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:52:24,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:52:24,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 18:52:25,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:52:26,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:52:26,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:52:28,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 18:52:28,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:52:30,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.00 vs. limit=15.0 2023-10-02 18:52:31,526 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 18:52:36,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 18:52:37,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 18:52:40,741 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.514e+02 1.835e+02 2.053e+02 2.244e+02 3.735e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-02 18:52:40,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:52:40,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:52:42,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:52:44,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:52:49,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:52:52,211 INFO [train.py:1046] (3/4) Epoch 28, batch 4050, loss[loss=0.1734, simple_loss=0.2531, pruned_loss=0.04687, over 23331.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.244, pruned_loss=0.04386, over 4706547.38 frames. ], batch size: 93, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:52:52,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 18:52:53,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 18:52:55,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:52:55,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:52:56,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 18:52:57,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:52:57,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:53:02,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:53:05,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:53:06,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 18:53:08,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:53:08,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:53:13,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:53:15,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:53:18,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=983246.6666666666, ans=0.1 2023-10-02 18:53:19,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 18:53:20,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 18:53:20,610 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 18:53:22,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:53:30,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 18:53:30,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:53:33,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=983313.3333333334, ans=0.07 2023-10-02 18:53:33,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=983313.3333333334, ans=0.125 2023-10-02 18:53:34,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:53:36,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:53:37,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:53:37,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:53:40,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:53:45,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 18:53:45,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:53:45,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=983380.0, ans=0.125 2023-10-02 18:53:47,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:53:48,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 18:53:51,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:53:58,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 18:53:58,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:53:58,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:54:01,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 18:54:01,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 18:54:01,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:54:02,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:54:02,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:02,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:54:05,492 INFO [train.py:1046] (3/4) Epoch 28, batch 4100, loss[loss=0.176, simple_loss=0.2631, pruned_loss=0.04444, over 24537.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2449, pruned_loss=0.0442, over 4717018.52 frames. ], batch size: 71, lr: 3.64e-03, grad_scale: 16.0 2023-10-02 18:54:07,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=983513.3333333334, ans=0.125 2023-10-02 18:54:08,236 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=15.08 vs. limit=15.0 2023-10-02 18:54:10,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 18:54:12,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 18:54:13,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 18:54:13,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 18:54:13,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:54:15,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:15,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:15,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:54:17,186 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 18:54:19,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:54:20,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:54:20,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:54:21,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:54:23,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=983580.0, ans=0.5 2023-10-02 18:54:25,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 18:54:25,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:54:26,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:54:26,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 18:54:28,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:28,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:54:28,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:54:28,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:54:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 18:54:31,908 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.25 vs. limit=22.5 2023-10-02 18:54:32,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:54:35,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 18:54:37,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:54:38,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:54:38,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 18:54:40,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:54:41,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 18:54:41,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 18:54:44,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 18:54:44,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=983646.6666666666, ans=0.0 2023-10-02 18:54:47,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 18:54:47,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 18:54:50,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 18:54:52,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:54:53,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:54:56,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:55:00,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=983713.3333333334, ans=0.0 2023-10-02 18:55:01,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:55:03,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=983780.0, ans=0.1 2023-10-02 18:55:04,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:55:05,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:55:07,211 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.958e+02 2.238e+02 2.586e+02 4.135e+02, threshold=4.476e+02, percent-clipped=1.0 2023-10-02 18:55:11,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:55:11,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:55:15,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 18:55:17,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 18:55:19,516 INFO [train.py:1046] (3/4) Epoch 28, batch 4150, loss[loss=0.1609, simple_loss=0.2576, pruned_loss=0.03208, over 24352.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2447, pruned_loss=0.04455, over 4720911.20 frames. ], batch size: 74, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:55:22,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 18:55:23,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:55:23,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=983846.6666666666, ans=0.2 2023-10-02 18:55:24,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:55:24,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:55:27,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 18:55:27,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:55:27,870 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.62 vs. limit=15.0 2023-10-02 18:55:28,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 18:55:28,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 18:55:28,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 18:55:29,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=983846.6666666666, ans=0.0 2023-10-02 18:55:30,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.68 vs. limit=6.0 2023-10-02 18:55:31,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:55:34,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:55:34,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:55:34,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=983913.3333333334, ans=0.0 2023-10-02 18:55:39,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:55:40,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=983913.3333333334, ans=0.0 2023-10-02 18:55:40,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=983913.3333333334, ans=0.125 2023-10-02 18:55:41,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:55:41,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 18:55:42,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 18:55:42,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 18:55:44,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 18:55:47,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:55:50,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:55:53,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 18:55:55,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 18:55:55,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 18:55:57,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 18:55:57,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:55:57,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:56:00,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:02,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:56:03,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 18:56:07,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:56:09,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:56:09,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 18:56:10,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:56:12,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 18:56:15,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:56:15,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:56:16,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:18,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 18:56:18,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:56:18,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 18:56:19,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 18:56:21,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 18:56:21,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:21,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 18:56:21,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 18:56:23,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 18:56:23,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:56:23,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 18:56:25,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:56:26,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:56:26,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 18:56:27,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 18:56:33,343 INFO [train.py:1046] (3/4) Epoch 28, batch 4200, loss[loss=0.169, simple_loss=0.2561, pruned_loss=0.04096, over 24643.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2434, pruned_loss=0.04431, over 4709210.66 frames. ], batch size: 68, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:56:33,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:56:35,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 18:56:36,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:56:37,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.52 vs. limit=15.0 2023-10-02 18:56:37,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:56:39,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 18:56:39,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:56:40,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:56:43,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 18:56:46,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 18:56:46,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:56:48,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:56:49,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:56:54,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 18:56:56,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:56:57,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:56:57,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 18:56:58,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 18:57:00,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:57:01,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:57:01,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 18:57:02,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 18:57:04,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 18:57:04,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:57:04,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=984313.3333333334, ans=0.5 2023-10-02 18:57:07,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 18:57:09,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 18:57:11,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:57:13,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:57:15,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 18:57:15,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 18:57:15,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:57:16,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:57:22,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 18:57:22,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:57:26,453 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.22 vs. limit=15.0 2023-10-02 18:57:28,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=984380.0, ans=0.2 2023-10-02 18:57:28,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=984380.0, ans=0.0 2023-10-02 18:57:29,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:57:32,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 18:57:35,472 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.827e+02 2.035e+02 2.482e+02 4.070e+02, threshold=4.070e+02, percent-clipped=0.0 2023-10-02 18:57:35,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:57:38,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 18:57:40,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:57:41,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 18:57:47,013 INFO [train.py:1046] (3/4) Epoch 28, batch 4250, loss[loss=0.1626, simple_loss=0.2258, pruned_loss=0.04967, over 22732.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2429, pruned_loss=0.04426, over 4725114.22 frames. ], batch size: 322, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:57:47,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 18:57:50,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=984513.3333333334, ans=0.015 2023-10-02 18:57:51,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 18:57:51,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 18:57:55,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:57:59,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 18:57:59,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 18:58:00,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:58:02,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:06,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:58:09,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:09,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:12,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 18:58:12,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:58:12,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:14,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:15,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:15,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=984646.6666666666, ans=0.1 2023-10-02 18:58:16,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 18:58:16,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:58:20,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 18:58:22,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=984646.6666666666, ans=0.95 2023-10-02 18:58:23,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 18:58:23,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:25,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:58:25,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 18:58:26,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 18:58:26,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:27,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:58:31,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 18:58:33,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 18:58:34,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=984713.3333333334, ans=0.0 2023-10-02 18:58:37,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:58:40,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:58:40,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 18:58:40,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 18:58:41,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 18:58:43,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 18:58:44,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 18:58:46,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:46,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:58:48,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 18:58:50,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 18:58:50,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 18:58:55,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 18:58:55,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=984780.0, ans=0.1 2023-10-02 18:58:55,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=984780.0, ans=0.0 2023-10-02 18:58:56,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:58:56,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 18:58:58,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:59:00,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:59:01,438 INFO [train.py:1046] (3/4) Epoch 28, batch 4300, loss[loss=0.1605, simple_loss=0.234, pruned_loss=0.04352, over 22784.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2423, pruned_loss=0.0439, over 4726835.61 frames. ], batch size: 322, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 18:59:01,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 18:59:02,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:59:02,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 18:59:04,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:59:08,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 18:59:10,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:59:13,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 18:59:20,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 18:59:20,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 18:59:22,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 18:59:23,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 18:59:23,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 18:59:23,914 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 18:59:24,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=984913.3333333334, ans=0.125 2023-10-02 18:59:24,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=984913.3333333334, ans=0.125 2023-10-02 18:59:26,903 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 18:59:28,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 18:59:29,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 18:59:32,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 18:59:33,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 18:59:33,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 18:59:35,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 18:59:36,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 18:59:36,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=984980.0, ans=0.125 2023-10-02 18:59:40,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 18:59:40,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 18:59:41,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 18:59:44,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:59:45,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 18:59:45,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 18:59:46,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 18:59:48,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 18:59:51,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:59:51,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 18:59:51,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 18:59:51,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 18:59:51,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 18:59:51,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 18:59:52,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 18:59:52,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 18:59:52,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 18:59:54,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 18:59:56,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=985046.6666666666, ans=0.125 2023-10-02 18:59:58,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:00:00,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 19:00:00,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:00:01,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=985113.3333333334, ans=0.2 2023-10-02 19:00:02,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:02,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:00:03,577 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.971e+02 2.225e+02 2.582e+02 4.307e+02, threshold=4.451e+02, percent-clipped=1.0 2023-10-02 19:00:05,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 19:00:05,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:00:05,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:00:06,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:00:06,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:00:07,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:00:09,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:00:11,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:13,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:00:13,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:00:15,290 INFO [train.py:1046] (3/4) Epoch 28, batch 4350, loss[loss=0.1641, simple_loss=0.2404, pruned_loss=0.04389, over 23497.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2428, pruned_loss=0.04381, over 4736974.39 frames. ], batch size: 106, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:00:18,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 19:00:18,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:00:24,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:00:26,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:28,482 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.65 vs. limit=15.0 2023-10-02 19:00:29,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:00:29,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:00:32,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:00:36,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:00:39,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:00:39,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:00:43,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:00:45,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:00:45,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.42 vs. limit=12.0 2023-10-02 19:00:46,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:00:52,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 19:00:53,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:00:53,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:00:58,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:02,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 19:01:05,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:01:06,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:01:08,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=985380.0, ans=0.125 2023-10-02 19:01:10,770 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 19:01:12,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=985380.0, ans=0.125 2023-10-02 19:01:13,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:13,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:01:15,325 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 19:01:15,391 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 19:01:15,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:01:15,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:01:16,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:01:16,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:18,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:01:18,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:01:20,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 19:01:20,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:20,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:01:20,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:21,666 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.43 vs. limit=15.0 2023-10-02 19:01:22,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 19:01:24,824 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 19:01:24,829 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 19:01:24,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 19:01:28,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:01:28,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:01:28,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:01:29,575 INFO [train.py:1046] (3/4) Epoch 28, batch 4400, loss[loss=0.1377, simple_loss=0.2173, pruned_loss=0.02909, over 24389.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2443, pruned_loss=0.04442, over 4736661.37 frames. ], batch size: 58, lr: 3.63e-03, grad_scale: 32.0 2023-10-02 19:01:29,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:01:31,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 19:01:33,085 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 19:01:33,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:37,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:01:37,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:37,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=985513.3333333334, ans=0.125 2023-10-02 19:01:38,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:01:40,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 19:01:41,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 19:01:41,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 19:01:41,900 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 19:01:42,482 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.43 vs. limit=15.0 2023-10-02 19:01:44,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:01:44,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:01:46,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 19:01:46,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=985580.0, ans=0.0 2023-10-02 19:01:48,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:01:49,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:01:49,396 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 19:01:52,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:01:52,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 19:01:52,304 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 19:01:53,081 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.12 vs. limit=22.5 2023-10-02 19:01:56,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 19:01:56,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 19:01:56,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 19:01:56,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:01:58,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:58,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:01:59,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:02:01,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 19:02:02,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 19:02:02,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:02:04,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:02:04,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:02:06,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:02:06,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:02:06,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 19:02:06,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=985646.6666666666, ans=0.0 2023-10-02 19:02:07,634 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 19:02:11,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:02:16,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:02:18,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=985713.3333333334, ans=0.1 2023-10-02 19:02:19,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 19:02:22,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:02:24,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:02:28,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:02:29,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 19:02:29,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:02:29,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:02:29,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:02:29,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:02:32,186 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.814e+02 2.040e+02 2.293e+02 3.611e+02, threshold=4.080e+02, percent-clipped=0.0 2023-10-02 19:02:32,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 19:02:35,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 19:02:37,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 19:02:37,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:02:37,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 19:02:38,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:02:41,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:02:42,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 19:02:43,972 INFO [train.py:1046] (3/4) Epoch 28, batch 4450, loss[loss=0.1811, simple_loss=0.2671, pruned_loss=0.04754, over 24368.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2457, pruned_loss=0.04464, over 4746281.95 frames. ], batch size: 77, lr: 3.63e-03, grad_scale: 32.0 2023-10-02 19:02:46,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:02:48,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:02:48,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:02:54,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:02:55,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:03:00,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:01,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:03:04,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:03:04,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:03:06,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 19:03:06,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:03:06,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:06,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:03:06,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:03:09,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:03:14,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=985980.0, ans=0.1 2023-10-02 19:03:15,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:15,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:17,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:03:17,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:03:18,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:03:24,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 19:03:24,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 19:03:25,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 19:03:25,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:03:26,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:03:28,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 19:03:31,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:03:34,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:36,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 19:03:36,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:36,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:03:36,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:03:36,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:03:38,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:03:42,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:03:43,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 19:03:45,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:03:46,449 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.62 vs. limit=6.0 2023-10-02 19:03:48,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:03:49,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:03:51,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:03:51,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 19:03:53,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:03:56,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 19:03:57,703 INFO [train.py:1046] (3/4) Epoch 28, batch 4500, loss[loss=0.159, simple_loss=0.226, pruned_loss=0.046, over 23722.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2463, pruned_loss=0.04471, over 4740359.68 frames. ], batch size: 232, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:03:57,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:03:58,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=986180.0, ans=0.125 2023-10-02 19:04:00,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:04:01,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 19:04:01,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 19:04:04,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:04:09,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:04:09,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:04:09,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:04:11,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:04:11,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:04:13,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:04:13,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=986246.6666666666, ans=0.0 2023-10-02 19:04:22,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=986246.6666666666, ans=0.0 2023-10-02 19:04:23,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:04:24,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:04:27,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:04:27,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:04:28,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:04:33,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:04:38,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:04:43,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:04:46,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:04:46,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 19:04:47,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:04:49,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:04:50,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:04:50,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:04:52,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:04:52,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 19:04:52,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:04:52,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:04:55,536 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.27 vs. limit=15.0 2023-10-02 19:04:56,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:04:56,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:04:57,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:00,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:05:00,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:05:02,253 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.810e+02 2.088e+02 2.457e+02 3.731e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-02 19:05:02,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 19:05:05,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 19:05:05,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 19:05:06,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 19:05:07,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=986446.6666666666, ans=0.2 2023-10-02 19:05:10,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 19:05:10,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:05:13,565 INFO [train.py:1046] (3/4) Epoch 28, batch 4550, loss[loss=0.1616, simple_loss=0.2135, pruned_loss=0.05482, over 19078.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2445, pruned_loss=0.04427, over 4718863.77 frames. ], batch size: 388, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:05:15,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:05:15,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:05:19,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:05:22,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:05:24,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:05:25,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:05:25,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:05:25,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:29,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:05:29,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:05:32,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:05:35,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 19:05:37,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 19:05:38,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:05:38,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 19:05:44,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 19:05:45,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:05:50,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 19:05:50,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=986646.6666666666, ans=0.1 2023-10-02 19:05:51,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:05:54,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:54,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:05:56,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:05:57,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 19:06:00,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:06:03,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:06:03,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:06:04,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:06:06,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 19:06:07,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 19:06:07,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:06:09,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 19:06:11,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 19:06:11,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:06:12,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:12,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:06:13,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:06:13,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:06:17,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:06:18,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 19:06:18,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:06:18,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 19:06:20,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 19:06:20,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:06:20,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 19:06:23,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:06:23,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:06:25,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:06:26,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:06:26,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:06:28,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:06:30,047 INFO [train.py:1046] (3/4) Epoch 28, batch 4600, loss[loss=0.144, simple_loss=0.225, pruned_loss=0.03152, over 24589.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2429, pruned_loss=0.04369, over 4719566.63 frames. ], batch size: 60, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:06:30,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:06:30,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=986846.6666666666, ans=0.2 2023-10-02 19:06:31,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:31,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:06:35,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:06:35,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:06:35,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=986846.6666666666, ans=0.125 2023-10-02 19:06:37,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:06:39,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 19:06:40,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:06:43,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:06:44,184 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.11 vs. limit=10.0 2023-10-02 19:06:45,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:06:48,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:54,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 19:06:55,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:06:57,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:00,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:07:00,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:07:05,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 19:07:05,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:07:07,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:07:07,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=986980.0, ans=0.05 2023-10-02 19:07:10,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:12,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:07:12,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:07:14,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 19:07:16,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:07:21,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:22,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:07:25,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:25,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 19:07:25,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:27,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 19:07:27,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:27,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:07:29,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:07:29,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:07:31,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:07:32,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 19:07:32,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 19:07:33,820 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.870e+02 2.077e+02 2.555e+02 3.884e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-02 19:07:33,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 19:07:33,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:07:33,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:07:35,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:07:35,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:07:35,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=987113.3333333334, ans=0.0 2023-10-02 19:07:40,785 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.27 vs. limit=6.0 2023-10-02 19:07:44,200 INFO [train.py:1046] (3/4) Epoch 28, batch 4650, loss[loss=0.165, simple_loss=0.2529, pruned_loss=0.03852, over 24666.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2416, pruned_loss=0.04329, over 4708833.55 frames. ], batch size: 68, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:07:46,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:07:48,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:07:49,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:49,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:07:49,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:07:49,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:07:53,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:07:56,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 19:07:59,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:07:59,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 19:07:59,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:08:00,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 19:08:00,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:08:02,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 19:08:02,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 19:08:02,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:03,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:08:06,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:08:08,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:08,149 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 19:08:10,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:10,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 19:08:13,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:13,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:08:16,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 19:08:16,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:08:19,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:08:24,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:08:24,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=987313.3333333334, ans=0.125 2023-10-02 19:08:29,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:31,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:31,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:08:31,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:08:35,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 19:08:35,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 19:08:35,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 19:08:35,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 19:08:38,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:08:39,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=987380.0, ans=22.5 2023-10-02 19:08:44,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:08:44,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:08:44,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 19:08:45,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:08:47,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:08:47,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:08:47,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:08:48,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:08:48,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:08:50,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:08:53,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:08:54,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:08:54,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:08:54,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 19:08:57,390 INFO [train.py:1046] (3/4) Epoch 28, batch 4700, loss[loss=0.1621, simple_loss=0.2539, pruned_loss=0.03518, over 24643.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2427, pruned_loss=0.04355, over 4711340.31 frames. ], batch size: 73, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:08:57,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:08:58,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 19:09:00,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=987513.3333333334, ans=0.1 2023-10-02 19:09:06,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:08,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:09:08,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:09:08,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:09:10,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:09:14,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 19:09:14,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 19:09:17,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:19,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:09:20,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:09:23,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:30,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:09:32,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 19:09:33,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:09:38,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 19:09:40,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:09:42,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:09:47,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 19:09:47,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:09:53,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:09:53,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 19:09:55,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:09:55,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:09:57,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:09:57,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:09:57,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 19:09:59,313 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 19:09:59,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:10:00,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:10:00,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:10:02,129 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.850e+02 1.974e+02 2.169e+02 3.460e+02, threshold=3.948e+02, percent-clipped=0.0 2023-10-02 19:10:02,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 19:10:02,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:10:04,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 19:10:05,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=987780.0, ans=0.125 2023-10-02 19:10:09,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:10:09,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:10:10,752 INFO [train.py:1046] (3/4) Epoch 28, batch 4750, loss[loss=0.1745, simple_loss=0.2421, pruned_loss=0.05347, over 23498.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2437, pruned_loss=0.04399, over 4716152.27 frames. ], batch size: 134, lr: 3.63e-03, grad_scale: 8.0 2023-10-02 19:10:13,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:10:13,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:10:15,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 19:10:15,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:10:17,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 19:10:21,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:10:21,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:10:21,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:10:25,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=987913.3333333334, ans=0.0 2023-10-02 19:10:26,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 19:10:28,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=987913.3333333334, ans=0.0 2023-10-02 19:10:32,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:10:34,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 19:10:34,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:10:38,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:10:38,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:10:38,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:10:38,726 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 19:10:40,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 19:10:46,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 19:10:49,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:10:50,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:10:51,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=987980.0, ans=0.07 2023-10-02 19:10:53,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:10:53,686 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 19:10:53,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:10:56,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:10:59,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:11:00,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 19:11:01,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 19:11:01,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:11:01,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:11:01,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:11:03,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 19:11:03,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 19:11:04,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 19:11:09,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:11:10,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=988113.3333333334, ans=0.0 2023-10-02 19:11:14,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:11:14,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 19:11:14,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:11:15,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:17,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:11:18,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:11:18,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:11:21,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:11:22,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 19:11:23,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 19:11:24,794 INFO [train.py:1046] (3/4) Epoch 28, batch 4800, loss[loss=0.1785, simple_loss=0.2517, pruned_loss=0.05262, over 23592.00 frames. ], tot_loss[loss=0.167, simple_loss=0.2447, pruned_loss=0.04469, over 4724924.20 frames. ], batch size: 256, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:11:24,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 19:11:26,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:11:27,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:11:27,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 19:11:30,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:11:32,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:11:39,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:11:40,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:11:41,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:11:41,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 19:11:42,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:11:42,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:11:42,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:11:47,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:11:48,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:49,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:11:51,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:51,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 19:11:51,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:11:51,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:11:56,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:11:56,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=988313.3333333334, ans=0.125 2023-10-02 19:11:58,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:12:01,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:12:01,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:12:02,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 19:12:02,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:04,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 19:12:04,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 19:12:04,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:05,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:12:05,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:12:05,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:12:05,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:12:08,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:12:08,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:12:13,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:12:13,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:14,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=988380.0, ans=0.125 2023-10-02 19:12:15,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:12:19,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 19:12:21,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:12:21,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:21,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:12:22,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:27,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:12:27,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:12:27,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:28,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:12:28,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:12:30,077 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.923e+02 2.078e+02 2.346e+02 3.782e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-02 19:12:30,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:12:34,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:12:34,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:34,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:12:35,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 19:12:38,350 INFO [train.py:1046] (3/4) Epoch 28, batch 4850, loss[loss=0.1678, simple_loss=0.2578, pruned_loss=0.03887, over 24672.00 frames. ], tot_loss[loss=0.1675, simple_loss=0.2453, pruned_loss=0.04489, over 4706842.35 frames. ], batch size: 68, lr: 3.63e-03, grad_scale: 16.0 2023-10-02 19:12:38,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 19:12:38,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:12:38,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:12:40,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:12:40,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:12:43,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:12:49,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 19:12:51,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=988513.3333333334, ans=0.05 2023-10-02 19:12:52,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:12:56,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:12:57,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:12:57,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:13:00,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:13:01,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:13:01,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=988580.0, ans=0.0 2023-10-02 19:13:02,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:13:04,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 19:13:06,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:13:09,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:13:09,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:13:11,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:13:11,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 19:13:13,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=988646.6666666666, ans=0.0 2023-10-02 19:13:14,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:13:15,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:19,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:19,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 19:13:19,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 19:13:21,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:13:30,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:13:30,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 19:13:31,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:13:31,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:13:32,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:13:34,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 19:13:34,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:35,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=988713.3333333334, ans=0.0 2023-10-02 19:13:37,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 19:13:37,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:13:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:13:38,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 19:13:41,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=988780.0, ans=0.2 2023-10-02 19:13:47,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:13:52,255 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.83 vs. limit=6.0 2023-10-02 19:13:52,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:13:52,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:13:54,090 INFO [train.py:1046] (3/4) Epoch 28, batch 4900, loss[loss=0.1769, simple_loss=0.2624, pruned_loss=0.04564, over 24323.00 frames. ], tot_loss[loss=0.1671, simple_loss=0.2448, pruned_loss=0.04471, over 4703632.05 frames. ], batch size: 77, lr: 3.63e-03, grad_scale: 8.0 2023-10-02 19:13:55,018 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.77 vs. limit=15.0 2023-10-02 19:13:56,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 19:13:56,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:14:03,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:14:03,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:14:03,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=988846.6666666666, ans=0.0 2023-10-02 19:14:03,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.82 vs. limit=10.0 2023-10-02 19:14:04,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:14:04,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=988846.6666666666, ans=0.125 2023-10-02 19:14:07,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 19:14:12,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 19:14:15,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 19:14:17,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 19:14:17,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:14:17,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:14:17,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:14:17,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:14:19,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:14:19,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 19:14:23,667 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.04 vs. limit=15.0 2023-10-02 19:14:23,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 19:14:24,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=988980.0, ans=0.0 2023-10-02 19:14:25,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:14:25,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:14:26,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:14:26,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:14:28,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:14:28,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:14:28,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 19:14:30,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:14:30,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=988980.0, ans=0.07 2023-10-02 19:14:31,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:14:31,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 19:14:31,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 19:14:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 19:14:37,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:14:38,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:14:38,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:14:39,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:14:39,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 19:14:39,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:14:39,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 19:14:40,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.60 vs. limit=22.5 2023-10-02 19:14:44,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:14:45,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:14:47,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:14:51,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 19:14:51,359 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:14:52,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:14:52,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 19:14:53,035 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.52 vs. limit=15.0 2023-10-02 19:14:53,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 19:14:55,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=989113.3333333334, ans=0.125 2023-10-02 19:14:59,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:15:00,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=989113.3333333334, ans=0.125 2023-10-02 19:15:01,254 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.868e+02 2.016e+02 2.240e+02 3.516e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 19:15:01,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:15:04,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 19:15:04,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:15:04,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:15:05,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:15:08,285 INFO [train.py:1046] (3/4) Epoch 28, batch 4950, loss[loss=0.1648, simple_loss=0.2611, pruned_loss=0.03424, over 24383.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2439, pruned_loss=0.04424, over 4708917.37 frames. ], batch size: 74, lr: 3.63e-03, grad_scale: 8.0 2023-10-02 19:15:09,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:15:09,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:15:09,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:15:09,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 19:15:12,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:15:15,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:15:15,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:15:17,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 19:15:18,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 19:15:19,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:15:20,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 19:15:20,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:20,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:15:20,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=989180.0, ans=0.2 2023-10-02 19:15:22,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:15:22,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:23,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:15:25,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:15:25,639 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.08 vs. limit=15.0 2023-10-02 19:15:26,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:15:26,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:15:29,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:29,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:15:31,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:15:32,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=989246.6666666666, ans=0.125 2023-10-02 19:15:35,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:36,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:15:38,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:38,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:39,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:15:42,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 19:15:42,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 19:15:42,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=989313.3333333334, ans=0.0 2023-10-02 19:15:45,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:45,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=989313.3333333334, ans=0.125 2023-10-02 19:15:47,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:15:47,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:15:48,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:15:50,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:15:50,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:15:53,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:15:55,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:15:57,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:15:58,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:15:58,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:15:58,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=989380.0, ans=0.0 2023-10-02 19:15:59,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 19:15:59,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:16:01,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:16:07,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:16:08,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:16:08,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:16:08,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:16:08,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:16:09,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:16:11,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:16:12,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:16:12,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:16:12,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 19:16:17,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:16:21,701 INFO [train.py:1046] (3/4) Epoch 28, batch 5000, loss[loss=0.178, simple_loss=0.2624, pruned_loss=0.04673, over 24060.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2438, pruned_loss=0.0442, over 4717348.77 frames. ], batch size: 86, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:16:21,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 19:16:21,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 19:16:27,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:16:28,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:16:30,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 19:16:30,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 19:16:33,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:16:34,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 19:16:34,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:16:34,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:16:36,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 19:16:36,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:16:37,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:16:38,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 19:16:38,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:16:38,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:16:41,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 19:16:41,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 19:16:41,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:16:42,116 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.16 vs. limit=12.0 2023-10-02 19:16:43,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 19:16:43,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:16:43,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:16:43,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:16:43,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 19:16:44,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 19:16:45,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 19:16:45,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:16:47,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:16:47,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 19:16:48,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=989580.0, ans=0.1 2023-10-02 19:16:49,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:16:51,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:16:51,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:16:53,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 19:16:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 19:16:55,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:16:57,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:17:00,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=989646.6666666666, ans=0.0 2023-10-02 19:17:01,768 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 19:17:04,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=989713.3333333334, ans=0.2 2023-10-02 19:17:05,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:17:07,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:17:07,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:08,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=989713.3333333334, ans=0.0 2023-10-02 19:17:10,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 19:17:10,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:17:10,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:17:11,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:17:12,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 19:17:12,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:17:15,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:17:16,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:17:22,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 19:17:28,005 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.761e+02 1.932e+02 2.126e+02 3.192e+02, threshold=3.865e+02, percent-clipped=0.0 2023-10-02 19:17:28,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:35,324 INFO [train.py:1046] (3/4) Epoch 28, batch 5050, loss[loss=0.1784, simple_loss=0.2687, pruned_loss=0.04399, over 24361.00 frames. ], tot_loss[loss=0.1665, simple_loss=0.2444, pruned_loss=0.04431, over 4719985.50 frames. ], batch size: 77, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:17:36,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:17:36,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:36,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:17:36,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:17:38,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:17:38,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:17:38,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:43,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:17:43,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 19:17:43,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:17:46,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:17:47,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:17:47,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 19:17:49,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:17:49,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:17:52,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:17:52,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=989913.3333333334, ans=0.0 2023-10-02 19:17:54,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:17:54,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:18:04,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 19:18:04,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:18:05,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:18:05,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 19:18:05,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:18:06,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:18:08,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:18:08,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:18:08,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 19:18:08,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=989980.0, ans=0.125 2023-10-02 19:18:09,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 19:18:10,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:18:12,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:18:15,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:18:16,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 19:18:17,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:18:18,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=990046.6666666666, ans=0.125 2023-10-02 19:18:21,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 19:18:22,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:18:22,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:18:22,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:18:24,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:18:26,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:18:28,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:18:30,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:30,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:18:30,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:18:30,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 19:18:32,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:18:33,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:18:38,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:18:38,408 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 19:18:38,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:18:39,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:18:39,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:41,034 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 19:18:43,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:18:43,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 19:18:43,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:46,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:18:46,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:18:47,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 19:18:49,391 INFO [train.py:1046] (3/4) Epoch 28, batch 5100, loss[loss=0.1508, simple_loss=0.2258, pruned_loss=0.03792, over 23686.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2442, pruned_loss=0.04432, over 4720600.66 frames. ], batch size: 149, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:18:49,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 19:18:50,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:18:52,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:18:52,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:18:54,416 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 19:18:55,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:18:57,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 19:18:58,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 19:19:00,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:19:01,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:19:03,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:19:05,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 19:19:05,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 19:19:09,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:19:09,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:19:13,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:19:15,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 19:19:15,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:19:17,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:19:17,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 19:19:21,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:22,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:22,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 19:19:24,356 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 19:19:25,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:25,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 19:19:26,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 19:19:29,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:19:32,017 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.92 vs. limit=6.0 2023-10-02 19:19:35,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:19:38,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 19:19:38,699 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 19:19:38,707 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 19:19:40,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 19:19:40,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:19:41,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=990380.0, ans=0.09899494936611666 2023-10-02 19:19:44,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 19:19:45,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.99 vs. limit=22.5 2023-10-02 19:19:47,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 19:19:50,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 19:19:52,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:19:54,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 19:19:54,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=990446.6666666666, ans=0.0 2023-10-02 19:19:55,502 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.787e+02 2.059e+02 2.354e+02 3.734e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 19:19:57,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:19:58,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 19:20:00,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=990446.6666666666, ans=0.0 2023-10-02 19:20:02,931 INFO [train.py:1046] (3/4) Epoch 28, batch 5150, loss[loss=0.1831, simple_loss=0.2586, pruned_loss=0.05385, over 23900.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2455, pruned_loss=0.04492, over 4714754.41 frames. ], batch size: 86, lr: 3.62e-03, grad_scale: 8.0 2023-10-02 19:20:04,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:20:04,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:20:04,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:20:05,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:20:06,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:20:06,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:20:06,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 19:20:06,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 19:20:07,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 19:20:07,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:20:07,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 19:20:09,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:20:09,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 19:20:10,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:20:11,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:20:12,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=990513.3333333334, ans=0.125 2023-10-02 19:20:16,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:20:16,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 19:20:18,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:20:18,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:20:18,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:20:18,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:20:18,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:20:20,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:20:20,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:20:20,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 19:20:22,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:20:23,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:20:26,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:20:27,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=990580.0, ans=0.125 2023-10-02 19:20:28,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 19:20:29,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:20:34,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:20:37,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 19:20:41,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:20:45,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:20:46,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:20:49,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:20:50,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:20:52,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 19:20:54,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=990713.3333333334, ans=0.125 2023-10-02 19:20:57,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:20:59,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:20:59,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:21:03,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:21:04,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:21:04,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 19:21:09,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:21:11,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:21:14,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:21:14,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:21:15,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:21:15,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:21:15,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:21:15,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:21:16,765 INFO [train.py:1046] (3/4) Epoch 28, batch 5200, loss[loss=0.1689, simple_loss=0.2377, pruned_loss=0.05004, over 23791.00 frames. ], tot_loss[loss=0.1687, simple_loss=0.2464, pruned_loss=0.0455, over 4702442.20 frames. ], batch size: 164, lr: 3.62e-03, grad_scale: 16.0 2023-10-02 19:21:19,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:21:19,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:21:22,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:21:28,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 19:21:30,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:21:31,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:21:31,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=990913.3333333334, ans=0.1 2023-10-02 19:21:34,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:21:34,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:21:34,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:21:37,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 19:21:40,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:21:40,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:21:42,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 19:21:43,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:21:45,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:21:46,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 19:21:46,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 19:21:48,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=990980.0, ans=0.04949747468305833 2023-10-02 19:21:49,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 19:21:50,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:21:50,678 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 19:21:50,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:21:52,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:21:52,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:21:53,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 19:21:53,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:21:55,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:21:57,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 19:21:59,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 19:21:59,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 19:22:01,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=991046.6666666666, ans=0.2 2023-10-02 19:22:04,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 19:22:04,970 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.93 vs. limit=15.0 2023-10-02 19:22:05,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:22:10,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:22:10,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:22:12,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 19:22:12,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:22:12,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:22:12,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:13,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:22:16,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:22:17,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:22:20,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:22:22,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:22:22,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:23,452 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.958e+02 2.159e+02 2.397e+02 4.088e+02, threshold=4.317e+02, percent-clipped=0.0 2023-10-02 19:22:25,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=991113.3333333334, ans=0.125 2023-10-02 19:22:26,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:22:27,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 19:22:27,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:22:27,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:22:29,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:22:30,792 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.12 vs. limit=8.0 2023-10-02 19:22:30,962 INFO [train.py:1046] (3/4) Epoch 28, batch 5250, loss[loss=0.1578, simple_loss=0.241, pruned_loss=0.03736, over 24314.00 frames. ], tot_loss[loss=0.1676, simple_loss=0.2448, pruned_loss=0.04522, over 4693681.42 frames. ], batch size: 61, lr: 3.62e-03, grad_scale: 16.0 2023-10-02 19:22:31,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:22:31,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:22:33,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:22:35,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=991180.0, ans=0.1 2023-10-02 19:22:37,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:22:37,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:22:38,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:22:45,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:22:47,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:22:50,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:22:51,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:22:54,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 19:22:54,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:22:54,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:23:08,749 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.45 vs. limit=6.0 2023-10-02 19:23:25,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=991446.6666666666, ans=0.0 2023-10-02 19:23:25,599 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.30 vs. limit=15.0 2023-10-02 19:23:30,903 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.37 vs. limit=15.0 2023-10-02 19:23:38,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=991513.3333333334, ans=0.1 2023-10-02 19:23:39,955 INFO [train.py:1046] (3/4) Epoch 28, batch 5300, loss[loss=0.1415, simple_loss=0.1938, pruned_loss=0.04456, over 19369.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2438, pruned_loss=0.04488, over 4693897.19 frames. ], batch size: 388, lr: 3.62e-03, grad_scale: 16.0 2023-10-02 19:23:55,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:23:55,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 19:23:55,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 19:23:55,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:23:55,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:55,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:55,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:23:55,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:23:55,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:23:55,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:23:55,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:23:55,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:23:56,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 19:23:56,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 19:23:56,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 19:23:56,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:23:56,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 19:23:56,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 19:23:56,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:23:57,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:23:57,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:23:57,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:23:57,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:23:57,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:23:57,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:23:57,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:57,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:23:57,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:23:57,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:23:57,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:23:57,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:23:58,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 19:23:58,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:23:59,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:23:59,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 19:23:59,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 19:23:59,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:23:59,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:23:59,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 19:23:59,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 19:23:59,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:23:59,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:23:59,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:24:00,065 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 19:24:00,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 19:24:00,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:24:00,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:24:00,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 19:24:00,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 19:24:00,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 19:24:00,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:24:07,177 INFO [train.py:1046] (3/4) Epoch 29, batch 0, loss[loss=0.1699, simple_loss=0.2635, pruned_loss=0.0382, over 24333.00 frames. ], tot_loss[loss=0.1699, simple_loss=0.2635, pruned_loss=0.0382, over 24333.00 frames. ], batch size: 74, lr: 3.56e-03, grad_scale: 32.0 2023-10-02 19:24:07,177 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 19:24:17,543 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.1779, 2.2152, 3.1264, 2.1081], device='cuda:3') 2023-10-02 19:24:19,056 INFO [train.py:1078] (3/4) Epoch 29, validation: loss=0.3081, simple_loss=0.2785, pruned_loss=0.1688, over 1125622.00 frames. 2023-10-02 19:24:19,057 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 19:24:20,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 19:24:20,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:24:22,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:24:23,942 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.56 vs. limit=15.0 2023-10-02 19:24:27,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:24:27,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:24:27,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:29,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 19:24:30,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 19:24:33,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:33,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:38,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:24:38,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:24:39,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:24:39,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:24:40,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 19:24:43,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:24:50,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:24:50,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:24:54,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 19:24:57,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:24:57,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:24:59,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:25:03,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:25:05,701 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.855e+02 2.126e+02 2.436e+02 5.590e+02, threshold=4.252e+02, percent-clipped=2.0 2023-10-02 19:25:07,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:25:13,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 19:25:17,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 19:25:18,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:25:18,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:25:18,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:25:19,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:25:21,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 19:25:22,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:25:24,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:25:26,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:25:29,440 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 19:25:30,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:25:32,152 INFO [train.py:1046] (3/4) Epoch 29, batch 50, loss[loss=0.1524, simple_loss=0.2354, pruned_loss=0.03474, over 23713.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2443, pruned_loss=0.04188, over 1070820.55 frames. ], batch size: 149, lr: 3.56e-03, grad_scale: 32.0 2023-10-02 19:25:33,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:25:35,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:25:35,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 19:25:36,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:25:36,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:25:37,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:25:38,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=991933.3333333334, ans=0.95 2023-10-02 19:25:41,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:25:43,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:25:44,561 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.23 vs. limit=15.0 2023-10-02 19:25:46,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 19:25:46,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:25:53,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:25:54,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 19:25:56,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 19:25:57,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=992000.0, ans=0.125 2023-10-02 19:25:58,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:25:59,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=992000.0, ans=0.025 2023-10-02 19:26:00,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:26:00,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:26:01,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:26:03,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:26:03,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:26:03,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:26:07,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=992066.6666666666, ans=0.125 2023-10-02 19:26:10,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:26:13,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:26:13,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:26:14,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 19:26:15,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:26:17,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:26:17,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 19:26:17,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:26:18,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 19:26:20,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.89 vs. limit=15.0 2023-10-02 19:26:26,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:26:26,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:26:28,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:26:28,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:26:28,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:26:32,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 19:26:32,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 19:26:33,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:26:33,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:26:35,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:26:35,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:26:36,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 19:26:37,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 19:26:38,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 19:26:38,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:26:38,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:26:39,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 19:26:39,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 19:26:41,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:26:42,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:26:44,764 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.97 vs. limit=15.0 2023-10-02 19:26:45,399 INFO [train.py:1046] (3/4) Epoch 29, batch 100, loss[loss=0.1738, simple_loss=0.245, pruned_loss=0.05131, over 22783.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2441, pruned_loss=0.04357, over 1878534.35 frames. ], batch size: 322, lr: 3.56e-03, grad_scale: 16.0 2023-10-02 19:26:45,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:26:45,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:26:46,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:26:49,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:26:51,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:26:52,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 19:26:52,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:26:56,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:26:56,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:26:58,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:26:58,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:26:58,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:27:00,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 19:27:02,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:27:02,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:27:02,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:02,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:27:02,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.09 vs. limit=15.0 2023-10-02 19:27:04,050 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.86 vs. limit=10.0 2023-10-02 19:27:06,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 19:27:06,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:27:09,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:11,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:27:13,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:27:15,843 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 19:27:15,861 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 19:27:17,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:17,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:27:21,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:27:24,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:27:25,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:29,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=992466.6666666666, ans=0.0 2023-10-02 19:27:30,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:30,508 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 19:27:31,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 19:27:34,993 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.846e+02 1.979e+02 2.261e+02 3.658e+02, threshold=3.958e+02, percent-clipped=0.0 2023-10-02 19:27:35,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:27:36,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:27:38,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:40,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:27:44,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:27:45,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:27:45,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=992533.3333333334, ans=0.125 2023-10-02 19:27:48,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:48,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:49,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:27:49,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:27:49,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:27:49,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 19:27:49,601 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 19:27:50,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:27:51,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:27:52,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:27:52,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:52,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 19:27:52,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:27:53,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:27:53,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:27:53,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:27:53,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:55,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:27:55,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=992533.3333333334, ans=0.125 2023-10-02 19:27:56,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:27:57,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:27:59,696 INFO [train.py:1046] (3/4) Epoch 29, batch 150, loss[loss=0.2153, simple_loss=0.2816, pruned_loss=0.07447, over 19661.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.245, pruned_loss=0.04437, over 2518168.99 frames. ], batch size: 388, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:27:59,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:27:59,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:28:01,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:04,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:28:04,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:04,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=992600.0, ans=0.2 2023-10-02 19:28:07,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:28:08,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:11,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 19:28:11,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 19:28:11,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 19:28:14,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:28:14,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:28:14,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:28:17,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:28:17,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:28:17,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:18,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:28:20,051 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 19:28:21,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:28:26,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:28:30,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:28:30,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 19:28:33,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:28:33,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:28:33,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:28:36,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:28:37,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:28:40,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:28:41,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:28:41,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 19:28:48,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:28:50,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:28:50,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:28:50,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:28:52,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:28:55,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 19:28:58,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:28:59,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:29:01,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:29:03,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:29:03,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 19:29:04,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:29:04,398 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 19:29:07,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:29:11,748 INFO [train.py:1046] (3/4) Epoch 29, batch 200, loss[loss=0.2203, simple_loss=0.2893, pruned_loss=0.07563, over 19587.00 frames. ], tot_loss[loss=0.168, simple_loss=0.2461, pruned_loss=0.04497, over 3008498.15 frames. ], batch size: 389, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:29:11,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:29:11,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:29:16,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 19:29:17,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:29:17,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:21,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 19:29:22,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:29:22,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=992933.3333333334, ans=0.2 2023-10-02 19:29:24,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:24,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:29:29,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:29:29,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:29:29,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:45,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:29:45,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:29:46,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:29:48,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:29:48,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=993066.6666666666, ans=0.125 2023-10-02 19:29:49,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 19:29:49,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:29:52,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:29:52,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:29:53,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:29:53,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:29:55,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 19:29:55,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 19:29:55,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:29:59,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:30:00,532 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.804e+02 1.986e+02 2.199e+02 3.066e+02, threshold=3.973e+02, percent-clipped=0.0 2023-10-02 19:30:02,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=993133.3333333334, ans=0.0 2023-10-02 19:30:03,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:30:10,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:11,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:30:20,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:23,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 19:30:23,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:30:23,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:30:23,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:30:24,482 INFO [train.py:1046] (3/4) Epoch 29, batch 250, loss[loss=0.1609, simple_loss=0.2459, pruned_loss=0.03796, over 24613.00 frames. ], tot_loss[loss=0.1683, simple_loss=0.2465, pruned_loss=0.04501, over 3388254.85 frames. ], batch size: 68, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:30:24,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:30:24,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 19:30:26,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:30:26,035 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 19:30:26,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=993266.6666666666, ans=0.09899494936611666 2023-10-02 19:30:27,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:28,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:30:30,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:30,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:30:33,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:30:34,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:30:35,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:30:40,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:30:53,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:30:53,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=993400.0, ans=0.1 2023-10-02 19:30:54,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:30:55,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:30:58,190 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.03 vs. limit=15.0 2023-10-02 19:30:58,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=993400.0, ans=0.1 2023-10-02 19:31:00,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:31:01,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:31:01,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:31:01,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:31:02,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:31:04,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:31:04,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:31:05,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:31:08,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 19:31:08,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:31:11,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:31:11,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:31:11,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:31:14,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:31:14,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:31:14,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:31:16,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:31:18,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=993466.6666666666, ans=0.125 2023-10-02 19:31:19,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:31:19,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:31:24,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:31:26,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:31:29,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:31:32,007 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.38 vs. limit=15.0 2023-10-02 19:31:34,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:31:36,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:31:37,994 INFO [train.py:1046] (3/4) Epoch 29, batch 300, loss[loss=0.1682, simple_loss=0.2473, pruned_loss=0.04451, over 23267.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2433, pruned_loss=0.04397, over 3679375.78 frames. ], batch size: 105, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:31:38,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 19:31:39,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:31:39,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:31:40,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 19:31:40,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:31:42,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:31:44,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 19:31:48,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:31:48,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:31:51,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:31:53,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 19:31:53,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:31:56,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:31:56,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 19:31:56,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:31:59,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:32:02,238 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:32:03,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:32:04,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=993666.6666666666, ans=15.0 2023-10-02 19:32:04,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 19:32:07,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 19:32:07,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:09,477 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.34 vs. limit=12.0 2023-10-02 19:32:10,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:32:10,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:10,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 19:32:10,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:32:13,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:32:15,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:32:15,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:32:20,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 19:32:20,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 19:32:21,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:32:24,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:26,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 19:32:27,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:32:28,949 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.830e+02 2.036e+02 2.251e+02 3.092e+02, threshold=4.071e+02, percent-clipped=0.0 2023-10-02 19:32:30,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:32:33,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:32:33,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 19:32:37,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:37,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:32:41,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:43,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:32:43,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 19:32:45,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:32:45,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:32:45,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 19:32:46,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:32:46,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:32:48,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:32:49,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:32:49,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:32:52,473 INFO [train.py:1046] (3/4) Epoch 29, batch 350, loss[loss=0.1532, simple_loss=0.237, pruned_loss=0.03468, over 24500.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2422, pruned_loss=0.04322, over 3908452.93 frames. ], batch size: 66, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:32:54,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:32:54,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 19:32:57,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:02,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:33:05,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:06,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:06,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=994000.0, ans=0.1 2023-10-02 19:33:09,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 19:33:11,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:33:11,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 19:33:14,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:16,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 19:33:16,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:33:19,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 19:33:21,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:33:22,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:33:22,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:33:22,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=994066.6666666666, ans=0.05 2023-10-02 19:33:24,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:33:24,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:33:25,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:33:25,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:25,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:33:27,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:33:27,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:30,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=994066.6666666666, ans=0.125 2023-10-02 19:33:32,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:33:33,529 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.10 vs. limit=22.5 2023-10-02 19:33:34,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:33:35,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:33:35,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:36,604 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.66 vs. limit=22.5 2023-10-02 19:33:41,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 19:33:41,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:33:45,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:33:45,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:33:45,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:33:45,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=994133.3333333334, ans=0.125 2023-10-02 19:33:46,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 19:33:49,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:33:50,989 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 19:33:52,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 19:33:52,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:33:53,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=994200.0, ans=0.0 2023-10-02 19:33:54,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:33:54,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 19:33:55,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:33:58,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:33:58,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:34:00,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:34:00,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:34:02,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=994200.0, ans=0.0 2023-10-02 19:34:04,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:34:06,741 INFO [train.py:1046] (3/4) Epoch 29, batch 400, loss[loss=0.1494, simple_loss=0.2298, pruned_loss=0.03452, over 24415.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2415, pruned_loss=0.04289, over 4084260.34 frames. ], batch size: 58, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:34:08,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:34:09,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:34:09,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=994266.6666666666, ans=0.125 2023-10-02 19:34:10,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 19:34:10,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:34:12,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:34:13,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:34:15,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:18,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:34:20,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:21,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=994333.3333333334, ans=0.1 2023-10-02 19:34:22,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=994333.3333333334, ans=0.0 2023-10-02 19:34:22,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=994333.3333333334, ans=0.125 2023-10-02 19:34:23,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 19:34:24,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 19:34:24,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:34:26,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 19:34:27,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:32,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:34:32,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:34:32,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 19:34:32,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:34:32,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:34:32,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:34:32,967 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.15 vs. limit=22.5 2023-10-02 19:34:33,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:34:33,870 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 19:34:34,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=994333.3333333334, ans=0.125 2023-10-02 19:34:35,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 19:34:39,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:34:40,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:34:40,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 19:34:41,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 19:34:44,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:34:47,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:34:52,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 19:34:56,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:34:57,566 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.820e+02 1.968e+02 2.221e+02 3.877e+02, threshold=3.936e+02, percent-clipped=0.0 2023-10-02 19:34:57,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 19:34:59,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:34:59,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:34:59,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 19:35:01,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:35:03,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:35:04,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:35:07,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:35:07,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 19:35:10,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:35:11,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 19:35:14,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:35:14,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:35:17,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 19:35:19,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:35:20,554 INFO [train.py:1046] (3/4) Epoch 29, batch 450, loss[loss=0.1792, simple_loss=0.2666, pruned_loss=0.04585, over 24452.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2424, pruned_loss=0.04317, over 4222853.63 frames. ], batch size: 69, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:35:20,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:35:20,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:35:22,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 19:35:22,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:35:23,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:35:23,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:35:23,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 19:35:24,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:35:24,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:35:26,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:35:27,375 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.39 vs. limit=15.0 2023-10-02 19:35:33,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:35:33,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:35:35,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 19:35:35,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 19:35:39,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:35:42,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:35:43,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:35:44,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=994666.6666666666, ans=0.0 2023-10-02 19:35:48,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:35:48,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:35:51,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 19:35:51,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 19:35:54,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 19:35:55,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:35:57,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:35:57,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:35:58,882 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 19:35:58,890 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 19:36:00,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:36:01,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:36:02,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 19:36:05,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:36:07,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:36:07,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 19:36:07,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 19:36:09,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:36:11,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:36:11,963 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:36:13,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:36:13,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 19:36:16,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:36:17,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 19:36:19,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 19:36:19,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 19:36:25,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:36:28,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:36:29,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:36:29,640 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 19:36:32,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:36:33,960 INFO [train.py:1046] (3/4) Epoch 29, batch 500, loss[loss=0.1956, simple_loss=0.2578, pruned_loss=0.0667, over 22752.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2437, pruned_loss=0.04415, over 4321238.00 frames. ], batch size: 322, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:36:34,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:36:34,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:36:34,106 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 19:36:36,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 19:36:36,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:36:39,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:36:42,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 19:36:44,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:36:47,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:36:47,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:36:47,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:36:52,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=995000.0, ans=0.125 2023-10-02 19:36:57,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=995000.0, ans=0.125 2023-10-02 19:36:57,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.79 vs. limit=15.0 2023-10-02 19:36:59,176 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.31 vs. limit=22.5 2023-10-02 19:36:59,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:36:59,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:36:59,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:36:59,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:36:59,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 19:37:01,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 19:37:03,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:37:03,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:37:03,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:37:04,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:37:05,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 19:37:08,199 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 19:37:09,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=995066.6666666666, ans=0.125 2023-10-02 19:37:10,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:37:11,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=995066.6666666666, ans=0.0 2023-10-02 19:37:12,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:14,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:14,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:15,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:37:16,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 19:37:21,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:37:22,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:24,159 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.819e+02 1.997e+02 2.314e+02 3.134e+02, threshold=3.994e+02, percent-clipped=0.0 2023-10-02 19:37:25,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:37:28,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:37:33,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:37:37,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 19:37:37,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:37,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:37:39,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.45 vs. limit=15.0 2023-10-02 19:37:40,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 19:37:41,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:37:42,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:47,414 INFO [train.py:1046] (3/4) Epoch 29, batch 550, loss[loss=0.2, simple_loss=0.2649, pruned_loss=0.06758, over 22644.00 frames. ], tot_loss[loss=0.1682, simple_loss=0.2455, pruned_loss=0.04547, over 4400947.78 frames. ], batch size: 322, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:37:47,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 19:37:48,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 19:37:48,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:37:48,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 19:37:50,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:37:50,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:37:50,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:37:52,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:37:52,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:37:52,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:37:55,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:37:56,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 19:37:56,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:38:00,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=995266.6666666666, ans=0.1 2023-10-02 19:38:01,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:01,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:38:05,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:38:05,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:38:08,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=995333.3333333334, ans=0.0 2023-10-02 19:38:09,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 19:38:10,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 19:38:12,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:38:18,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:38:18,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:38:19,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:38:21,600 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:38:23,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:23,347 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 19:38:24,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:38:25,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 19:38:28,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:38:30,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 19:38:30,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:38:31,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:33,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 19:38:33,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 19:38:34,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:38:34,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:38:34,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:38:34,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:38:37,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:38:39,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:38:40,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:38:41,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:41,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 19:38:42,066 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:38:43,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:38:45,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:38:45,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=995533.3333333334, ans=0.125 2023-10-02 19:38:46,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:38:46,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:38:48,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:38:48,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 19:38:54,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 19:38:57,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 19:38:57,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:38:57,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:38:58,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:39:02,065 INFO [train.py:1046] (3/4) Epoch 29, batch 600, loss[loss=0.1519, simple_loss=0.23, pruned_loss=0.03686, over 24334.00 frames. ], tot_loss[loss=0.1678, simple_loss=0.2452, pruned_loss=0.04518, over 4474965.06 frames. ], batch size: 56, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:39:07,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:39:09,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:39:10,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 19:39:13,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:39:13,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:39:14,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:39:16,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 19:39:18,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:39:20,256 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.99 vs. limit=6.0 2023-10-02 19:39:24,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 19:39:27,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:39:27,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:39:27,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:39:36,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:39:36,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:39:36,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:39:39,513 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.04 vs. limit=15.0 2023-10-02 19:39:44,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:39:47,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:39:47,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:39:49,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:39:50,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=995800.0, ans=0.2 2023-10-02 19:39:53,224 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.812e+02 1.989e+02 2.203e+02 3.587e+02, threshold=3.977e+02, percent-clipped=0.0 2023-10-02 19:39:55,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 19:40:02,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:40:02,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:40:05,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 19:40:06,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:40:09,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 19:40:09,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:40:09,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:40:15,415 INFO [train.py:1046] (3/4) Epoch 29, batch 650, loss[loss=0.178, simple_loss=0.2633, pruned_loss=0.04636, over 24579.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2438, pruned_loss=0.04467, over 4514956.33 frames. ], batch size: 71, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:40:15,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 19:40:16,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 19:40:20,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:40:20,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:40:22,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:40:23,656 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.51 vs. limit=15.0 2023-10-02 19:40:26,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 19:40:26,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=995933.3333333334, ans=0.0 2023-10-02 19:40:26,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff3.min_abs, batch_count=995933.3333333334, ans=0.2 2023-10-02 19:40:27,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:40:33,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:40:33,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:40:36,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:40:39,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 19:40:42,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:40:44,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:40:45,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:40:45,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 19:40:48,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:40:48,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:40:50,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:40:51,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:40:51,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:40:53,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=996066.6666666666, ans=0.07 2023-10-02 19:40:54,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:40:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 19:40:54,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:40:55,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:40:58,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:00,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:41:00,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:41:00,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=996133.3333333334, ans=0.07 2023-10-02 19:41:01,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:41:01,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 19:41:02,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:41:02,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:41:04,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:41:04,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:41:05,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:41:07,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 19:41:07,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=996133.3333333334, ans=0.0 2023-10-02 19:41:08,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 19:41:09,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:09,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:41:09,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:41:09,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:41:13,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:41:18,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:18,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:41:20,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:41:23,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:41:23,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 19:41:23,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:41:31,063 INFO [train.py:1046] (3/4) Epoch 29, batch 700, loss[loss=0.1667, simple_loss=0.2399, pruned_loss=0.04678, over 23886.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2425, pruned_loss=0.04433, over 4551594.31 frames. ], batch size: 195, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:41:31,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:41:31,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:41:31,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:41:31,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:41:35,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 19:41:35,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=996266.6666666666, ans=0.125 2023-10-02 19:41:36,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 19:41:37,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=996266.6666666666, ans=0.0 2023-10-02 19:41:38,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 19:41:39,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:41,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:41:44,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 19:41:44,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=996333.3333333334, ans=0.0 2023-10-02 19:41:48,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:41:51,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:41:51,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:41:52,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:41:54,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:41:55,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:42:00,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 19:42:00,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:42:02,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 19:42:03,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 19:42:07,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:42:07,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:42:09,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=996400.0, ans=0.125 2023-10-02 19:42:10,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:42:15,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:42:15,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 19:42:19,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:42:19,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:42:21,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 19:42:22,597 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.803e+02 2.016e+02 2.224e+02 3.281e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-02 19:42:25,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:42:25,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:42:28,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:42:28,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=996533.3333333334, ans=0.125 2023-10-02 19:42:32,424 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=5.05 vs. limit=12.0 2023-10-02 19:42:33,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:42:34,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 19:42:37,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 19:42:39,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 19:42:40,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:42:43,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:42:43,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:42:45,104 INFO [train.py:1046] (3/4) Epoch 29, batch 750, loss[loss=0.1728, simple_loss=0.2501, pruned_loss=0.04774, over 23253.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2432, pruned_loss=0.04399, over 4593929.77 frames. ], batch size: 93, lr: 3.55e-03, grad_scale: 8.0 2023-10-02 19:42:46,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:42:46,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 19:42:50,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 19:42:50,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 19:42:50,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 19:42:51,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 19:42:51,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 19:42:51,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=996600.0, ans=0.0 2023-10-02 19:42:51,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=996600.0, ans=0.125 2023-10-02 19:42:52,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:42:52,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 19:42:54,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:42:55,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:42:55,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:42:57,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:42:57,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=996600.0, ans=0.125 2023-10-02 19:42:58,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:43:00,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:43:02,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:43:04,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:43:05,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:43:07,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:43:09,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:43:09,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 19:43:10,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:43:10,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:43:13,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:43:15,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:43:16,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 19:43:16,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:43:19,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 19:43:19,218 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 19:43:19,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 19:43:19,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:43:19,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 19:43:21,693 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.62 vs. limit=10.0 2023-10-02 19:43:22,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:43:28,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:43:28,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:43:28,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:43:31,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:43:32,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:43:32,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 19:43:32,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=996800.0, ans=0.125 2023-10-02 19:43:34,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:43:35,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 19:43:35,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:43:37,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=996800.0, ans=0.125 2023-10-02 19:43:38,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:43:38,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 19:43:40,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:43:44,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:43:45,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:43:45,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:43:47,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:43:47,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=996866.6666666666, ans=0.04949747468305833 2023-10-02 19:43:52,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 19:43:53,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:43:53,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:43:57,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:43:57,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:43:59,211 INFO [train.py:1046] (3/4) Epoch 29, batch 800, loss[loss=0.1553, simple_loss=0.2386, pruned_loss=0.03599, over 24671.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.243, pruned_loss=0.04381, over 4622143.38 frames. ], batch size: 65, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:44:01,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:44:01,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:44:03,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=996933.3333333334, ans=0.1 2023-10-02 19:44:08,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:44:08,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:11,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:44:11,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:44:11,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:12,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:13,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=997000.0, ans=0.125 2023-10-02 19:44:14,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:14,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=997000.0, ans=0.0 2023-10-02 19:44:16,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:44:18,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:44:21,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 19:44:22,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:24,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:44:24,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:44:26,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:44:26,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 19:44:26,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:44:27,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 19:44:31,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:33,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:44:34,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:44:34,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:44:37,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:37,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:44:42,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:44:42,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:44:42,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 19:44:43,885 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 19:44:43,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 19:44:45,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:44:45,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:44:46,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:44:46,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:44:46,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=997133.3333333334, ans=0.125 2023-10-02 19:44:49,707 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 19:44:51,488 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.909e+02 2.086e+02 2.400e+02 3.373e+02, threshold=4.173e+02, percent-clipped=0.0 2023-10-02 19:44:51,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 19:44:51,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:44:53,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 19:44:57,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:45:00,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:45:02,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 19:45:02,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:45:03,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=997200.0, ans=0.2 2023-10-02 19:45:06,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 19:45:13,271 INFO [train.py:1046] (3/4) Epoch 29, batch 850, loss[loss=0.1641, simple_loss=0.2384, pruned_loss=0.04487, over 23649.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2434, pruned_loss=0.04417, over 4637610.35 frames. ], batch size: 149, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:45:13,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:45:14,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:45:16,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 19:45:17,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:45:18,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:45:20,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 19:45:20,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:45:20,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:45:23,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:23,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:45:24,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:45:26,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 19:45:26,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 19:45:26,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 19:45:29,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:45:29,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:45:31,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:31,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:45:31,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:45:36,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:45:36,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:45:36,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 19:45:41,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 19:45:44,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:45:45,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 19:45:50,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 19:45:51,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 19:45:53,327 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 19:45:53,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:45:53,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:45:53,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 19:45:56,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:57,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:45:57,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 19:45:58,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:45:58,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:46:00,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:46:01,504 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.47 vs. limit=15.0 2023-10-02 19:46:02,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:46:02,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:46:03,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 19:46:03,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 19:46:08,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:46:08,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:46:08,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:46:10,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:46:10,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:46:11,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=997533.3333333334, ans=0.2 2023-10-02 19:46:14,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:46:16,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 19:46:17,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:46:18,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.30 vs. limit=10.0 2023-10-02 19:46:18,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:46:20,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:46:27,487 INFO [train.py:1046] (3/4) Epoch 29, batch 900, loss[loss=0.1909, simple_loss=0.2599, pruned_loss=0.06098, over 23532.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.245, pruned_loss=0.04428, over 4659119.93 frames. ], batch size: 256, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:46:27,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 19:46:28,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:46:29,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 19:46:29,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:46:30,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:46:31,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 19:46:37,369 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.26 vs. limit=15.0 2023-10-02 19:46:37,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:46:41,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:46:41,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 19:46:44,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:46:44,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 19:46:44,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 19:46:47,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:46:47,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:46:47,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 19:46:48,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:46:51,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=997666.6666666666, ans=0.125 2023-10-02 19:46:56,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:46:56,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:46:57,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:47:00,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:47:05,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 19:47:07,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:47:08,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=997733.3333333334, ans=0.125 2023-10-02 19:47:12,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:47:12,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 19:47:13,918 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 19:47:13,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 19:47:19,915 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.861e+02 2.059e+02 2.455e+02 3.512e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 19:47:20,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 19:47:20,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:47:20,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.33 vs. limit=15.0 2023-10-02 19:47:21,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:47:25,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.47 vs. limit=22.5 2023-10-02 19:47:27,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:47:27,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:47:30,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 19:47:30,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:47:32,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 19:47:33,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:47:33,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:47:35,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=997866.6666666666, ans=0.125 2023-10-02 19:47:36,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:47:36,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:47:40,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 19:47:40,516 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 19:47:40,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 19:47:42,440 INFO [train.py:1046] (3/4) Epoch 29, batch 950, loss[loss=0.1788, simple_loss=0.2477, pruned_loss=0.05491, over 23812.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2449, pruned_loss=0.04443, over 4680869.76 frames. ], batch size: 195, lr: 3.55e-03, grad_scale: 16.0 2023-10-02 19:47:42,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 19:47:43,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:47:48,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 19:47:52,490 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.27 vs. limit=8.0 2023-10-02 19:47:52,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:47:54,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:47:56,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:47:56,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:47:57,701 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 19:48:03,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:48:03,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:48:03,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=998000.0, ans=0.1 2023-10-02 19:48:04,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:48:04,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:48:06,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 19:48:06,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:48:07,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:09,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 19:48:09,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:48:13,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:13,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:48:13,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=998066.6666666666, ans=0.1 2023-10-02 19:48:14,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:48:16,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 19:48:17,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 19:48:19,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:48:21,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:48:22,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=998066.6666666666, ans=0.125 2023-10-02 19:48:25,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:48:25,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:48:28,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 19:48:31,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 19:48:31,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 19:48:31,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:48:31,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:31,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:48:34,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=998133.3333333334, ans=0.0 2023-10-02 19:48:35,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 19:48:36,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:48:38,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:48:40,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:40,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 19:48:40,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:48:40,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:48:40,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 19:48:44,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:48:47,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:48:50,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:48:52,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 19:48:52,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 19:48:55,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:48:56,954 INFO [train.py:1046] (3/4) Epoch 29, batch 1000, loss[loss=0.1479, simple_loss=0.2111, pruned_loss=0.04237, over 22748.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.2444, pruned_loss=0.04415, over 4699941.92 frames. ], batch size: 322, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:48:59,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.24 vs. limit=15.0 2023-10-02 19:48:59,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 19:48:59,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:02,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=998266.6666666666, ans=0.125 2023-10-02 19:49:05,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:49:07,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 19:49:07,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 19:49:08,271 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.98 vs. limit=22.5 2023-10-02 19:49:09,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.01 vs. limit=15.0 2023-10-02 19:49:11,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:49:11,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:49:13,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:49:16,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 19:49:20,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 19:49:22,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 19:49:22,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:49:23,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 19:49:25,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 19:49:25,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 19:49:28,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:49:28,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:35,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:49:37,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:49:37,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:38,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:49:38,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 19:49:38,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:49:40,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:49:40,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:49:41,902 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 19:49:42,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=998466.6666666666, ans=0.125 2023-10-02 19:49:44,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 19:49:46,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 19:49:47,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 19:49:48,771 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.876e+02 2.032e+02 2.220e+02 3.868e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-02 19:49:48,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:49:56,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:56,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:49:56,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:49:58,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:49:59,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 19:49:59,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:49:59,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 19:49:59,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 19:50:01,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:50:01,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:50:03,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:50:07,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:50:08,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:50:10,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=998600.0, ans=0.2 2023-10-02 19:50:11,776 INFO [train.py:1046] (3/4) Epoch 29, batch 1050, loss[loss=0.1731, simple_loss=0.2592, pruned_loss=0.04348, over 24665.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2426, pruned_loss=0.04386, over 4692253.52 frames. ], batch size: 73, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:50:11,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:50:11,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:50:13,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:50:13,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:50:16,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:50:17,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 19:50:21,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 19:50:22,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:50:23,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:50:23,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 19:50:25,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 19:50:26,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 19:50:26,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:50:28,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 19:50:28,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=998666.6666666666, ans=0.0 2023-10-02 19:50:32,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:50:32,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 19:50:32,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 19:50:36,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:50:38,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:50:38,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:50:40,870 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.49 vs. limit=15.0 2023-10-02 19:50:41,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 19:50:41,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 19:50:42,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:50:44,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 19:50:48,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 19:50:50,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:50:52,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 19:50:54,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 19:50:55,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:50:55,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:50:58,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:51:02,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 19:51:03,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 19:51:03,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 19:51:05,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:51:05,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:51:06,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 19:51:11,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:51:13,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 19:51:13,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:51:13,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:51:14,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:51:19,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:51:19,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 19:51:21,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 19:51:21,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 19:51:21,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 19:51:22,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:51:26,506 INFO [train.py:1046] (3/4) Epoch 29, batch 1100, loss[loss=0.1615, simple_loss=0.2506, pruned_loss=0.03619, over 24665.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2424, pruned_loss=0.04358, over 4698089.82 frames. ], batch size: 73, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:51:26,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:51:30,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:51:33,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=998933.3333333334, ans=0.2 2023-10-02 19:51:34,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 19:51:36,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 19:51:36,423 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:51:37,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:51:37,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 19:51:37,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:51:37,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=998933.3333333334, ans=0.95 2023-10-02 19:51:37,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=998933.3333333334, ans=0.0 2023-10-02 19:51:40,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 19:51:41,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:51:45,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 19:51:45,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 19:51:46,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 19:51:46,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:51:46,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:51:50,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:51:51,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 19:51:55,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:51:59,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 19:51:59,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=999066.6666666666, ans=0.0 2023-10-02 19:52:00,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=999066.6666666666, ans=0.125 2023-10-02 19:52:01,123 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 19:52:02,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:03,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:05,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:52:05,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:52:06,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 19:52:07,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:52:07,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 19:52:07,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:52:08,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:09,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 19:52:11,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=999133.3333333334, ans=0.125 2023-10-02 19:52:17,274 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.440e+02 1.856e+02 2.110e+02 2.423e+02 3.915e+02, threshold=4.220e+02, percent-clipped=0.0 2023-10-02 19:52:17,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 19:52:17,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 19:52:18,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:52:24,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 19:52:26,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 19:52:26,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 19:52:28,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:52:30,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:52:30,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:52:32,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 19:52:32,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:52:33,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:52:33,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 19:52:34,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:52:34,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 19:52:36,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:52:36,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:52:37,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:52:39,038 INFO [train.py:1046] (3/4) Epoch 29, batch 1150, loss[loss=0.1615, simple_loss=0.2331, pruned_loss=0.04495, over 23334.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.243, pruned_loss=0.04344, over 4709276.31 frames. ], batch size: 285, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:52:40,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:52:43,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:52:46,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:52:46,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:52:46,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 19:52:46,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:52:50,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 19:52:52,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:52:52,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:52:57,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 19:52:59,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:53:03,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:53:03,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:53:03,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 19:53:03,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 19:53:04,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:53:06,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=999333.3333333334, ans=0.0 2023-10-02 19:53:07,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 19:53:07,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:53:08,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:53:20,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:53:27,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:53:28,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 19:53:28,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:53:28,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:53:33,362 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.27 vs. limit=12.0 2023-10-02 19:53:33,991 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 19:53:36,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:53:42,354 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 19:53:42,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=999533.3333333334, ans=0.125 2023-10-02 19:53:45,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:53:47,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:53:47,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 19:53:47,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:53:50,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:53:53,721 INFO [train.py:1046] (3/4) Epoch 29, batch 1200, loss[loss=0.1625, simple_loss=0.2509, pruned_loss=0.03704, over 24461.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2437, pruned_loss=0.04363, over 4718949.35 frames. ], batch size: 69, lr: 3.54e-03, grad_scale: 32.0 2023-10-02 19:53:55,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 19:53:55,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 19:53:55,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:53:55,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:53:56,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 19:53:57,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:54:00,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 19:54:01,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:54:01,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:54:03,169 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 19:54:04,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 19:54:08,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:54:10,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:54:12,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:54:14,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:54:14,907 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 19:54:16,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:54:24,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=999733.3333333334, ans=0.2 2023-10-02 19:54:26,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 19:54:26,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:54:26,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 19:54:27,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:54:29,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=999733.3333333334, ans=0.125 2023-10-02 19:54:30,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 19:54:34,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 19:54:34,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:54:35,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:54:35,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=999800.0, ans=0.125 2023-10-02 19:54:37,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:54:38,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 19:54:38,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:54:38,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:54:39,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 19:54:41,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 19:54:41,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:54:41,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:54:41,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 19:54:42,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=999800.0, ans=0.2 2023-10-02 19:54:44,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:54:44,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:54:45,320 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.823e+02 1.995e+02 2.246e+02 2.877e+02, threshold=3.989e+02, percent-clipped=0.0 2023-10-02 19:54:49,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:54:50,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 19:54:53,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 19:54:58,471 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 19:54:59,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:55:02,016 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.73 vs. limit=15.0 2023-10-02 19:55:02,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:55:05,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:55:06,513 INFO [train.py:1046] (3/4) Epoch 29, batch 1250, loss[loss=0.165, simple_loss=0.2533, pruned_loss=0.03831, over 24563.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2445, pruned_loss=0.04383, over 4721235.65 frames. ], batch size: 71, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 19:55:06,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:55:07,515 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.99 vs. limit=15.0 2023-10-02 19:55:08,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 19:55:12,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:55:13,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:55:13,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 19:55:16,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 19:55:16,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 19:55:23,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 19:55:23,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:55:24,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 19:55:24,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:55:26,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 19:55:31,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 19:55:32,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:55:32,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:55:34,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:55:34,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:55:36,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:55:36,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 19:55:42,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 19:55:42,875 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.26 vs. limit=15.0 2023-10-02 19:55:43,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 19:55:45,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:55:46,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 19:55:46,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:55:46,480 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 19:55:46,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:55:46,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:55:52,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:55:55,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 19:55:56,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:55:57,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 19:55:59,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 19:55:59,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 19:56:01,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:56:03,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 19:56:05,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:56:06,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 19:56:06,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:56:09,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 19:56:09,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 19:56:10,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 19:56:10,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 19:56:10,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:56:13,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 19:56:16,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:56:17,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:56:18,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 19:56:20,360 INFO [train.py:1046] (3/4) Epoch 29, batch 1300, loss[loss=0.1411, simple_loss=0.2231, pruned_loss=0.02954, over 21209.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2442, pruned_loss=0.0438, over 4718560.25 frames. ], batch size: 46, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 19:56:21,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 19:56:23,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:56:23,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 19:56:28,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:56:31,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 19:56:32,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:56:33,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1000266.6666666666, ans=0.125 2023-10-02 19:56:34,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:56:36,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 19:56:36,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 19:56:40,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:56:42,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 19:56:42,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1000333.3333333334, ans=0.0 2023-10-02 19:56:43,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 19:56:45,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 19:56:48,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:56:49,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1000400.0, ans=0.125 2023-10-02 19:56:51,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:56:51,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:56:51,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:56:53,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 19:56:54,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 19:56:54,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 19:57:01,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:57:01,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 19:57:02,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 19:57:04,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 19:57:04,975 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.50 vs. limit=15.0 2023-10-02 19:57:06,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 19:57:09,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:57:09,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 19:57:10,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:57:10,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 19:57:12,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:57:14,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:57:16,070 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.865e+02 2.006e+02 2.214e+02 3.009e+02, threshold=4.012e+02, percent-clipped=0.0 2023-10-02 19:57:16,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:57:17,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1000466.6666666666, ans=0.125 2023-10-02 19:57:18,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 19:57:18,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 19:57:20,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 19:57:24,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:57:27,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 19:57:27,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1000533.3333333334, ans=0.07 2023-10-02 19:57:29,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:57:31,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1000533.3333333334, ans=0.0 2023-10-02 19:57:35,533 INFO [train.py:1046] (3/4) Epoch 29, batch 1350, loss[loss=0.1487, simple_loss=0.2126, pruned_loss=0.04235, over 23710.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2427, pruned_loss=0.04336, over 4715947.73 frames. ], batch size: 232, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 19:57:36,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1000600.0, ans=0.125 2023-10-02 19:57:37,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 19:57:40,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:57:41,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.89 vs. limit=22.5 2023-10-02 19:57:42,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:57:45,240 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.64 vs. limit=15.0 2023-10-02 19:57:45,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:57:45,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:57:47,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:57:48,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:57:51,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 19:57:52,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 19:57:54,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:57:54,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 19:57:57,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 19:57:57,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:57:58,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:57:58,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 19:58:01,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 19:58:03,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 19:58:05,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:58:05,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 19:58:15,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:58:24,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:58:24,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:58:24,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1000800.0, ans=0.125 2023-10-02 19:58:25,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 19:58:27,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:58:29,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 19:58:29,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 19:58:29,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 19:58:32,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 19:58:33,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 19:58:35,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 19:58:37,992 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.71 vs. limit=15.0 2023-10-02 19:58:41,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 19:58:44,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 19:58:45,813 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 19:58:48,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 19:58:49,930 INFO [train.py:1046] (3/4) Epoch 29, batch 1400, loss[loss=0.1487, simple_loss=0.2106, pruned_loss=0.04342, over 22688.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2414, pruned_loss=0.04273, over 4718221.07 frames. ], batch size: 322, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 19:58:49,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 19:58:53,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 19:58:53,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 19:58:59,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 19:58:59,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 19:59:09,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 19:59:11,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:59:14,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 19:59:14,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 19:59:16,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 19:59:18,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 19:59:28,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:59:28,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:59:33,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 19:59:33,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 19:59:33,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 19:59:35,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 19:59:35,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 19:59:36,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 19:59:36,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 19:59:36,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 19:59:37,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1001133.3333333334, ans=0.125 2023-10-02 19:59:38,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 19:59:38,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 19:59:44,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1001133.3333333334, ans=0.125 2023-10-02 19:59:45,129 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 1.875e+02 2.105e+02 2.508e+02 3.725e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-02 19:59:45,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 19:59:49,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 19:59:54,552 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.08 vs. limit=15.0 2023-10-02 19:59:56,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 19:59:58,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 19:59:59,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:00:01,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 20:00:03,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:00:04,353 INFO [train.py:1046] (3/4) Epoch 29, batch 1450, loss[loss=0.1739, simple_loss=0.251, pruned_loss=0.04836, over 23777.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2411, pruned_loss=0.04287, over 4718142.95 frames. ], batch size: 179, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:00:04,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:00:08,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:00:10,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:00:11,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:11,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 20:00:15,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:00:15,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:00:18,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:00:18,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 20:00:18,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:00:19,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 20:00:21,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:22,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:22,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 20:00:23,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:00:23,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:00:24,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 20:00:25,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:25,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:00:26,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:29,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:32,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:00:32,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:00:37,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:00:37,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:39,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:00:39,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:00:39,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:00:39,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:00:44,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 20:00:46,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:00:49,703 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 20:00:51,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:00:52,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:00:52,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:00:53,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 20:00:54,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.01 vs. limit=15.0 2023-10-02 20:00:56,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:00:57,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 20:01:01,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 20:01:02,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:01:05,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:01:07,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:01:07,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 20:01:10,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 20:01:10,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 20:01:11,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.73 vs. limit=15.0 2023-10-02 20:01:13,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:01:13,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:01:19,196 INFO [train.py:1046] (3/4) Epoch 29, batch 1500, loss[loss=0.1591, simple_loss=0.2391, pruned_loss=0.03953, over 23702.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2413, pruned_loss=0.04274, over 4709165.68 frames. ], batch size: 232, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:01:19,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1001600.0, ans=0.125 2023-10-02 20:01:23,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 20:01:23,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:01:23,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:01:24,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:01:26,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:01:26,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:01:27,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 20:01:29,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:01:29,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:01:31,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:01:31,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:01:31,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:01:32,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:01:36,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1001666.6666666666, ans=0.0 2023-10-02 20:01:40,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:01:40,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 20:01:40,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:01:40,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:01:40,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1001666.6666666666, ans=0.125 2023-10-02 20:01:42,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:01:46,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 20:01:50,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 20:01:52,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:01:52,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 20:01:54,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:01:57,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:01:57,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:01:58,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:02:00,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 20:02:00,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:02:00,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:02:01,539 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.84 vs. limit=10.0 2023-10-02 20:02:02,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 20:02:02,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:02:08,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1001800.0, ans=0.0 2023-10-02 20:02:09,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:02:09,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 20:02:13,647 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.835e+02 2.062e+02 2.409e+02 3.555e+02, threshold=4.124e+02, percent-clipped=0.0 2023-10-02 20:02:14,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1001800.0, ans=0.125 2023-10-02 20:02:15,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:02:16,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:02:21,313 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 20:02:21,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:21,375 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 20:02:23,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:02:23,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:02:23,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.91 vs. limit=15.0 2023-10-02 20:02:24,612 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 20:02:26,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:02:28,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 20:02:30,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:33,248 INFO [train.py:1046] (3/4) Epoch 29, batch 1550, loss[loss=0.1582, simple_loss=0.2409, pruned_loss=0.03773, over 24685.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2426, pruned_loss=0.0426, over 4726054.99 frames. ], batch size: 65, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:02:33,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:02:33,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:34,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:02:34,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:02:36,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:02:36,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 20:02:38,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 20:02:39,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:02:39,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 20:02:39,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 20:02:42,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:02:43,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:02:43,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:02:43,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:02:46,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:02:46,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:02:49,060 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 20:02:49,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:02:51,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:02:51,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:02:52,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:02:52,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 20:02:55,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:02:55,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 20:02:55,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 20:02:56,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 20:02:56,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:02:56,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:03:01,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:03:02,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 20:03:02,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 20:03:02,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1002066.6666666666, ans=0.0 2023-10-02 20:03:04,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1002066.6666666666, ans=0.0 2023-10-02 20:03:09,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff2.min_abs, batch_count=1002066.6666666666, ans=0.1 2023-10-02 20:03:11,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:03:15,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:03:15,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:03:15,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:03:16,898 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.73 vs. limit=5.0 2023-10-02 20:03:17,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 20:03:21,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1002133.3333333334, ans=0.04949747468305833 2023-10-02 20:03:23,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:03:25,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:03:27,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:03:30,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:03:30,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:03:30,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 20:03:31,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:03:33,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:03:33,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:03:33,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1002200.0, ans=0.0 2023-10-02 20:03:34,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 20:03:34,712 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 20:03:38,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:03:42,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 20:03:46,752 INFO [train.py:1046] (3/4) Epoch 29, batch 1600, loss[loss=0.1678, simple_loss=0.2534, pruned_loss=0.04116, over 24454.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2436, pruned_loss=0.04314, over 4720591.14 frames. ], batch size: 69, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 20:03:46,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:03:48,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:03:49,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 20:03:49,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:03:49,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:03:49,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:03:51,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:03:52,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:03:54,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:03:54,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1002266.6666666666, ans=0.1 2023-10-02 20:03:55,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 20:03:57,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 20:03:59,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 20:04:00,732 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:04:01,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:04:03,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 20:04:03,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:04:04,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:04:11,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:04:12,978 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.54 vs. limit=15.0 2023-10-02 20:04:13,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 20:04:16,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:04:18,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 20:04:18,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:18,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 20:04:24,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 20:04:31,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:04:31,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 20:04:33,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:04:33,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:04:33,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:04:35,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1002466.6666666666, ans=0.125 2023-10-02 20:04:37,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 20:04:40,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:04:42,331 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.872e+02 2.190e+02 2.383e+02 3.841e+02, threshold=4.379e+02, percent-clipped=0.0 2023-10-02 20:04:42,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:04:42,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:42,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:43,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:04:45,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:04:46,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:04:49,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:04:55,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:04:55,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:04:58,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 20:04:58,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:04:58,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 20:05:00,932 INFO [train.py:1046] (3/4) Epoch 29, batch 1650, loss[loss=0.1767, simple_loss=0.2438, pruned_loss=0.05478, over 23654.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2443, pruned_loss=0.04372, over 4710639.93 frames. ], batch size: 232, lr: 3.54e-03, grad_scale: 16.0 2023-10-02 20:05:02,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:05:03,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.38 vs. limit=10.0 2023-10-02 20:05:04,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:05:05,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:05:05,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 20:05:05,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 20:05:05,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 20:05:07,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 20:05:10,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:05:12,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:05:13,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:05:13,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:05:16,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:05:17,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 20:05:19,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:05:19,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:05:19,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:05:19,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:05:20,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 20:05:20,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 20:05:25,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:05:28,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:05:34,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 20:05:36,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:05:37,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1002733.3333333334, ans=0.0 2023-10-02 20:05:38,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 20:05:42,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:05:43,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1002733.3333333334, ans=0.1 2023-10-02 20:05:44,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:05:46,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:05:46,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:05:47,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:05:47,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:05:50,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:05:51,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:05:52,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:05:52,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:05:53,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:05:53,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:05:56,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:05:56,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 20:05:57,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:05:57,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 20:06:00,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 20:06:00,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 20:06:00,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:06:02,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:06:02,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:06:03,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:06:03,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 20:06:07,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:06:10,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:06:10,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:06:10,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1002866.6666666666, ans=0.2 2023-10-02 20:06:11,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 20:06:13,930 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.97 vs. limit=8.0 2023-10-02 20:06:15,360 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.58 vs. limit=22.5 2023-10-02 20:06:16,116 INFO [train.py:1046] (3/4) Epoch 29, batch 1700, loss[loss=0.1439, simple_loss=0.2045, pruned_loss=0.04162, over 22654.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2434, pruned_loss=0.04396, over 4696231.56 frames. ], batch size: 322, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:06:16,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:06:16,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:06:16,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 20:06:17,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:06:17,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:06:17,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:06:19,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:06:19,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:06:20,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 20:06:21,620 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.31 vs. limit=22.5 2023-10-02 20:06:23,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:06:32,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:06:34,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:06:38,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:06:40,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:06:40,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:06:40,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:06:43,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 20:06:44,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:06:44,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:06:45,618 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.03 vs. limit=15.0 2023-10-02 20:06:46,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:06:47,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:06:49,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 20:06:50,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 20:06:51,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:06:52,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 20:06:54,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:06:56,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1003066.6666666666, ans=0.125 2023-10-02 20:07:02,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:07:04,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:04,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:07:04,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1003133.3333333334, ans=0.125 2023-10-02 20:07:05,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:07:07,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 20:07:07,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:07:09,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:07:09,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 20:07:10,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:07:10,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:07:10,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:07:10,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:07:13,659 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.858e+02 2.032e+02 2.351e+02 3.196e+02, threshold=4.063e+02, percent-clipped=0.0 2023-10-02 20:07:13,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:07:13,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:07:15,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:16,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:07:17,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:07:20,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:07:22,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 20:07:24,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:07:25,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:07:27,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 20:07:28,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1003200.0, ans=0.125 2023-10-02 20:07:31,193 INFO [train.py:1046] (3/4) Epoch 29, batch 1750, loss[loss=0.1738, simple_loss=0.2484, pruned_loss=0.04959, over 23492.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2425, pruned_loss=0.04363, over 4698074.42 frames. ], batch size: 106, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:07:32,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:34,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:07:35,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:07:35,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 20:07:35,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:07:40,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:07:40,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:07:43,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 20:07:45,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:07:46,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1003333.3333333334, ans=0.1 2023-10-02 20:07:48,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 20:07:48,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:07:49,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:07:52,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 20:07:52,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 20:07:55,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:07:55,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 20:07:55,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1003333.3333333334, ans=0.0 2023-10-02 20:07:55,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1003333.3333333334, ans=0.125 2023-10-02 20:08:00,432 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.48 vs. limit=15.0 2023-10-02 20:08:01,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:08:01,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1003400.0, ans=0.1 2023-10-02 20:08:04,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:08:04,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:08:05,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1003400.0, ans=0.1 2023-10-02 20:08:07,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:07,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:08:09,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:08:11,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:13,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:08:13,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:08:15,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 20:08:18,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:08:18,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1003466.6666666666, ans=0.125 2023-10-02 20:08:21,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 20:08:21,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:08:21,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1003466.6666666666, ans=0.0 2023-10-02 20:08:24,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:08:24,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:08:28,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:08:29,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 20:08:29,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:31,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:08:31,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_na.min_abs, batch_count=1003533.3333333334, ans=0.02 2023-10-02 20:08:33,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:08:34,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1003533.3333333334, ans=0.125 2023-10-02 20:08:37,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:08:39,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:08:39,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 20:08:39,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:08:41,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:08:41,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:08:41,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:08:41,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:08:42,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:08:45,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:08:46,643 INFO [train.py:1046] (3/4) Epoch 29, batch 1800, loss[loss=0.1708, simple_loss=0.2602, pruned_loss=0.04073, over 24440.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2424, pruned_loss=0.04349, over 4707972.43 frames. ], batch size: 69, lr: 3.54e-03, grad_scale: 8.0 2023-10-02 20:08:46,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:08:48,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:08:50,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:08:54,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:08:55,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:08:58,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:01,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:09:01,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:09:02,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:09:03,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:09:03,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 20:09:04,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:06,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:12,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 20:09:13,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 20:09:13,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 20:09:15,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:15,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:09:15,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:09:16,019 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=12.32 vs. limit=15.0 2023-10-02 20:09:16,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:09:22,528 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 20:09:23,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:09:25,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:26,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 20:09:28,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 20:09:28,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:09:31,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:09:32,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:09:36,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 20:09:40,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1003800.0, ans=0.125 2023-10-02 20:09:42,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:09:43,455 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.936e+02 2.168e+02 2.501e+02 3.680e+02, threshold=4.336e+02, percent-clipped=0.0 2023-10-02 20:09:43,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 20:09:43,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:09:43,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:45,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:09:46,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 20:09:48,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:09:49,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:09:52,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 20:09:52,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:09:53,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1003866.6666666666, ans=0.2 2023-10-02 20:09:54,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:09:54,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:09:54,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:54,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1003866.6666666666, ans=0.125 2023-10-02 20:09:55,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:09:55,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:09:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:09:58,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:10:01,020 INFO [train.py:1046] (3/4) Epoch 29, batch 1850, loss[loss=0.1647, simple_loss=0.2445, pruned_loss=0.04245, over 24584.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2436, pruned_loss=0.04421, over 4698962.32 frames. ], batch size: 60, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:10:01,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:10:01,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1003933.3333333334, ans=0.0 2023-10-02 20:10:02,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:10:09,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:10:09,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 20:10:12,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 20:10:16,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 20:10:21,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:10:21,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 20:10:21,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 20:10:30,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:10:32,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 20:10:35,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:10:35,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:10:39,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 20:10:39,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:10:40,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:10:40,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:10:43,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:10:45,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:10:49,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:10:51,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:10:51,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 20:10:51,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:10:55,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:10:55,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:10:58,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 20:10:58,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:11:02,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:11:03,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:11:03,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 20:11:03,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 20:11:05,232 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 20:11:06,593 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 20:11:06,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:11:06,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:11:06,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:11:07,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:09,253 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 20:11:09,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:11:09,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:10,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:11:10,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:11:12,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:11:13,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 20:11:14,868 INFO [train.py:1046] (3/4) Epoch 29, batch 1900, loss[loss=0.1551, simple_loss=0.2452, pruned_loss=0.03249, over 24635.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2437, pruned_loss=0.04437, over 4699215.90 frames. ], batch size: 68, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:11:14,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:14,992 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 20:11:15,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:11:17,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:11:21,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:11:23,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:11:23,187 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 20:11:25,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 20:11:26,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:11:27,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:11:27,822 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 20:11:27,848 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 20:11:30,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 20:11:32,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:11:34,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1004333.3333333334, ans=0.07 2023-10-02 20:11:36,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 20:11:38,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 20:11:48,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 20:11:50,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 20:11:52,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:11:52,810 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 20:11:54,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 20:11:54,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 20:11:54,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 20:11:54,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:11:58,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 20:12:00,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:12:03,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:12:03,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 20:12:03,675 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.88 vs. limit=15.0 2023-10-02 20:12:05,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:12:08,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 20:12:08,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:12:11,360 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.825e+02 1.966e+02 2.194e+02 3.470e+02, threshold=3.932e+02, percent-clipped=0.0 2023-10-02 20:12:16,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:12:16,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:12:16,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:12:16,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:12:17,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:12:19,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:12:19,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:12:23,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:12:23,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:12:26,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:12:26,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:12:28,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:12:30,118 INFO [train.py:1046] (3/4) Epoch 29, batch 1950, loss[loss=0.1602, simple_loss=0.2359, pruned_loss=0.04222, over 19432.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2441, pruned_loss=0.04432, over 4698749.78 frames. ], batch size: 42, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:12:30,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:12:32,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:12:34,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:12:34,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:12:34,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:12:35,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 20:12:37,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 20:12:38,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:12:39,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:12:42,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:12:42,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:12:42,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:12:44,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1004666.6666666666, ans=0.125 2023-10-02 20:12:45,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:12:49,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:12:49,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:12:49,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:12:49,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:12:54,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:12:58,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:12:58,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:12:58,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:12:58,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 20:12:58,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1004733.3333333334, ans=0.125 2023-10-02 20:12:59,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:12:59,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:12:59,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:13:03,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.71 vs. limit=12.0 2023-10-02 20:13:04,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:13:07,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:13:09,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:13:12,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:13:12,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1004800.0, ans=0.125 2023-10-02 20:13:14,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:13:14,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 20:13:14,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:13:19,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:13:20,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:13:21,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:13:26,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1004800.0, ans=0.1 2023-10-02 20:13:30,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:30,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:33,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:36,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:13:38,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:13:38,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:13:38,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 20:13:38,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:13:38,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:13:41,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 20:13:41,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:13:43,992 INFO [train.py:1046] (3/4) Epoch 29, batch 2000, loss[loss=0.2201, simple_loss=0.2879, pruned_loss=0.07618, over 19391.00 frames. ], tot_loss[loss=0.1672, simple_loss=0.2452, pruned_loss=0.0446, over 4706256.87 frames. ], batch size: 389, lr: 3.53e-03, grad_scale: 16.0 2023-10-02 20:13:45,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:13:46,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:13:46,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:13:49,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:13:52,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:13:55,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 20:13:55,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:13:59,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:14:01,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 20:14:02,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:14:02,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:14:04,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:14:05,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 20:14:07,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:09,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:09,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:09,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1005000.0, ans=0.1 2023-10-02 20:14:10,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 20:14:11,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:14:13,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 20:14:13,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:14:14,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:14:16,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 20:14:16,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:17,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:14:18,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:14:20,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 20:14:23,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 20:14:23,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:14:23,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:25,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1005066.6666666666, ans=0.2 2023-10-02 20:14:27,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:14:29,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:14:29,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:14:29,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:14:31,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1005133.3333333334, ans=0.1 2023-10-02 20:14:32,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:14:32,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:14:33,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:14:33,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:14:35,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:38,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:14:38,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 20:14:39,876 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.873e+02 1.992e+02 2.268e+02 3.359e+02, threshold=3.985e+02, percent-clipped=0.0 2023-10-02 20:14:41,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:14:42,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:44,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:44,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:14:47,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1005200.0, ans=0.125 2023-10-02 20:14:48,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:49,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:14:49,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:51,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:14:51,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:14:54,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:14:56,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:57,709 INFO [train.py:1046] (3/4) Epoch 29, batch 2050, loss[loss=0.17, simple_loss=0.2591, pruned_loss=0.0405, over 24569.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2443, pruned_loss=0.04425, over 4697064.78 frames. ], batch size: 71, lr: 3.53e-03, grad_scale: 16.0 2023-10-02 20:14:57,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:14:57,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:14:58,513 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.65 vs. limit=15.0 2023-10-02 20:15:03,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:15:03,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1005266.6666666666, ans=0.2 2023-10-02 20:15:05,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:15:05,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:15:07,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:15:10,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 20:15:10,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:15:12,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:15:13,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:15:22,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1005333.3333333334, ans=0.1 2023-10-02 20:15:23,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:15:23,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:15:24,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1005333.3333333334, ans=0.0 2023-10-02 20:15:25,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 20:15:28,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:15:28,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 20:15:28,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:15:33,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:15:35,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:15:37,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:15:37,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:15:38,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:15:40,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:15:40,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:15:42,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:15:43,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1005466.6666666666, ans=0.1 2023-10-02 20:15:44,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:15:46,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1005466.6666666666, ans=0.0 2023-10-02 20:15:47,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:15:48,039 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.70 vs. limit=22.5 2023-10-02 20:15:48,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:15:52,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:15:55,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1005533.3333333334, ans=0.1 2023-10-02 20:15:57,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:15:58,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 20:16:04,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:16:04,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:16:06,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:16:10,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 20:16:10,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1005600.0, ans=0.0 2023-10-02 20:16:11,384 INFO [train.py:1046] (3/4) Epoch 29, batch 2100, loss[loss=0.1481, simple_loss=0.2257, pruned_loss=0.03525, over 24326.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2437, pruned_loss=0.04346, over 4710713.87 frames. ], batch size: 56, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:16:12,813 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 20:16:12,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:13,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1005600.0, ans=0.0 2023-10-02 20:16:14,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:16:14,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:16:15,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:16:15,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 20:16:15,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 20:16:17,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:16:17,838 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.70 vs. limit=15.0 2023-10-02 20:16:19,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:16:19,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:16:24,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:25,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:16:25,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 20:16:27,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:16:27,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 20:16:27,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 20:16:28,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:16:28,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:16:28,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 20:16:30,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 20:16:35,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 20:16:35,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:16:38,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:16:38,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.53 vs. limit=15.0 2023-10-02 20:16:39,021 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.67 vs. limit=15.0 2023-10-02 20:16:39,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:16:41,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:16:43,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 20:16:43,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:16:43,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 20:16:45,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 20:16:45,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:45,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 20:16:45,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 20:16:47,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 20:16:48,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:16:51,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:16:52,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:16:54,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:16:56,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:16:57,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:16:57,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 20:16:57,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:16:57,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:16:58,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:16:58,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 20:16:59,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.51 vs. limit=12.0 2023-10-02 20:17:00,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 20:17:02,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 20:17:06,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:17:09,076 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.849e+02 2.136e+02 2.701e+02 4.119e+02, threshold=4.273e+02, percent-clipped=1.0 2023-10-02 20:17:09,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:17:09,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 20:17:16,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:17:17,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:17:18,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1005866.6666666666, ans=0.125 2023-10-02 20:17:19,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:17:19,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:17:19,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 20:17:19,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:17:20,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:17:20,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:17:22,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:17:22,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:17:23,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 20:17:25,463 INFO [train.py:1046] (3/4) Epoch 29, batch 2150, loss[loss=0.1658, simple_loss=0.2457, pruned_loss=0.04293, over 23590.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.243, pruned_loss=0.04311, over 4710579.78 frames. ], batch size: 149, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:17:26,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 20:17:26,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:17:28,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:17:28,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:17:28,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:17:29,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:17:34,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 20:17:37,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:17:38,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:17:40,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:17:40,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:40,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:17:43,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:17:44,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:17:44,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:17:46,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1006000.0, ans=0.125 2023-10-02 20:17:47,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:48,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 20:17:51,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:17:53,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:17:53,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:54,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:17:54,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:17:55,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:17:55,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:17:55,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:17:55,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:17:57,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1006066.6666666666, ans=0.125 2023-10-02 20:17:59,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 20:17:59,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:18:00,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:18:00,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:18:01,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:18:03,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:18:07,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:18:07,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:18:07,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:18:07,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 20:18:07,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1006066.6666666666, ans=0.125 2023-10-02 20:18:08,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:18:13,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:18:13,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:13,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:18:14,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:18:15,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:16,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:16,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 20:18:17,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 20:18:18,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:18:18,892 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 20:18:20,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:20,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:18:21,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 20:18:21,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:18:21,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 20:18:21,489 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 20:18:21,489 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 20:18:22,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 20:18:24,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:24,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:18:24,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:18:25,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:25,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:18:29,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:29,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:37,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:18:39,726 INFO [train.py:1046] (3/4) Epoch 29, batch 2200, loss[loss=0.1536, simple_loss=0.2456, pruned_loss=0.0308, over 24451.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2427, pruned_loss=0.04289, over 4720115.51 frames. ], batch size: 66, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:18:39,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 20:18:42,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:18:48,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:18:48,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:18:49,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:18:49,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:18:52,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:18:52,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:18:52,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 20:18:56,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 20:18:59,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:18:59,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1006333.3333333334, ans=0.125 2023-10-02 20:19:05,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 20:19:07,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:09,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:19:09,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:19:13,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:19:13,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 20:19:15,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1006400.0, ans=0.1 2023-10-02 20:19:16,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:19:17,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:19,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 20:19:20,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:19:22,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:19:22,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1006466.6666666666, ans=0.125 2023-10-02 20:19:23,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:19:24,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:19:26,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 20:19:27,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:19:29,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 20:19:31,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:19:32,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:19:32,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:19:34,071 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.27 vs. limit=15.0 2023-10-02 20:19:34,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:19:34,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:19:34,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:19:36,202 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.909e+02 2.141e+02 2.599e+02 8.500e+02, threshold=4.282e+02, percent-clipped=2.0 2023-10-02 20:19:36,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:19:36,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:19:38,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:19:39,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:19:39,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1006533.3333333334, ans=0.125 2023-10-02 20:19:43,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 20:19:44,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:19:46,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:19:47,395 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 20:19:48,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:19:48,815 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 20:19:50,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:19:51,525 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 20:19:51,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1006600.0, ans=10.0 2023-10-02 20:19:52,850 INFO [train.py:1046] (3/4) Epoch 29, batch 2250, loss[loss=0.1739, simple_loss=0.2437, pruned_loss=0.05207, over 23488.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2428, pruned_loss=0.04303, over 4723371.80 frames. ], batch size: 285, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:19:52,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:52,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:19:54,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:19:54,304 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 20:19:56,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:20:00,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:20:05,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:20:06,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:20:06,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1006666.6666666666, ans=0.0 2023-10-02 20:20:09,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:20:11,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:20:11,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:20:14,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 20:20:14,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:20:16,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:20:17,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 20:20:18,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:20:18,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:20:21,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:20:23,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1006733.3333333334, ans=0.0 2023-10-02 20:20:26,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:20:28,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:20:28,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:20:29,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 20:20:31,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:20:34,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:20:37,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:20:38,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:20:40,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:20:40,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:20:41,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:20:43,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:20:48,040 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.79 vs. limit=15.0 2023-10-02 20:20:48,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:20:49,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:20:50,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1006800.0, ans=0.1 2023-10-02 20:20:54,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:20:54,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:20:55,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:20:58,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:20:58,942 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.92 vs. limit=12.0 2023-10-02 20:21:00,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:21:00,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 20:21:00,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:00,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:21:04,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 20:21:06,752 INFO [train.py:1046] (3/4) Epoch 29, batch 2300, loss[loss=0.1799, simple_loss=0.2501, pruned_loss=0.05485, over 23589.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2431, pruned_loss=0.04351, over 4728779.23 frames. ], batch size: 256, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:21:06,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:21:06,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:12,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:13,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:21:16,151 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 20:21:16,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1006933.3333333334, ans=0.1 2023-10-02 20:21:17,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:21:23,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1007000.0, ans=0.0 2023-10-02 20:21:24,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:21:24,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 20:21:24,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:21:25,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:21:25,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 20:21:27,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:21:30,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:21:31,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:21:31,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1007000.0, ans=0.0 2023-10-02 20:21:34,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:21:37,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:21:39,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:21:39,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.18 vs. limit=6.0 2023-10-02 20:21:44,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:21:45,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:21:45,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1007066.6666666666, ans=0.025 2023-10-02 20:21:47,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:21:47,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1007066.6666666666, ans=0.5 2023-10-02 20:21:49,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:21:53,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:21:54,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:21:54,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:21:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 20:22:00,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:22:00,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:22:01,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:22:01,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:22:01,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:22:02,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 20:22:02,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:22:02,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 20:22:02,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:22:02,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:22:04,288 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.800e+02 1.972e+02 2.166e+02 3.182e+02, threshold=3.945e+02, percent-clipped=0.0 2023-10-02 20:22:04,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 20:22:10,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:22:12,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:22:17,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:22:17,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:22:17,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:22:19,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:22:20,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:22:20,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1007266.6666666666, ans=0.2 2023-10-02 20:22:21,264 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.90 vs. limit=15.0 2023-10-02 20:22:21,777 INFO [train.py:1046] (3/4) Epoch 29, batch 2350, loss[loss=0.1679, simple_loss=0.228, pruned_loss=0.05388, over 22565.00 frames. ], tot_loss[loss=0.1666, simple_loss=0.2442, pruned_loss=0.04454, over 4714196.05 frames. ], batch size: 322, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:22:21,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:22:21,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 20:22:27,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:22:27,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 20:22:32,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 20:22:37,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:22:40,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:22:40,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:22:40,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:22:40,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:22:41,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 20:22:47,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:22:51,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 20:22:52,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:22:57,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:22:57,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:22:58,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:22:58,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 20:23:00,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:23:00,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1007400.0, ans=0.125 2023-10-02 20:23:01,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:23:01,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:23:02,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:23:06,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:23:08,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 20:23:08,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:23:09,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1007466.6666666666, ans=0.125 2023-10-02 20:23:09,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1007466.6666666666, ans=0.1 2023-10-02 20:23:11,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:23:11,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:23:13,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 20:23:15,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:23:16,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 20:23:16,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:23:20,777 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:23:21,213 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.55 vs. limit=22.5 2023-10-02 20:23:21,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 20:23:24,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 20:23:26,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:23:26,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 20:23:26,046 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 20:23:26,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 20:23:29,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 20:23:30,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:23:35,429 INFO [train.py:1046] (3/4) Epoch 29, batch 2400, loss[loss=0.1669, simple_loss=0.2333, pruned_loss=0.05027, over 20836.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2433, pruned_loss=0.04416, over 4718216.96 frames. ], batch size: 45, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:23:35,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:23:38,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:23:39,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:23:39,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 20:23:41,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 20:23:47,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:23:47,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:23:49,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 20:23:50,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:23:51,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:23:51,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 20:23:55,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1007666.6666666666, ans=0.0 2023-10-02 20:23:56,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:23:57,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1007666.6666666666, ans=0.125 2023-10-02 20:23:59,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 20:23:59,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.56 vs. limit=10.0 2023-10-02 20:24:00,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1007666.6666666666, ans=0.0 2023-10-02 20:24:05,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:24:07,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 20:24:09,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:24:10,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:24:15,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:24:15,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 20:24:17,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:24:22,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:24:26,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:24:28,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:24:30,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:24:30,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 20:24:30,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:24:30,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:24:30,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:24:30,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:24:35,051 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.909e+02 2.127e+02 2.478e+02 3.965e+02, threshold=4.255e+02, percent-clipped=1.0 2023-10-02 20:24:35,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:24:35,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:24:35,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 20:24:36,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 20:24:38,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:24:38,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:24:38,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 20:24:39,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 20:24:39,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 20:24:39,439 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 20:24:41,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 20:24:42,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:24:45,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:24:45,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:24:46,686 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 20:24:48,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:24:48,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 20:24:50,012 INFO [train.py:1046] (3/4) Epoch 29, batch 2450, loss[loss=0.1909, simple_loss=0.274, pruned_loss=0.05389, over 23353.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2423, pruned_loss=0.04399, over 4702338.84 frames. ], batch size: 93, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:24:51,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:24:51,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:24:51,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1007933.3333333334, ans=0.015 2023-10-02 20:24:52,145 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.57 vs. limit=22.5 2023-10-02 20:24:55,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:24:55,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:24:58,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 20:25:03,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:25:03,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:25:07,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:25:07,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:25:07,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:25:08,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 20:25:12,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:25:14,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:25:15,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:25:16,383 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.14 vs. limit=15.0 2023-10-02 20:25:18,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:25:20,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:25:21,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:25:21,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:25:24,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 20:25:24,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:25:30,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:25:31,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:25:31,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:25:32,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:25:33,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:25:35,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:25:36,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 20:25:36,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1008133.3333333334, ans=0.125 2023-10-02 20:25:40,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:25:40,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:25:44,471 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:25:45,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:25:45,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:25:48,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:25:48,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 20:25:48,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:25:49,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:25:49,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 20:25:51,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:25:51,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:25:55,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:25:59,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:25:59,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:26:02,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 20:26:04,388 INFO [train.py:1046] (3/4) Epoch 29, batch 2500, loss[loss=0.1594, simple_loss=0.2405, pruned_loss=0.03914, over 24425.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2422, pruned_loss=0.04352, over 4707978.95 frames. ], batch size: 58, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:26:04,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:26:09,501 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.70 vs. limit=6.0 2023-10-02 20:26:09,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:26:11,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1008266.6666666666, ans=0.0 2023-10-02 20:26:17,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:26:18,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:26:18,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:26:18,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 20:26:24,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:26:26,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:26:26,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 20:26:26,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:26:27,264 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-10-02 20:26:27,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 20:26:27,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:26:29,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:26:29,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 20:26:29,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:26:30,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 20:26:30,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:26:34,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:26:35,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.25 vs. limit=15.0 2023-10-02 20:26:36,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:26:37,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:26:39,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 20:26:40,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:26:40,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:26:45,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:26:46,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:26:49,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:26:56,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:26:58,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 20:26:59,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:26:59,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:27:01,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:27:01,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:27:01,261 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 20:27:01,262 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 20:27:01,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 20:27:03,386 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.75 vs. limit=15.0 2023-10-02 20:27:04,409 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.824e+02 2.011e+02 2.167e+02 3.747e+02, threshold=4.021e+02, percent-clipped=0.0 2023-10-02 20:27:04,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:27:06,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 20:27:06,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 20:27:07,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:27:08,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 20:27:13,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 20:27:17,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:27:18,562 INFO [train.py:1046] (3/4) Epoch 29, batch 2550, loss[loss=0.1635, simple_loss=0.2384, pruned_loss=0.04425, over 23597.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2425, pruned_loss=0.04381, over 4696782.54 frames. ], batch size: 149, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:27:18,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:27:18,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:27:20,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:27:21,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 20:27:23,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:27:26,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 20:27:26,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:27:27,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1008600.0, ans=0.125 2023-10-02 20:27:29,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:27:32,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:27:32,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 20:27:32,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:27:32,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:27:32,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:27:35,839 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.46 vs. limit=15.0 2023-10-02 20:27:36,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:27:36,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 20:27:36,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:27:36,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:27:36,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 20:27:48,218 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.07 vs. limit=22.5 2023-10-02 20:27:50,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:27:53,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1008733.3333333334, ans=0.0 2023-10-02 20:27:54,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:27:54,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:27:54,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:27:56,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:27:58,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1008733.3333333334, ans=0.2 2023-10-02 20:27:59,631 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.12 vs. limit=15.0 2023-10-02 20:28:00,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1008733.3333333334, ans=0.125 2023-10-02 20:28:03,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:28:06,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:28:06,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:28:06,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:28:06,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:28:07,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:28:08,501 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-10-02 20:28:11,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:28:12,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:28:12,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1008800.0, ans=0.0 2023-10-02 20:28:15,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:28:15,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 20:28:15,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:28:17,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:28:18,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:28:19,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:28:21,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:28:27,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:28:28,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:28:32,970 INFO [train.py:1046] (3/4) Epoch 29, batch 2600, loss[loss=0.1872, simple_loss=0.2694, pruned_loss=0.05248, over 24401.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2429, pruned_loss=0.04374, over 4701549.13 frames. ], batch size: 77, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:28:33,018 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 20:28:36,082 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 20:28:36,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:28:36,134 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 20:28:37,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 20:28:37,534 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 20:28:39,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:28:40,930 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 20:28:42,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 20:28:43,736 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 20:28:45,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:28:46,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 20:28:48,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 20:28:49,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:28:50,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1009000.0, ans=0.125 2023-10-02 20:28:51,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 20:28:53,781 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 20:28:53,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 20:29:00,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:00,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1009000.0, ans=0.2 2023-10-02 20:29:01,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:29:01,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:29:01,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 20:29:02,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:29:08,908 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 20:29:15,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:29:15,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:15,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 20:29:16,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:29:16,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:29:17,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 20:29:22,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:29:22,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:29:22,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1009133.3333333334, ans=0.125 2023-10-02 20:29:25,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:29:28,063 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 20:29:28,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:29:29,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:29:32,333 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.423e+02 1.950e+02 2.073e+02 2.316e+02 4.084e+02, threshold=4.145e+02, percent-clipped=1.0 2023-10-02 20:29:33,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:29:35,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:29:35,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 20:29:36,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:29:38,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:29:38,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:29:42,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 20:29:43,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:46,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:29:47,470 INFO [train.py:1046] (3/4) Epoch 29, batch 2650, loss[loss=0.1989, simple_loss=0.2654, pruned_loss=0.06624, over 19466.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2441, pruned_loss=0.04407, over 4712972.36 frames. ], batch size: 388, lr: 3.53e-03, grad_scale: 8.0 2023-10-02 20:29:49,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 20:29:49,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:50,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:29:50,925 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 20:29:52,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:29:53,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:29:54,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:29:56,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:29:59,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:30:00,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 20:30:00,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:30:01,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:30:05,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 20:30:05,530 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 20:30:08,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:30:11,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 20:30:12,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:14,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 20:30:18,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:30:18,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:30:18,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:30:18,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:22,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 20:30:24,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 20:30:26,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:30:28,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 20:30:29,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:30:30,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:32,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:30:32,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:30:32,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:30:33,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:30:34,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:30:36,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:30:37,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:30:39,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:30:39,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:39,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:30:42,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:43,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:30:43,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 20:30:46,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:49,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:30:49,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:30:49,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 20:30:54,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:30:55,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:56,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:30:57,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:00,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:31:00,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:01,352 INFO [train.py:1046] (3/4) Epoch 29, batch 2700, loss[loss=0.169, simple_loss=0.243, pruned_loss=0.0475, over 23632.00 frames. ], tot_loss[loss=0.1667, simple_loss=0.2444, pruned_loss=0.04451, over 4697342.84 frames. ], batch size: 256, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:31:02,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:31:02,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 20:31:04,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:31:07,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 20:31:07,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1009600.0, ans=0.125 2023-10-02 20:31:08,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:31:08,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:08,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:10,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:31:10,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:31:11,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:31:11,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 20:31:11,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 20:31:11,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:31:15,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:31:16,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:31:16,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:31:19,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:31:20,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 20:31:20,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:31:26,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:31:26,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:31:34,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:31:34,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:31:34,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:31:34,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:31:35,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:31:38,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:31:38,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:31:38,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:31:42,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:42,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:31:50,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:31:51,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:31:54,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:31:54,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:31:56,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1009800.0, ans=0.125 2023-10-02 20:31:58,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:31:58,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:31:58,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:32:00,991 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.786e+02 2.084e+02 2.386e+02 3.655e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-02 20:32:01,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:01,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:32:03,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:32:05,274 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.25 vs. limit=15.0 2023-10-02 20:32:05,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:32:07,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:32:08,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:32:11,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 20:32:11,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:32:14,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:32:14,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 20:32:15,901 INFO [train.py:1046] (3/4) Epoch 29, batch 2750, loss[loss=0.1788, simple_loss=0.2463, pruned_loss=0.05567, over 23797.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2441, pruned_loss=0.0443, over 4712898.85 frames. ], batch size: 164, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:32:17,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 20:32:17,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:32:18,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:18,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:32:21,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:21,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:32:21,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:24,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:32:24,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1009933.3333333334, ans=0.125 2023-10-02 20:32:26,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:32:26,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:32:26,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:26,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 20:32:26,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:32:27,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:32:33,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 20:32:35,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:32:35,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.49 vs. limit=15.0 2023-10-02 20:32:36,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:36,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:32:36,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:32:38,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:32:38,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:32:39,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:39,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:43,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:32:43,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:32:45,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:32:45,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:32:47,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:32:48,905 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:32:53,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:32:55,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:32:55,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:00,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:33:00,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:33:00,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:33:06,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:33:06,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:33:06,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 20:33:11,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:13,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 20:33:13,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1010200.0, ans=0.125 2023-10-02 20:33:19,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 20:33:21,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:33:21,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 20:33:21,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:33:24,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:33:24,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 20:33:24,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:33:26,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1010200.0, ans=0.0 2023-10-02 20:33:27,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 20:33:29,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:33:29,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:33:29,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 20:33:30,441 INFO [train.py:1046] (3/4) Epoch 29, batch 2800, loss[loss=0.1565, simple_loss=0.2382, pruned_loss=0.03735, over 24455.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2427, pruned_loss=0.04379, over 4720802.48 frames. ], batch size: 66, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:33:30,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:33:30,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:31,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:33:33,838 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 20:33:33,839 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 20:33:35,704 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.90 vs. limit=12.0 2023-10-02 20:33:36,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:33:38,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:33:39,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:33:41,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1010266.6666666666, ans=0.0 2023-10-02 20:33:42,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:33:43,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 20:33:44,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 20:33:45,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1010333.3333333334, ans=0.125 2023-10-02 20:33:47,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 20:33:49,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:33:49,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:33:49,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:33:52,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:33:52,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:33:53,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:33:54,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:34:02,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:34:04,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:34:07,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:09,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:34:09,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:34:13,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:34:13,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 20:34:13,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:34:15,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:34:15,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:34:15,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1010466.6666666666, ans=0.2 2023-10-02 20:34:16,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1010466.6666666666, ans=0.125 2023-10-02 20:34:19,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:34:19,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:23,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:34:25,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:34:27,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:27,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:34:27,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:34:27,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:34:28,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:34:28,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 20:34:28,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:34:29,612 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.917e+02 2.083e+02 2.340e+02 4.683e+02, threshold=4.167e+02, percent-clipped=1.0 2023-10-02 20:34:31,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:34:31,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:34:32,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 20:34:34,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:34:34,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:34:35,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:34:35,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 20:34:37,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1010533.3333333334, ans=0.125 2023-10-02 20:34:40,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1010533.3333333334, ans=0.1 2023-10-02 20:34:43,737 INFO [train.py:1046] (3/4) Epoch 29, batch 2850, loss[loss=0.1658, simple_loss=0.2525, pruned_loss=0.03958, over 24624.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.242, pruned_loss=0.04355, over 4726307.50 frames. ], batch size: 68, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:34:43,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:34:43,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:34:43,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:34:45,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:34:50,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:34:50,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:34:50,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:34:52,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:34:52,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:34:54,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:34:55,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 20:35:02,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 20:35:02,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:03,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 20:35:05,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:07,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 20:35:08,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 20:35:10,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:13,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1010733.3333333334, ans=0.0 2023-10-02 20:35:19,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:35:21,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:35:21,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:35:23,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 20:35:23,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:35:23,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:35:25,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:35:25,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 20:35:28,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:35:28,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:35:29,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:35:29,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1010800.0, ans=0.1 2023-10-02 20:35:31,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:32,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:35:33,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:35:33,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:35,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:35:36,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:35:38,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:35:40,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:41,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:35:43,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1010866.6666666666, ans=0.125 2023-10-02 20:35:45,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1010866.6666666666, ans=0.0 2023-10-02 20:35:46,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:35:48,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 20:35:48,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 20:35:51,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:35:51,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:35:51,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 20:35:53,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:35:53,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:35:53,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:35:53,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:35:53,135 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 20:35:53,164 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 20:35:53,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:35:54,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:35:58,951 INFO [train.py:1046] (3/4) Epoch 29, batch 2900, loss[loss=0.1747, simple_loss=0.2442, pruned_loss=0.05258, over 23839.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.242, pruned_loss=0.04375, over 4717962.16 frames. ], batch size: 164, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:36:00,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:36:00,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:36:01,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:36:02,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 20:36:04,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.40 vs. limit=15.0 2023-10-02 20:36:07,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:36:07,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 20:36:08,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 20:36:10,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:36:10,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:36:13,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:36:14,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:36:15,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1011000.0, ans=0.04949747468305833 2023-10-02 20:36:16,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:36:17,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1011000.0, ans=0.125 2023-10-02 20:36:18,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:36:19,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:36:19,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1011000.0, ans=0.2 2023-10-02 20:36:20,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 20:36:22,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:36:22,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:36:25,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 20:36:25,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 20:36:30,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:36:30,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 20:36:30,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:36:33,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:36:33,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 20:36:36,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:36:37,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:36:37,791 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.51 vs. limit=15.0 2023-10-02 20:36:40,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:36:43,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:36:45,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 20:36:45,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 20:36:45,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:36:47,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:36:50,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 20:36:51,139 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.56 vs. limit=15.0 2023-10-02 20:36:52,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:36:54,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:36:59,958 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.76 vs. limit=15.0 2023-10-02 20:37:00,192 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.887e+02 2.065e+02 2.292e+02 3.818e+02, threshold=4.129e+02, percent-clipped=0.0 2023-10-02 20:37:02,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:37:03,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:37:03,842 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.55 vs. limit=12.0 2023-10-02 20:37:04,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 20:37:06,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:06,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 20:37:07,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:37:07,646 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.07 vs. limit=15.0 2023-10-02 20:37:08,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:37:09,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.99 vs. limit=12.0 2023-10-02 20:37:13,643 INFO [train.py:1046] (3/4) Epoch 29, batch 2950, loss[loss=0.1543, simple_loss=0.2423, pruned_loss=0.03321, over 23759.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2432, pruned_loss=0.04383, over 4724107.05 frames. ], batch size: 85, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:37:15,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:37:16,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1011266.6666666666, ans=0.0 2023-10-02 20:37:17,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 20:37:17,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:37:17,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:20,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:37:21,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:37:22,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 20:37:23,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 20:37:25,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:37:25,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:37:25,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1011266.6666666666, ans=0.0 2023-10-02 20:37:30,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:37:32,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:37:35,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:37:35,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:37:38,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:37:38,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:37:39,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:40,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:37:40,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:37:44,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 20:37:44,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1011400.0, ans=0.0 2023-10-02 20:37:48,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 20:37:48,694 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 20:37:48,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1011400.0, ans=0.2 2023-10-02 20:37:50,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:37:51,985 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 20:37:53,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 20:37:54,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:37:54,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:37:54,741 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 20:37:54,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:37:56,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 20:37:57,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:37:58,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:38:01,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:38:02,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:38:02,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:38:02,958 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 20:38:04,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:38:04,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 20:38:09,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:38:09,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1011466.6666666666, ans=0.125 2023-10-02 20:38:10,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:38:10,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 20:38:10,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:38:12,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 20:38:15,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:38:16,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:38:16,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:38:18,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1011533.3333333334, ans=0.1 2023-10-02 20:38:19,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:38:19,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:38:21,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:38:21,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:38:21,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:38:22,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:38:24,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:38:24,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:38:25,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:38:25,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 20:38:26,935 INFO [train.py:1046] (3/4) Epoch 29, batch 3000, loss[loss=0.1662, simple_loss=0.2422, pruned_loss=0.0451, over 18965.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.2439, pruned_loss=0.04376, over 4723805.63 frames. ], batch size: 41, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:38:26,935 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 20:38:35,225 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.8316, 1.7443, 3.5949, 3.3755], device='cuda:3') 2023-10-02 20:38:39,082 INFO [train.py:1078] (3/4) Epoch 29, validation: loss=0.3203, simple_loss=0.2757, pruned_loss=0.1825, over 1125622.00 frames. 2023-10-02 20:38:39,082 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 20:38:39,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:38:42,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:38:42,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:38:45,301 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 20:38:45,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 20:38:47,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:38:48,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:38:48,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 20:38:48,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:38:55,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:39:04,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1011666.6666666666, ans=0.09899494936611666 2023-10-02 20:39:05,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:39:11,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 20:39:12,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:39:14,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:39:16,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:39:16,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:39:18,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:39:18,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 20:39:21,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 20:39:21,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:39:22,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:39:25,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:39:25,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:39:25,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:25,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:39:29,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:39:29,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:39:29,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:39:30,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:39:33,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 20:39:33,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:39:35,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:39:35,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:39:38,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:39,806 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.868e+02 2.049e+02 2.188e+02 3.716e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-02 20:39:39,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:39,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 20:39:41,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 20:39:41,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:39:41,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 20:39:42,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:39:44,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 20:39:46,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:39:49,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 20:39:49,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 20:39:50,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 20:39:50,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:39:51,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:39:51,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:39:53,102 INFO [train.py:1046] (3/4) Epoch 29, batch 3050, loss[loss=0.1461, simple_loss=0.2264, pruned_loss=0.03292, over 24628.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2451, pruned_loss=0.04439, over 4707385.31 frames. ], batch size: 60, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:39:53,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:39:53,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:39:54,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:39:54,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 20:39:56,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:39:57,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:39:59,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:40:00,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1011933.3333333334, ans=0.0 2023-10-02 20:40:01,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:03,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1011933.3333333334, ans=0.0 2023-10-02 20:40:06,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 20:40:09,190 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.84 vs. limit=22.5 2023-10-02 20:40:09,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 20:40:09,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 20:40:09,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:09,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=1012000.0, ans=10.0 2023-10-02 20:40:14,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:40:18,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:18,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:40:18,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:40:21,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:40:21,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:40:21,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:40:23,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:40:23,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:40:23,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:25,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:28,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:40:28,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 20:40:29,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:40:29,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:40:32,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:40:33,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 20:40:34,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:40:35,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:40:39,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:40:41,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:40:46,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:46,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1012133.3333333334, ans=0.5 2023-10-02 20:40:47,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:40:47,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:40:48,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:40:49,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:40:49,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:40:51,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 20:40:51,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:40:51,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1012200.0, ans=0.125 2023-10-02 20:40:52,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:40:53,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 20:40:55,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:41:01,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:41:02,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:41:05,402 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.87 vs. limit=15.0 2023-10-02 20:41:05,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 20:41:06,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1012266.6666666666, ans=0.0 2023-10-02 20:41:07,163 INFO [train.py:1046] (3/4) Epoch 29, batch 3100, loss[loss=0.1599, simple_loss=0.232, pruned_loss=0.04391, over 23660.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2442, pruned_loss=0.04409, over 4709177.76 frames. ], batch size: 149, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:41:08,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 20:41:11,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 20:41:11,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1012266.6666666666, ans=0.125 2023-10-02 20:41:12,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 20:41:14,985 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.69 vs. limit=12.0 2023-10-02 20:41:15,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:41:17,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1012266.6666666666, ans=0.1 2023-10-02 20:41:19,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:41:19,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:21,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 20:41:25,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:30,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 20:41:34,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 20:41:34,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:36,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:41:36,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:41:39,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 20:41:40,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:41:40,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 20:41:40,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:41:42,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:43,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 20:41:44,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:41:46,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:41:46,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 20:41:49,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 20:41:49,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:51,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:41:52,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:41:52,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:54,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:41:56,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:41:56,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:41:57,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:41:57,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:41:57,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:41:57,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 20:42:01,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:42:01,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1012466.6666666666, ans=0.035 2023-10-02 20:42:03,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 20:42:05,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:42:05,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 20:42:07,525 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.894e+02 2.077e+02 2.394e+02 5.109e+02, threshold=4.155e+02, percent-clipped=1.0 2023-10-02 20:42:07,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:07,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:42:07,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 20:42:07,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1012533.3333333334, ans=0.125 2023-10-02 20:42:10,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1012533.3333333334, ans=0.2 2023-10-02 20:42:12,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1012533.3333333334, ans=0.125 2023-10-02 20:42:15,334 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.21 vs. limit=22.5 2023-10-02 20:42:18,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 20:42:20,659 INFO [train.py:1046] (3/4) Epoch 29, batch 3150, loss[loss=0.1537, simple_loss=0.2168, pruned_loss=0.04528, over 23571.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2432, pruned_loss=0.04392, over 4702915.16 frames. ], batch size: 256, lr: 3.52e-03, grad_scale: 8.0 2023-10-02 20:42:20,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:20,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:42:24,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:42:24,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:42:24,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 20:42:27,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:27,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 20:42:29,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 20:42:31,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:34,133 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 20:42:34,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 20:42:35,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:42:35,702 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 20:42:35,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 20:42:37,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 20:42:37,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1012666.6666666666, ans=0.2 2023-10-02 20:42:38,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 20:42:38,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 20:42:38,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:38,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:42:40,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:42:40,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 20:42:41,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:41,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:42:42,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:42:46,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 20:42:48,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 20:42:49,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:42:52,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:42:54,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:42:54,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 20:42:55,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 20:42:57,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:42:57,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 20:42:57,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 20:42:58,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:42:58,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:42:59,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:42:59,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 20:43:01,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 20:43:01,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:43:02,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:03,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:43:03,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:43:03,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 20:43:04,449 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.21 vs. limit=15.0 2023-10-02 20:43:05,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:43:08,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 20:43:08,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:11,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 20:43:11,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 20:43:12,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:43:12,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:43:12,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 20:43:13,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 20:43:13,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:43:17,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:43:20,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:20,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:43:20,806 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.82 vs. limit=6.0 2023-10-02 20:43:22,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1012866.6666666666, ans=0.07 2023-10-02 20:43:25,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:43:27,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:27,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1012866.6666666666, ans=0.0 2023-10-02 20:43:30,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 20:43:34,283 INFO [train.py:1046] (3/4) Epoch 29, batch 3200, loss[loss=0.1745, simple_loss=0.2538, pruned_loss=0.04758, over 23381.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2419, pruned_loss=0.04328, over 4715786.35 frames. ], batch size: 106, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:43:34,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:43:34,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 20:43:35,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1012933.3333333334, ans=0.2 2023-10-02 20:43:37,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:37,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1012933.3333333334, ans=0.125 2023-10-02 20:43:38,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:43:38,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 20:43:40,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1012933.3333333334, ans=0.07 2023-10-02 20:43:41,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:43:44,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:43:49,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:43:50,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1013000.0, ans=0.0 2023-10-02 20:43:58,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:44:09,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 20:44:09,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:44:12,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 20:44:12,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:44:16,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:44:16,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:44:17,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:44:20,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 20:44:21,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 20:44:23,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 20:44:25,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 20:44:26,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:44:35,237 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.842e+02 2.034e+02 2.276e+02 3.426e+02, threshold=4.068e+02, percent-clipped=0.0 2023-10-02 20:44:36,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:44:36,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 20:44:36,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:44:37,994 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 20:44:37,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 20:44:38,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1013200.0, ans=0.1 2023-10-02 20:44:42,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:44:42,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 20:44:43,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 20:44:44,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 20:44:46,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 20:44:47,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1013266.6666666666, ans=0.1 2023-10-02 20:44:48,062 INFO [train.py:1046] (3/4) Epoch 29, batch 3250, loss[loss=0.1711, simple_loss=0.2619, pruned_loss=0.04013, over 24278.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2426, pruned_loss=0.04347, over 4731813.84 frames. ], batch size: 74, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:44:48,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:44:51,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:44:51,372 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 20:44:51,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:44:51,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:44:54,195 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 20:44:59,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:45:02,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:45:02,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1013333.3333333334, ans=0.0 2023-10-02 20:45:10,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:45:10,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 20:45:12,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:45:12,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:45:12,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:45:13,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:45:15,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:45:18,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:18,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:45:18,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:45:18,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:18,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:18,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:45:21,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:45:22,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:45:24,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:45:24,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:45:25,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:45:27,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:45:27,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:45:31,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 20:45:32,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:45:33,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:45:34,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:45:34,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:45:39,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:45:39,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1013466.6666666666, ans=0.0 2023-10-02 20:45:45,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:45:45,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:45:45,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 20:45:45,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:45:45,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:45:47,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:45:49,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 20:45:49,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 20:45:51,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:45:53,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:45:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:45:55,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 20:45:55,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:45:58,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:46:00,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:46:02,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 20:46:02,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:03,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:46:03,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 20:46:06,108 INFO [train.py:1046] (3/4) Epoch 29, batch 3300, loss[loss=0.1412, simple_loss=0.2137, pruned_loss=0.03438, over 24348.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2426, pruned_loss=0.0439, over 4725076.36 frames. ], batch size: 56, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:46:06,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:46:06,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 20:46:06,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1013600.0, ans=0.09899494936611666 2023-10-02 20:46:08,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 20:46:09,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 20:46:09,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:46:13,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:46:13,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:46:15,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:16,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 20:46:16,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:46:17,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1013600.0, ans=0.0 2023-10-02 20:46:19,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:21,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:46:25,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 20:46:26,570 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.55 vs. limit=22.5 2023-10-02 20:46:27,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:46:27,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:28,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:30,053 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 20:46:30,553 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.75 vs. limit=15.0 2023-10-02 20:46:31,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:46:31,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:46:33,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 20:46:33,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:46:33,287 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 20:46:36,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:46:36,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 20:46:36,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1013733.3333333334, ans=0.125 2023-10-02 20:46:38,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:38,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 20:46:40,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 20:46:40,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:46:40,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1013733.3333333334, ans=0.0 2023-10-02 20:46:42,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:46:43,507 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 20:46:43,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1013733.3333333334, ans=0.125 2023-10-02 20:46:46,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 20:46:46,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:46:48,397 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.87 vs. limit=15.0 2023-10-02 20:46:49,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 20:46:50,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:46:50,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1013800.0, ans=0.1 2023-10-02 20:46:52,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 20:46:52,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:46:52,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1013800.0, ans=0.125 2023-10-02 20:46:56,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:46:56,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:46:56,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:46:56,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:46:59,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:46:59,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:47:00,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:47:00,753 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 20:47:00,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 20:47:01,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1013800.0, ans=0.07 2023-10-02 20:47:01,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1013800.0, ans=0.04949747468305833 2023-10-02 20:47:04,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:47:04,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:47:04,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:47:06,766 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.852e+02 2.083e+02 2.464e+02 2.851e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-02 20:47:06,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:47:06,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:47:08,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:47:08,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:09,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:47:09,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:47:12,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:47:14,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 20:47:14,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:14,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:15,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 20:47:17,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:47:17,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:47:18,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:47:18,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:20,620 INFO [train.py:1046] (3/4) Epoch 29, batch 3350, loss[loss=0.1769, simple_loss=0.2592, pruned_loss=0.04728, over 23198.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2442, pruned_loss=0.04391, over 4731515.19 frames. ], batch size: 105, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:47:23,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:47:24,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1013933.3333333334, ans=0.0 2023-10-02 20:47:25,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:26,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:47:29,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:30,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:47:32,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:47:33,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:47:35,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 20:47:36,784 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 20:47:36,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:47:38,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1014000.0, ans=0.125 2023-10-02 20:47:39,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 20:47:41,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 20:47:42,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:47:42,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:47:43,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:47:43,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 20:47:43,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:43,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:47:45,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:48,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:48,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:47:49,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:47:54,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:47:56,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:47:57,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:48:00,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:48:01,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:48:03,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:48:03,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:06,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:08,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 20:48:08,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:48:08,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 20:48:10,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:48:10,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 20:48:11,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:48:13,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:48:19,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:20,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 20:48:20,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:48:22,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:48:22,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:48:22,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1014200.0, ans=0.125 2023-10-02 20:48:26,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:48:28,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 20:48:28,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:48:30,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:48:30,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:48:31,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 20:48:31,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:48:31,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 20:48:33,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:48:34,762 INFO [train.py:1046] (3/4) Epoch 29, batch 3400, loss[loss=0.1705, simple_loss=0.2422, pruned_loss=0.04943, over 23387.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2445, pruned_loss=0.04412, over 4726025.61 frames. ], batch size: 105, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:48:34,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:48:34,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 20:48:36,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:48:36,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 20:48:40,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 20:48:41,774 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 20:48:41,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:48:44,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:48:44,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 20:48:44,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:48:46,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:48:53,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:48:55,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 20:48:59,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:49:01,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:49:01,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:49:02,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 20:49:08,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:49:11,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 20:49:18,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:49:20,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:49:20,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 20:49:21,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:49:21,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:49:21,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:49:21,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:49:26,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:49:28,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1014466.6666666666, ans=0.125 2023-10-02 20:49:29,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:49:29,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:49:33,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:49:35,397 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.869e+02 2.116e+02 2.414e+02 3.746e+02, threshold=4.233e+02, percent-clipped=0.0 2023-10-02 20:49:35,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 20:49:39,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:49:42,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 20:49:45,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 20:49:46,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:49:47,902 INFO [train.py:1046] (3/4) Epoch 29, batch 3450, loss[loss=0.1652, simple_loss=0.2327, pruned_loss=0.04886, over 23609.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.244, pruned_loss=0.04417, over 4715031.49 frames. ], batch size: 256, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:49:48,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:49:49,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 20:49:51,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:49:54,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:50:01,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:50:01,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:03,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:50:03,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:50:04,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:50:09,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 20:50:14,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 20:50:14,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 20:50:16,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:50:17,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:17,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1014733.3333333334, ans=0.125 2023-10-02 20:50:23,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 20:50:25,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:50:29,571 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.09 vs. limit=15.0 2023-10-02 20:50:30,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:50:30,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:50:31,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 20:50:33,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:50:34,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 20:50:34,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:50:34,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:50:37,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:50:42,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 20:50:43,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1014800.0, ans=0.125 2023-10-02 20:50:44,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:50:48,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:50:50,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:51,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.56 vs. limit=12.0 2023-10-02 20:50:53,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:50:57,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:50:57,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:50:57,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:50:58,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1014866.6666666666, ans=0.0 2023-10-02 20:50:59,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:51:02,655 INFO [train.py:1046] (3/4) Epoch 29, batch 3500, loss[loss=0.1682, simple_loss=0.2528, pruned_loss=0.04178, over 24565.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.243, pruned_loss=0.0439, over 4720605.56 frames. ], batch size: 71, lr: 3.52e-03, grad_scale: 16.0 2023-10-02 20:51:04,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:51:06,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:51:08,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 20:51:10,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 20:51:12,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 20:51:15,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:51:15,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 20:51:21,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:51:21,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:51:22,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:51:22,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:51:22,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:51:22,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:24,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:51:24,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 20:51:27,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:27,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:51:28,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:51:28,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1015000.0, ans=0.125 2023-10-02 20:51:32,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:33,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 20:51:33,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:51:34,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.43 vs. limit=15.0 2023-10-02 20:51:36,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:51:39,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 20:51:40,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:42,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:51:42,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:51:45,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 20:51:45,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 20:51:45,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 20:51:46,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:51:48,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:51:48,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:51:48,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 20:51:51,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1015133.3333333334, ans=0.0 2023-10-02 20:51:52,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:51:52,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:51:55,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:51:57,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 20:51:57,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 20:51:57,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:51:58,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:51:59,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:52:01,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:52:03,220 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.823e+02 1.998e+02 2.225e+02 3.593e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-02 20:52:05,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 20:52:06,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:52:07,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:52:09,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 20:52:11,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 20:52:13,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:52:14,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:52:14,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:14,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:16,037 INFO [train.py:1046] (3/4) Epoch 29, batch 3550, loss[loss=0.17, simple_loss=0.2609, pruned_loss=0.03957, over 24548.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2412, pruned_loss=0.04342, over 4709987.43 frames. ], batch size: 71, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:52:18,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 20:52:28,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:29,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 20:52:32,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:52:32,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 20:52:33,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:35,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:52:35,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:52:36,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1015333.3333333334, ans=0.0 2023-10-02 20:52:39,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:52:39,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:52:40,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:40,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:52:40,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 20:52:46,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 20:52:46,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 20:52:47,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:52:47,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:52:47,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1015400.0, ans=0.09899494936611666 2023-10-02 20:52:48,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:52:48,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 20:52:48,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:50,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:52:51,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 20:52:56,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:52:57,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:52:58,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1015400.0, ans=0.125 2023-10-02 20:52:59,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:00,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 20:53:01,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:53:03,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 20:53:03,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 20:53:04,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 20:53:04,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:53:08,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 20:53:10,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:53:15,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:53:16,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 20:53:16,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1015533.3333333334, ans=0.125 2023-10-02 20:53:17,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:53:20,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:53:21,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1015533.3333333334, ans=0.125 2023-10-02 20:53:22,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 20:53:29,816 INFO [train.py:1046] (3/4) Epoch 29, batch 3600, loss[loss=0.1537, simple_loss=0.234, pruned_loss=0.03668, over 24428.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2414, pruned_loss=0.04317, over 4708755.37 frames. ], batch size: 58, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 20:53:29,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 20:53:29,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:53:31,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 20:53:34,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:53:34,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:53:36,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:53:39,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:53:39,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:39,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1015600.0, ans=0.125 2023-10-02 20:53:40,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 20:53:42,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 20:53:42,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:42,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 20:53:46,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:53:47,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:53:50,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:53:52,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:53:53,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 20:53:53,792 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 20:53:54,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:53:54,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 20:53:55,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1015666.6666666666, ans=0.125 2023-10-02 20:53:55,715 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.59 vs. limit=22.5 2023-10-02 20:53:56,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 20:54:00,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:54:01,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 20:54:03,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:04,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:54:04,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:54:05,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 20:54:08,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=1015733.3333333334, ans=15.0 2023-10-02 20:54:09,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1015733.3333333334, ans=0.0 2023-10-02 20:54:12,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:54:12,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:54:13,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 20:54:18,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:54:23,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:26,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:31,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 20:54:31,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:54:31,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 20:54:32,622 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.883e+02 2.051e+02 2.269e+02 3.379e+02, threshold=4.102e+02, percent-clipped=0.0 2023-10-02 20:54:34,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 20:54:35,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 20:54:38,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:54:38,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:54:40,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 20:54:41,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:54:41,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:54:41,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:54:41,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 20:54:43,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 20:54:44,815 INFO [train.py:1046] (3/4) Epoch 29, batch 3650, loss[loss=0.1627, simple_loss=0.254, pruned_loss=0.03564, over 24327.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2424, pruned_loss=0.04354, over 4713925.77 frames. ], batch size: 74, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 20:54:46,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:54:46,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 20:54:50,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 20:54:51,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:54:56,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 20:54:57,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 20:54:57,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1016000.0, ans=0.1 2023-10-02 20:55:02,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:55:02,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 20:55:03,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 20:55:06,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 20:55:08,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:55:08,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 20:55:10,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 20:55:10,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:55:10,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 20:55:10,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1016000.0, ans=0.125 2023-10-02 20:55:11,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:55:13,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:55:13,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:55:13,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1016066.6666666666, ans=0.125 2023-10-02 20:55:15,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:55:16,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 20:55:17,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 20:55:17,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:55:19,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 20:55:20,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:55:20,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:55:22,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1016066.6666666666, ans=0.2 2023-10-02 20:55:26,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 20:55:27,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:55:27,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 20:55:30,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 20:55:30,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:55:33,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:55:37,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:55:38,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:55:38,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:55:39,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1016133.3333333334, ans=0.2 2023-10-02 20:55:39,522 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.08 vs. limit=15.0 2023-10-02 20:55:40,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 20:55:41,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:55:42,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:55:48,893 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 20:55:52,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:55:52,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:55:54,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 20:55:54,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:55:55,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 20:55:58,310 INFO [train.py:1046] (3/4) Epoch 29, batch 3700, loss[loss=0.1577, simple_loss=0.239, pruned_loss=0.03822, over 24512.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2427, pruned_loss=0.04352, over 4726025.37 frames. ], batch size: 63, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:55:58,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:55:59,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 20:55:59,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:56:02,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 20:56:03,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:56:04,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:56:05,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:56:05,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 20:56:05,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:56:07,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 20:56:07,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 20:56:11,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 20:56:12,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:56:13,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:56:15,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 20:56:15,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:56:15,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1016333.3333333334, ans=0.0 2023-10-02 20:56:16,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:56:18,486 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.33 vs. limit=15.0 2023-10-02 20:56:19,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:56:20,993 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 20:56:23,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1016333.3333333334, ans=0.0 2023-10-02 20:56:27,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 20:56:28,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.11 vs. limit=15.0 2023-10-02 20:56:29,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 20:56:30,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 20:56:30,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 20:56:30,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:56:33,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:56:35,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 20:56:36,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:56:36,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:56:37,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1016400.0, ans=0.125 2023-10-02 20:56:40,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:56:40,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 20:56:42,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 20:56:45,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1016466.6666666666, ans=0.125 2023-10-02 20:56:46,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:56:46,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 20:56:46,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:56:46,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 20:56:50,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:56:50,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:56:53,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:56:55,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 20:56:57,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:56:58,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1016533.3333333334, ans=0.1 2023-10-02 20:56:59,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 20:56:59,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:56:59,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:57:01,747 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.813e+02 1.991e+02 2.176e+02 3.248e+02, threshold=3.981e+02, percent-clipped=0.0 2023-10-02 20:57:03,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:57:03,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 20:57:05,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 20:57:05,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 20:57:05,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:57:08,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 20:57:08,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1016533.3333333334, ans=10.0 2023-10-02 20:57:09,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 20:57:09,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1016533.3333333334, ans=0.125 2023-10-02 20:57:12,505 INFO [train.py:1046] (3/4) Epoch 29, batch 3750, loss[loss=0.1653, simple_loss=0.2504, pruned_loss=0.04015, over 24299.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2445, pruned_loss=0.04394, over 4731388.65 frames. ], batch size: 74, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:57:12,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:57:12,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:57:14,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:57:17,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 20:57:18,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 20:57:22,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 20:57:22,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 20:57:22,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:57:23,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:57:26,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:57:27,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:57:30,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:57:31,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 20:57:34,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:57:35,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:57:39,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:57:41,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 20:57:42,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:57:42,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:57:43,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:57:47,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 20:57:49,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 20:57:51,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 20:57:53,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 20:57:54,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:57:54,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1016733.3333333334, ans=0.1 2023-10-02 20:57:57,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:57:58,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 20:58:01,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 20:58:04,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:58:07,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 20:58:07,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 20:58:11,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 20:58:14,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1016866.6666666666, ans=0.125 2023-10-02 20:58:16,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 20:58:17,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 20:58:18,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 20:58:20,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 20:58:23,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 20:58:26,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1016933.3333333334, ans=0.0 2023-10-02 20:58:27,706 INFO [train.py:1046] (3/4) Epoch 29, batch 3800, loss[loss=0.147, simple_loss=0.2277, pruned_loss=0.03319, over 24602.00 frames. ], tot_loss[loss=0.166, simple_loss=0.244, pruned_loss=0.04398, over 4725839.58 frames. ], batch size: 60, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:58:28,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1016933.3333333334, ans=0.09899494936611666 2023-10-02 20:58:28,407 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.70 vs. limit=15.0 2023-10-02 20:58:29,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 20:58:33,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:58:33,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 20:58:35,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 20:58:36,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:58:38,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1016933.3333333334, ans=0.125 2023-10-02 20:58:39,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:58:40,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 20:58:44,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 20:58:44,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:58:44,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 20:58:45,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 20:58:45,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 20:58:47,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:58:48,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 20:58:50,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 20:58:51,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 20:58:54,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:58:56,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 20:58:57,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 20:58:57,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 20:58:57,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:59:00,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:01,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 20:59:06,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 20:59:06,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 20:59:07,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:59:15,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:59:19,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 20:59:22,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 20:59:22,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1017133.3333333334, ans=0.0 2023-10-02 20:59:25,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 20:59:26,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:59:28,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1017200.0, ans=0.0 2023-10-02 20:59:29,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 20:59:29,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:30,963 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.830e+02 2.115e+02 2.392e+02 3.412e+02, threshold=4.230e+02, percent-clipped=0.0 2023-10-02 20:59:31,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 20:59:33,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 20:59:33,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 20:59:33,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:35,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 20:59:41,235 INFO [train.py:1046] (3/4) Epoch 29, batch 3850, loss[loss=0.1527, simple_loss=0.2327, pruned_loss=0.03633, over 24459.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2435, pruned_loss=0.04351, over 4736537.35 frames. ], batch size: 58, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 20:59:41,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 20:59:41,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1017266.6666666666, ans=0.09899494936611666 2023-10-02 20:59:43,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 20:59:48,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 20:59:49,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 20:59:50,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 20:59:52,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 20:59:54,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 20:59:56,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 20:59:57,197 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.39 vs. limit=22.5 2023-10-02 20:59:57,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=1017333.3333333334, ans=6.0 2023-10-02 20:59:59,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 20:59:59,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 21:00:02,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1017333.3333333334, ans=0.2 2023-10-02 21:00:02,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1017333.3333333334, ans=0.1 2023-10-02 21:00:05,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:06,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:00:08,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:00:08,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:00:13,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:15,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:00:15,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:15,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:00:16,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:18,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:19,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:19,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:00:19,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 21:00:21,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 21:00:21,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:00:22,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:23,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:23,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:23,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 21:00:27,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 21:00:28,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:29,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1017466.6666666666, ans=0.125 2023-10-02 21:00:31,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 21:00:33,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 21:00:38,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:38,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:00:43,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:00:43,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1017533.3333333334, ans=0.1 2023-10-02 21:00:44,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 21:00:46,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 21:00:48,730 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:00:49,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:49,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:52,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:00:52,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:00:52,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:54,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:54,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:00:54,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 21:00:55,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:00:56,970 INFO [train.py:1046] (3/4) Epoch 29, batch 3900, loss[loss=0.1657, simple_loss=0.2325, pruned_loss=0.0495, over 22774.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2422, pruned_loss=0.04283, over 4724796.53 frames. ], batch size: 322, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:00:57,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 21:00:57,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:00:57,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:00:58,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:00:59,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:01,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:01:01,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:01:01,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:01:02,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:01:02,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 21:01:03,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:06,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:01:06,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 21:01:06,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:01:08,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:01:09,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.63 vs. limit=22.5 2023-10-02 21:01:10,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 21:01:10,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:12,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:01:14,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 21:01:14,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:01:16,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 21:01:17,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:01:18,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1017666.6666666666, ans=0.125 2023-10-02 21:01:19,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 21:01:19,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 21:01:19,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1017666.6666666666, ans=0.0 2023-10-02 21:01:23,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:01:25,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:01:25,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:01:26,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:01:29,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1017733.3333333334, ans=0.125 2023-10-02 21:01:32,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:01:33,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:01:36,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:01:36,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:01:37,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:01:41,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:01:42,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:01:49,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:01:52,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:01:53,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1017800.0, ans=0.0 2023-10-02 21:02:00,139 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.908e+02 2.090e+02 2.279e+02 3.319e+02, threshold=4.180e+02, percent-clipped=0.0 2023-10-02 21:02:01,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:02:04,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:02:04,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 21:02:04,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 21:02:04,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:02:05,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 21:02:07,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:02:07,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 21:02:09,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1017933.3333333334, ans=0.0 2023-10-02 21:02:10,061 INFO [train.py:1046] (3/4) Epoch 29, batch 3950, loss[loss=0.1754, simple_loss=0.2514, pruned_loss=0.04968, over 23773.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2421, pruned_loss=0.04252, over 4727527.39 frames. ], batch size: 179, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:02:14,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:02:14,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 21:02:15,063 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.26 vs. limit=15.0 2023-10-02 21:02:15,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:02:18,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:02:19,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1017933.3333333334, ans=0.95 2023-10-02 21:02:19,732 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.42 vs. limit=6.0 2023-10-02 21:02:20,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:02:22,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1017933.3333333334, ans=0.0 2023-10-02 21:02:26,553 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 21:02:26,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:02:26,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 21:02:28,053 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 21:02:28,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:02:28,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1018000.0, ans=0.0 2023-10-02 21:02:30,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:02:31,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:02:31,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:02:32,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 21:02:35,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:02:35,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:02:35,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:02:36,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:02:36,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:02:48,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:02:48,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:02:54,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 21:02:57,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1018133.3333333334, ans=0.0 2023-10-02 21:02:59,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 21:02:59,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 21:03:00,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:03:01,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1018133.3333333334, ans=0.0 2023-10-02 21:03:02,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:03:07,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:03:07,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:03:09,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:03:09,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:03:09,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1018200.0, ans=0.125 2023-10-02 21:03:10,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 21:03:17,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:03:17,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:03:19,120 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.80 vs. limit=10.0 2023-10-02 21:03:19,475 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.78 vs. limit=15.0 2023-10-02 21:03:20,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 21:03:22,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1018266.6666666666, ans=0.1 2023-10-02 21:03:23,941 INFO [train.py:1046] (3/4) Epoch 29, batch 4000, loss[loss=0.219, simple_loss=0.2812, pruned_loss=0.07837, over 19340.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2429, pruned_loss=0.04332, over 4718391.15 frames. ], batch size: 388, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 21:03:27,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.30 vs. limit=10.0 2023-10-02 21:03:28,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:03:37,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:03:42,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:03:42,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:03:43,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:03:43,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 21:03:43,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:03:44,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 21:03:44,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:03:45,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1018333.3333333334, ans=0.125 2023-10-02 21:03:46,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 21:03:47,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:03:50,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:03:51,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:03:51,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:03:51,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:03:51,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:03:52,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:03:53,977 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 21:03:54,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1018400.0, ans=0.0 2023-10-02 21:03:55,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:03:55,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:03:58,513 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 21:03:59,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:03:59,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:04:06,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1018400.0, ans=0.1 2023-10-02 21:04:07,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 21:04:07,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:04:09,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:04:10,056 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 21:04:10,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=1018466.6666666666, ans=0.05 2023-10-02 21:04:12,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:04:13,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 21:04:13,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:04:14,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:04:16,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:04:17,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:04:19,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:04:19,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:04:20,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 21:04:20,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:04:21,967 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 21:04:25,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:04:26,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.26 vs. limit=15.0 2023-10-02 21:04:28,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 21:04:30,108 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.895e+02 2.160e+02 2.522e+02 3.518e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-02 21:04:30,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:04:30,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:04:30,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:04:33,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:04:37,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:04:38,761 INFO [train.py:1046] (3/4) Epoch 29, batch 4050, loss[loss=0.1589, simple_loss=0.2398, pruned_loss=0.03902, over 23709.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2436, pruned_loss=0.04367, over 4719770.70 frames. ], batch size: 85, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:04:40,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 21:04:41,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 21:04:42,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1018600.0, ans=0.0 2023-10-02 21:04:43,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:04:43,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:04:43,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:04:45,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:04:46,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:04:49,570 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:04:50,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:04:50,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1018600.0, ans=0.125 2023-10-02 21:04:52,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:04:54,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 21:04:55,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:04:55,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:04:57,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1018666.6666666666, ans=0.0 2023-10-02 21:04:58,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:05:01,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:05:04,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 21:05:06,665 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.54 vs. limit=15.0 2023-10-02 21:05:07,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 21:05:07,566 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 21:05:10,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:05:16,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 21:05:17,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:05:19,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:05:22,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:05:24,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:05:24,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:05:26,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:05:29,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 21:05:29,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:05:31,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:05:32,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 21:05:37,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:05:37,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1018866.6666666666, ans=0.2 2023-10-02 21:05:45,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 21:05:46,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:05:46,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:05:49,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 21:05:49,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 21:05:49,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:05:50,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:05:52,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:05:52,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:05:53,895 INFO [train.py:1046] (3/4) Epoch 29, batch 4100, loss[loss=0.16, simple_loss=0.2468, pruned_loss=0.03658, over 23899.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2442, pruned_loss=0.04388, over 4724398.55 frames. ], batch size: 80, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:05:59,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 21:05:59,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 21:06:02,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 21:06:05,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 21:06:05,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:06:05,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:05,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:05,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:06:07,123 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 21:06:11,055 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.00 vs. limit=15.0 2023-10-02 21:06:11,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:06:11,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:06:13,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:06:13,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:06:14,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:06:17,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:06:17,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:06:17,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 21:06:19,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:19,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:06:19,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:06:19,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:06:20,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 21:06:22,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:06:24,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 21:06:25,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:06:27,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:06:27,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 21:06:28,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:06:28,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:06:28,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:06:31,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1019066.6666666666, ans=0.125 2023-10-02 21:06:32,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 21:06:32,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:06:32,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:06:35,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 21:06:35,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:06:35,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:06:37,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1019133.3333333334, ans=0.125 2023-10-02 21:06:38,290 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.16 vs. limit=22.5 2023-10-02 21:06:39,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:06:43,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:06:46,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:06:48,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:06:56,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:06:56,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:07:00,156 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.365e+02 1.918e+02 2.139e+02 2.602e+02 3.703e+02, threshold=4.278e+02, percent-clipped=0.0 2023-10-02 21:07:00,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:07:01,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:07:01,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1019200.0, ans=0.0 2023-10-02 21:07:06,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:07:07,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:07:08,937 INFO [train.py:1046] (3/4) Epoch 29, batch 4150, loss[loss=0.1643, simple_loss=0.2399, pruned_loss=0.04435, over 23755.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2438, pruned_loss=0.04361, over 4720650.45 frames. ], batch size: 149, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:07:08,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:07:08,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:07:10,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 21:07:12,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:07:12,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 21:07:12,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 21:07:12,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 21:07:13,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:07:19,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:07:19,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:07:24,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:07:25,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:07:25,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:07:28,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:07:28,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:07:29,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:07:32,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:07:38,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:07:38,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 21:07:42,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 21:07:42,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:07:42,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 21:07:42,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:07:42,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:07:43,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1019400.0, ans=0.125 2023-10-02 21:07:45,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:07:46,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:07:50,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 21:07:53,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:07:55,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:07:57,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 21:07:57,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:07:58,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 21:07:58,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1019466.6666666666, ans=0.0 2023-10-02 21:08:02,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:08:02,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:08:03,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:08:05,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 21:08:05,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:05,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 21:08:06,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:08:09,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 21:08:11,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:08:11,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:08:11,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:08:11,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 21:08:13,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:08:13,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 21:08:14,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:08:16,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:08:16,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 21:08:16,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:08:18,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.32 vs. limit=22.5 2023-10-02 21:08:21,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:08:22,982 INFO [train.py:1046] (3/4) Epoch 29, batch 4200, loss[loss=0.1479, simple_loss=0.2238, pruned_loss=0.03597, over 24317.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2427, pruned_loss=0.04342, over 4724232.43 frames. ], batch size: 56, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:08:23,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 21:08:23,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:08:27,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:08:28,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:08:28,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:08:28,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:08:29,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 21:08:33,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 21:08:34,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:36,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:08:40,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:08:40,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1019666.6666666666, ans=0.1 2023-10-02 21:08:43,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 21:08:44,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:08:44,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:46,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 21:08:46,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:08:47,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:47,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:08:47,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:08:49,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:08:51,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 21:08:51,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:08:55,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 21:08:56,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:08:58,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:08:58,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1019733.3333333334, ans=0.1 2023-10-02 21:08:59,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:09:02,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:09:02,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 21:09:04,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:09:05,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:09:05,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1019733.3333333334, ans=0.1 2023-10-02 21:09:09,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:09:13,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:09:18,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:09:21,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 21:09:23,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:09:25,306 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.30 vs. limit=22.5 2023-10-02 21:09:28,929 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.836e+02 2.050e+02 2.312e+02 3.476e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-02 21:09:29,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:09:29,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:09:31,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 21:09:36,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:09:38,218 INFO [train.py:1046] (3/4) Epoch 29, batch 4250, loss[loss=0.1764, simple_loss=0.2614, pruned_loss=0.04569, over 23916.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2422, pruned_loss=0.04329, over 4723631.95 frames. ], batch size: 86, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:09:39,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:09:39,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:09:42,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:09:47,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:09:47,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 21:09:48,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:09:50,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:09:51,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:09:51,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1020000.0, ans=0.0 2023-10-02 21:09:57,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:09:57,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:09:59,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:09:59,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:09:59,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.82 vs. limit=22.5 2023-10-02 21:10:01,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:10:02,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:10:02,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1020000.0, ans=0.0 2023-10-02 21:10:04,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:10:05,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:10:08,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:10:10,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 21:10:13,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 21:10:13,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:10:14,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:10:14,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:10:15,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:10:16,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:10:17,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:10:20,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:10:20,516 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:10:21,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:10:25,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:10:25,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1020133.3333333334, ans=0.09899494936611666 2023-10-02 21:10:28,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:10:28,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 21:10:29,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:10:29,856 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.45 vs. limit=15.0 2023-10-02 21:10:30,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 21:10:30,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:10:31,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:10:35,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:10:35,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:10:37,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 21:10:39,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:10:40,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:10:43,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:10:47,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:10:48,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:10:49,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:10:49,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:10:51,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:10:52,706 INFO [train.py:1046] (3/4) Epoch 29, batch 4300, loss[loss=0.165, simple_loss=0.2376, pruned_loss=0.04623, over 23437.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2409, pruned_loss=0.04268, over 4717485.34 frames. ], batch size: 285, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:10:52,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:10:52,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 21:10:55,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:10:58,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:10:58,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:11:04,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:11:11,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:11:11,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 21:11:13,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:11:14,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:11:14,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:11:14,548 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 21:11:17,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:11:19,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:11:22,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 21:11:22,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:11:22,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 21:11:23,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 21:11:25,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:11:29,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:11:29,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:11:31,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:11:32,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:11:32,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:11:32,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 21:11:34,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 21:11:36,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:11:37,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:37,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 21:11:37,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:38,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:11:38,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 21:11:38,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 21:11:38,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 21:11:40,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:11:40,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 21:11:40,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 21:11:44,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:11:46,155 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 21:11:46,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:11:49,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:11:49,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:11:50,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 21:11:52,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:11:52,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:53,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:11:53,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:11:53,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:11:55,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:11:56,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:11:57,714 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.918e+02 2.229e+02 2.683e+02 3.939e+02, threshold=4.458e+02, percent-clipped=0.0 2023-10-02 21:11:59,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:11:59,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:12:04,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 21:12:05,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:12:07,101 INFO [train.py:1046] (3/4) Epoch 29, batch 4350, loss[loss=0.1554, simple_loss=0.2447, pruned_loss=0.03306, over 24561.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2422, pruned_loss=0.04297, over 4708453.23 frames. ], batch size: 71, lr: 3.51e-03, grad_scale: 8.0 2023-10-02 21:12:08,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:12:11,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:12:14,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:12:14,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:12:19,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:12:22,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:12:25,409 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:12:25,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1020666.6666666666, ans=0.0 2023-10-02 21:12:26,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:12:26,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:12:26,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1020666.6666666666, ans=0.0 2023-10-02 21:12:28,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:12:31,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:12:31,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1020666.6666666666, ans=0.125 2023-10-02 21:12:32,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:12:36,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1020733.3333333334, ans=0.125 2023-10-02 21:12:37,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 21:12:38,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:12:39,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:12:43,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:12:45,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 21:12:48,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:12:50,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:12:54,884 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 21:12:56,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:12:56,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:12:57,771 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 21:12:57,852 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 21:12:57,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:12:57,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:12:59,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:13:00,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1020800.0, ans=0.2 2023-10-02 21:13:01,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:13:01,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:13:01,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:13:05,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 21:13:05,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:05,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:13:05,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:05,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 21:13:06,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.58 vs. limit=15.0 2023-10-02 21:13:07,292 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 21:13:07,296 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 21:13:07,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 21:13:11,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:13:11,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:13:11,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:12,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:13:14,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 21:13:17,267 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 21:13:17,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:21,801 INFO [train.py:1046] (3/4) Epoch 29, batch 4400, loss[loss=0.1596, simple_loss=0.2512, pruned_loss=0.03397, over 24371.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2428, pruned_loss=0.04344, over 4707938.01 frames. ], batch size: 74, lr: 3.51e-03, grad_scale: 16.0 2023-10-02 21:13:21,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:13:21,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:23,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:13:25,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 21:13:26,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 21:13:26,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 21:13:27,353 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 21:13:28,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:13:28,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:13:30,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 21:13:33,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:33,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1020933.3333333334, ans=0.125 2023-10-02 21:13:34,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:13:34,666 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 21:13:37,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:37,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 21:13:39,180 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 21:13:41,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 21:13:42,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 21:13:42,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 21:13:43,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:13:44,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:13:44,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:13:44,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:13:46,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 21:13:46,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 21:13:47,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:50,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:13:50,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:13:52,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:13:52,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:13:52,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 21:13:55,256 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 21:13:59,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:05,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:14:08,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 21:14:11,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:14:11,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:14:13,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1021133.3333333334, ans=0.125 2023-10-02 21:14:14,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:14:15,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 21:14:16,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:14:16,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:14:16,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:14:16,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:14:18,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 21:14:22,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 21:14:23,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 21:14:23,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:14:23,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 21:14:24,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:14:26,533 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.432e+02 1.930e+02 2.073e+02 2.466e+02 3.633e+02, threshold=4.147e+02, percent-clipped=0.0 2023-10-02 21:14:28,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:14:29,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 21:14:32,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:14:35,495 INFO [train.py:1046] (3/4) Epoch 29, batch 4450, loss[loss=0.1879, simple_loss=0.2575, pruned_loss=0.05915, over 23815.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2441, pruned_loss=0.04374, over 4712915.03 frames. ], batch size: 195, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:14:35,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:37,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:14:38,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1021266.6666666666, ans=0.5 2023-10-02 21:14:41,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:14:41,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:14:46,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:48,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:14:53,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:14:53,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:14:54,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 21:14:54,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:14:54,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:14:54,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:14:54,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:14:57,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 21:15:02,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:02,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:05,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:15:05,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:15:05,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:15:09,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1021400.0, ans=0.0 2023-10-02 21:15:10,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 21:15:10,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 21:15:10,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 21:15:10,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:15:14,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:15:17,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 21:15:20,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:15:22,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:24,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 21:15:26,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:15:26,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:15:26,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:15:26,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:15:27,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:15:27,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1021466.6666666666, ans=10.0 2023-10-02 21:15:31,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:15:33,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 21:15:34,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:15:36,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:15:37,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:15:39,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:15:39,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 21:15:42,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:15:45,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 21:15:46,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:15:49,632 INFO [train.py:1046] (3/4) Epoch 29, batch 4500, loss[loss=0.155, simple_loss=0.2313, pruned_loss=0.03932, over 24422.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2442, pruned_loss=0.04379, over 4725408.10 frames. ], batch size: 58, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:15:52,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:15:53,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 21:15:53,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 21:15:55,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:16:00,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:16:01,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:16:01,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:16:02,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:16:02,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:16:04,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:16:06,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1021666.6666666666, ans=0.125 2023-10-02 21:16:08,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1021666.6666666666, ans=0.0 2023-10-02 21:16:16,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:16:16,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:16:19,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:16:19,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:16:20,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:16:26,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:16:31,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:16:34,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:16:37,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:16:38,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 21:16:39,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:16:39,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:16:40,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:16:40,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:16:44,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:16:44,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 21:16:44,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:16:44,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:16:50,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:16:50,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:16:53,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:16:54,476 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.911e+02 2.155e+02 2.344e+02 3.258e+02, threshold=4.309e+02, percent-clipped=0.0 2023-10-02 21:16:54,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:16:54,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:16:56,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 21:16:57,295 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.09 vs. limit=12.0 2023-10-02 21:16:58,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 21:16:58,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 21:17:00,052 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.17 vs. limit=10.0 2023-10-02 21:17:02,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 21:17:03,823 INFO [train.py:1046] (3/4) Epoch 29, batch 4550, loss[loss=0.1694, simple_loss=0.258, pruned_loss=0.04036, over 24640.00 frames. ], tot_loss[loss=0.165, simple_loss=0.2429, pruned_loss=0.04352, over 4725644.23 frames. ], batch size: 73, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:17:03,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 21:17:05,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:17:09,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:17:10,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:17:12,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:17:15,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1021933.3333333334, ans=0.125 2023-10-02 21:17:18,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:17:18,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:17:19,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:17:19,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:17:19,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:21,608 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=15.0 2023-10-02 21:17:23,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:17:23,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:17:24,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.02 vs. limit=15.0 2023-10-02 21:17:26,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:17:28,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1022000.0, ans=0.0 2023-10-02 21:17:29,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 21:17:29,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 21:17:31,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:17:34,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 21:17:35,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 21:17:37,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:17:40,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 21:17:41,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:17:44,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:44,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:46,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:17:47,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 21:17:49,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:17:49,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1022133.3333333334, ans=10.0 2023-10-02 21:17:50,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.38 vs. limit=10.0 2023-10-02 21:17:52,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:52,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:17:52,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:17:54,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 21:17:54,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 21:17:54,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:17:55,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 21:17:56,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 21:17:58,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:17:58,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:17:59,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:17:59,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:17:59,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:18:01,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:18:01,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1022200.0, ans=0.125 2023-10-02 21:18:02,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 21:18:05,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:18:05,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 21:18:05,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 21:18:05,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:18:05,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 21:18:05,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1022200.0, ans=0.125 2023-10-02 21:18:07,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1022200.0, ans=0.1 2023-10-02 21:18:08,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:18:08,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:18:10,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:18:11,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:18:11,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:18:14,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:18:16,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:18:17,925 INFO [train.py:1046] (3/4) Epoch 29, batch 4600, loss[loss=0.1482, simple_loss=0.2319, pruned_loss=0.03223, over 24347.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2411, pruned_loss=0.04309, over 4717126.01 frames. ], batch size: 61, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:18:19,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:19,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:18:22,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:18:22,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:18:23,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:18:24,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 21:18:26,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:18:26,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1022266.6666666666, ans=0.1 2023-10-02 21:18:30,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:18:30,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1022333.3333333334, ans=0.0 2023-10-02 21:18:32,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:18:34,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:42,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 21:18:44,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:45,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:18:49,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:18:49,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:18:53,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 21:18:53,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:18:54,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:18:59,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:00,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:19:00,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1022466.6666666666, ans=0.1 2023-10-02 21:19:02,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:19:06,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 21:19:07,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:19:11,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:12,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:19:17,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:17,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 21:19:17,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:17,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 21:19:17,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:19,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:19:20,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:20,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:19:22,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:19:23,160 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 1.896e+02 2.092e+02 2.608e+02 4.694e+02, threshold=4.185e+02, percent-clipped=1.0 2023-10-02 21:19:23,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 21:19:23,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 21:19:23,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 21:19:23,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:19:24,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:19:26,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:19:27,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:19:28,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1022533.3333333334, ans=0.0 2023-10-02 21:19:31,865 INFO [train.py:1046] (3/4) Epoch 29, batch 4650, loss[loss=0.1675, simple_loss=0.2591, pruned_loss=0.03793, over 24603.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2409, pruned_loss=0.04267, over 4712690.22 frames. ], batch size: 71, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:19:34,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1022600.0, ans=0.2 2023-10-02 21:19:38,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:19:40,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:19:41,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:41,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:19:43,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:19:43,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:19:43,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:19:46,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 21:19:48,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1022666.6666666666, ans=0.125 2023-10-02 21:19:49,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:19:51,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 21:19:51,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:19:53,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 21:19:53,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:19:54,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 21:19:54,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 21:19:54,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:19:54,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:19:56,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1022666.6666666666, ans=0.2 2023-10-02 21:19:57,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:19:59,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:00,563 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 21:20:02,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:03,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 21:20:07,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:20:07,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:20:07,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 21:20:08,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:20:11,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:20:14,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:20:18,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:20:21,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:22,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:20:22,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:20:24,365 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=12.0 2023-10-02 21:20:26,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 21:20:26,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 21:20:26,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 21:20:26,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 21:20:29,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:20:30,500 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.05 vs. limit=15.0 2023-10-02 21:20:35,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:20:36,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:20:36,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 21:20:36,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:20:38,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:20:38,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:20:39,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:20:42,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:20:42,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:20:43,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:20:45,714 INFO [train.py:1046] (3/4) Epoch 29, batch 4700, loss[loss=0.1632, simple_loss=0.2333, pruned_loss=0.04656, over 23439.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.242, pruned_loss=0.04331, over 4712366.79 frames. ], batch size: 285, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:20:47,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:20:49,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:20:49,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:20:50,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 21:20:51,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:20:51,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 21:20:59,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:20:59,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:21:01,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:21:01,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:21:02,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:21:04,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1023000.0, ans=0.125 2023-10-02 21:21:08,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 21:21:08,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 21:21:11,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:11,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:21:11,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.69 vs. limit=15.0 2023-10-02 21:21:12,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:21:14,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:19,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1023066.6666666666, ans=0.0 2023-10-02 21:21:20,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:21:21,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 21:21:25,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:21:29,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 21:21:29,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:21:33,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:35,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1023133.3333333334, ans=0.1 2023-10-02 21:21:36,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 21:21:38,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:21:43,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:21:43,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 21:21:45,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:45,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:21:48,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:21:48,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:21:48,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 21:21:50,435 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 21:21:51,670 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 1.909e+02 2.162e+02 2.551e+02 3.062e+02, threshold=4.324e+02, percent-clipped=0.0 2023-10-02 21:21:51,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:21:54,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:54,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:54,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 21:21:54,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:21:59,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 21:22:00,440 INFO [train.py:1046] (3/4) Epoch 29, batch 4750, loss[loss=0.1823, simple_loss=0.26, pruned_loss=0.05232, over 23394.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2426, pruned_loss=0.04331, over 4723445.83 frames. ], batch size: 93, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:22:01,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:22:01,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:06,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:06,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:22:07,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 21:22:09,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:22:11,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 21:22:13,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:22:14,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:22:14,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:22:20,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 21:22:25,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:22:27,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 21:22:28,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:22:32,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:22:32,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:22:32,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:33,904 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 21:22:33,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 21:22:38,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 21:22:41,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:22:42,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:22:45,007 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.64 vs. limit=15.0 2023-10-02 21:22:45,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:22:45,582 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 21:22:45,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:22:47,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:22:48,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:22:50,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 21:22:50,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 21:22:52,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:22:52,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:22:52,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:22:55,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:22:55,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 21:22:58,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 21:22:59,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:01,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:23:01,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 21:23:02,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:23:04,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:05,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:23:05,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:07,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:23:10,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:23:11,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 21:23:11,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 21:23:11,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 21:23:14,251 INFO [train.py:1046] (3/4) Epoch 29, batch 4800, loss[loss=0.153, simple_loss=0.2388, pruned_loss=0.03365, over 24496.00 frames. ], tot_loss[loss=0.166, simple_loss=0.244, pruned_loss=0.04404, over 4711005.36 frames. ], batch size: 66, lr: 3.50e-03, grad_scale: 32.0 2023-10-02 21:23:15,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:23:15,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:23:17,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 21:23:20,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:21,233 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.30 vs. limit=15.0 2023-10-02 21:23:21,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:27,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:23:28,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:23:28,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:29,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 21:23:30,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:23:30,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:23:31,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:23:31,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1023666.6666666666, ans=0.125 2023-10-02 21:23:34,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:23:37,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:37,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:23:37,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:37,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 21:23:37,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:38,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:23:41,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:23:44,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:44,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1023733.3333333334, ans=0.125 2023-10-02 21:23:45,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:23:45,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:23:47,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 21:23:47,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:48,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 21:23:48,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 21:23:50,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:23:50,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:23:52,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:23:52,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:23:52,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:23:54,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:23:55,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:23:58,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:23:59,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:02,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:07,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 21:24:07,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:24:08,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:08,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:24:09,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:24:14,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:24:14,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:24:14,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:15,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:24:16,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:24:16,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:24:19,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1023866.6666666666, ans=0.125 2023-10-02 21:24:21,850 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 1.889e+02 2.092e+02 2.309e+02 3.142e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-02 21:24:21,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:21,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:21,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:24:23,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 21:24:26,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 21:24:26,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:24:26,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:24:26,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:24:26,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:29,473 INFO [train.py:1046] (3/4) Epoch 29, batch 4850, loss[loss=0.1708, simple_loss=0.2434, pruned_loss=0.04915, over 23508.00 frames. ], tot_loss[loss=0.1664, simple_loss=0.2446, pruned_loss=0.04415, over 4721306.13 frames. ], batch size: 256, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:24:30,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:24:38,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 21:24:39,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:44,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:24:45,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:24:45,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:24:46,434 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.51 vs. limit=22.5 2023-10-02 21:24:48,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:24:48,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:24:49,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:24:49,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 21:24:53,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:24:56,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:24:56,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:24:57,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:24:57,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 21:25:00,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:25:01,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:06,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:06,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 21:25:06,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 21:25:07,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:25:13,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:25:13,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 21:25:15,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:25:15,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:25:15,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:25:16,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 21:25:16,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:18,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 21:25:18,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:25:20,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:25:20,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 21:25:27,207 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.10 vs. limit=15.0 2023-10-02 21:25:31,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:25:32,668 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:25:37,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:25:37,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:25:41,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 21:25:41,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:25:42,575 INFO [train.py:1046] (3/4) Epoch 29, batch 4900, loss[loss=0.1636, simple_loss=0.2179, pruned_loss=0.05469, over 19152.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2426, pruned_loss=0.044, over 4714897.37 frames. ], batch size: 389, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:25:45,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:25:46,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:25:48,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:25:51,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 21:25:52,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1024266.6666666666, ans=0.2 2023-10-02 21:25:58,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 21:26:03,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 21:26:04,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 21:26:04,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:26:04,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:26:04,982 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.11 vs. limit=15.0 2023-10-02 21:26:05,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:26:05,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:26:05,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:26:05,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 21:26:05,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1024333.3333333334, ans=0.125 2023-10-02 21:26:08,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 21:26:08,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:26:10,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:26:12,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:26:12,648 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.81 vs. limit=15.0 2023-10-02 21:26:13,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:26:13,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:26:16,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:26:16,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 21:26:17,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:26:18,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:26:19,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 21:26:19,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 21:26:19,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1024400.0, ans=0.0 2023-10-02 21:26:23,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 21:26:25,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:26:25,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:26:26,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:26:26,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:26:26,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 21:26:26,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:26:27,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 21:26:29,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:26:31,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:26:32,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:26:37,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 21:26:37,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:26:39,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 21:26:39,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 21:26:45,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:26:46,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1024533.3333333334, ans=0.2 2023-10-02 21:26:47,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:26:48,529 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.893e+02 2.056e+02 2.305e+02 3.001e+02, threshold=4.112e+02, percent-clipped=0.0 2023-10-02 21:26:48,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 21:26:48,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 21:26:48,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:26:50,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:26:54,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:26:54,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:26:54,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:26:54,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 21:26:56,266 INFO [train.py:1046] (3/4) Epoch 29, batch 4950, loss[loss=0.1812, simple_loss=0.2501, pruned_loss=0.05617, over 23672.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2418, pruned_loss=0.04356, over 4706963.29 frames. ], batch size: 164, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:26:56,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:26:59,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:26:59,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 21:26:59,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1024600.0, ans=0.125 2023-10-02 21:27:03,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 21:27:03,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 21:27:03,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:27:05,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 21:27:05,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:05,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:27:07,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:27:07,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:10,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:27:10,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:27:11,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:27:13,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:27:14,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:14,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:27:18,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:27:22,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:23,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:27:26,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:26,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:26,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1024733.3333333334, ans=0.0 2023-10-02 21:27:27,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:27:30,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 21:27:32,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 21:27:33,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1024733.3333333334, ans=0.1 2023-10-02 21:27:35,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:36,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:27:36,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:27:37,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:27:37,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:27:39,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:27:41,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:27:42,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:27:45,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:27:46,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:27:48,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:48,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 21:27:48,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:27:49,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:27:53,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:27:53,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1024800.0, ans=0.1 2023-10-02 21:27:54,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:27:54,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:27:54,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:27:56,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:27:56,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:27:57,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:27:58,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:27:58,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:28:00,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 21:28:03,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:28:08,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 21:28:08,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 21:28:11,016 INFO [train.py:1046] (3/4) Epoch 29, batch 5000, loss[loss=0.1611, simple_loss=0.2471, pruned_loss=0.0375, over 24326.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2413, pruned_loss=0.04337, over 4705458.42 frames. ], batch size: 77, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:28:15,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:28:15,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:28:16,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 21:28:17,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 21:28:21,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:28:23,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 21:28:23,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:28:23,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:28:25,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 21:28:25,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:28:26,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:28:26,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 21:28:26,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:28:27,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:28:29,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 21:28:29,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 21:28:29,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:28:30,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 21:28:30,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:28:30,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:32,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:28:32,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 21:28:32,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 21:28:33,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 21:28:33,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:28:35,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:36,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 21:28:36,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:28:38,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:40,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:28:42,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 21:28:43,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 21:28:43,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:28:45,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:28:45,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1025066.6666666666, ans=0.125 2023-10-02 21:28:46,817 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:28:47,957 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 21:28:51,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:28:51,923 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.44 vs. limit=15.0 2023-10-02 21:28:53,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:28:53,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:28:58,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 21:28:58,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:28:58,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:28:58,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:28:59,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 21:29:01,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:29:03,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:29:05,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:29:11,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 21:29:14,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:17,602 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.798e+02 1.947e+02 2.210e+02 2.765e+02, threshold=3.894e+02, percent-clipped=0.0 2023-10-02 21:29:25,081 INFO [train.py:1046] (3/4) Epoch 29, batch 5050, loss[loss=0.1581, simple_loss=0.2483, pruned_loss=0.03397, over 24473.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2413, pruned_loss=0.04339, over 4701440.26 frames. ], batch size: 69, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:29:25,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:29:26,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:26,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:29:26,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:29:26,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:29:26,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:29:28,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:28,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1025266.6666666666, ans=0.0 2023-10-02 21:29:30,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:29:30,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 21:29:32,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:29:33,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:29:34,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:29:35,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 21:29:35,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:29:36,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:29:39,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:29:41,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:29:41,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:29:53,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 21:29:53,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:29:53,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:29:55,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 21:29:55,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:29:56,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:29:56,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:29:56,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:29:56,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 21:29:58,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 21:29:59,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:30:02,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:30:05,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:30:05,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 21:30:06,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:30:09,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 21:30:11,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:30:11,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:30:11,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:30:13,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:30:15,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:30:17,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:30:19,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:19,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:30:20,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:30:20,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 21:30:20,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:30:22,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1025466.6666666666, ans=0.2 2023-10-02 21:30:23,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:30:26,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:30:26,534 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 21:30:26,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:30:27,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:30:29,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:29,285 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 21:30:29,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1025533.3333333334, ans=0.125 2023-10-02 21:30:32,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:30:32,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 21:30:32,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:33,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1025533.3333333334, ans=0.125 2023-10-02 21:30:36,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:30:36,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:30:36,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 21:30:37,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 21:30:38,167 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.53 vs. limit=15.0 2023-10-02 21:30:39,565 INFO [train.py:1046] (3/4) Epoch 29, batch 5100, loss[loss=0.1616, simple_loss=0.2573, pruned_loss=0.03297, over 24333.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2419, pruned_loss=0.04329, over 4716692.80 frames. ], batch size: 74, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:30:40,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:30:40,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:30:41,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:30:43,739 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 21:30:46,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:30:47,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 21:30:49,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 21:30:51,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:30:51,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:30:54,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:30:54,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 21:30:55,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 21:30:56,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1025666.6666666666, ans=0.07 2023-10-02 21:30:58,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1025666.6666666666, ans=0.125 2023-10-02 21:31:00,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:31:01,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:31:04,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:31:05,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1025666.6666666666, ans=0.2 2023-10-02 21:31:08,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 21:31:08,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:31:10,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:31:10,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 21:31:11,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:11,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:11,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 21:31:13,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1025733.3333333334, ans=0.125 2023-10-02 21:31:14,804 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 21:31:14,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:16,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 21:31:16,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 21:31:18,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=1025733.3333333334, ans=0.025 2023-10-02 21:31:19,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:31:20,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1025733.3333333334, ans=0.1 2023-10-02 21:31:23,180 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.10 vs. limit=15.0 2023-10-02 21:31:24,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1025800.0, ans=0.1 2023-10-02 21:31:26,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:31:28,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 21:31:28,800 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 21:31:28,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 21:31:31,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 21:31:31,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:31:32,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 21:31:37,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 21:31:40,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 21:31:41,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:31:44,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 21:31:45,176 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.06 vs. limit=15.0 2023-10-02 21:31:46,092 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.819e+02 1.985e+02 2.208e+02 2.966e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-02 21:31:47,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:31:47,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 21:31:52,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:31:52,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:31:52,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:31:52,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1025933.3333333334, ans=0.125 2023-10-02 21:31:53,625 INFO [train.py:1046] (3/4) Epoch 29, batch 5150, loss[loss=0.1699, simple_loss=0.2566, pruned_loss=0.04163, over 24615.00 frames. ], tot_loss[loss=0.165, simple_loss=0.243, pruned_loss=0.04349, over 4727912.36 frames. ], batch size: 68, lr: 3.50e-03, grad_scale: 16.0 2023-10-02 21:31:53,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:31:53,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:31:55,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:31:56,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 21:31:56,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 21:31:57,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 21:31:57,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:31:57,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 21:31:59,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:32:00,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 21:32:02,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:32:05,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:32:09,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:32:09,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 21:32:10,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:32:10,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:32:11,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1026000.0, ans=0.125 2023-10-02 21:32:12,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:32:12,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:32:12,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:32:12,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:32:13,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:32:13,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 21:32:15,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:32:15,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:32:16,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.12 vs. limit=15.0 2023-10-02 21:32:18,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:32:20,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 21:32:22,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:32:28,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:32:29,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 21:32:33,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:32:35,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1026066.6666666666, ans=0.125 2023-10-02 21:32:39,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:32:40,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:32:41,700 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=15.21 vs. limit=15.0 2023-10-02 21:32:42,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:32:44,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:32:47,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 21:32:47,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1026133.3333333334, ans=0.1 2023-10-02 21:32:50,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:32:51,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:32:51,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:32:54,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:32:55,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:32:56,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 21:32:59,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=1026200.0, ans=0.5 2023-10-02 21:33:02,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:33:03,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:33:05,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:33:05,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:33:06,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:33:06,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:33:07,678 INFO [train.py:1046] (3/4) Epoch 29, batch 5200, loss[loss=0.1671, simple_loss=0.2534, pruned_loss=0.04045, over 24361.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2432, pruned_loss=0.04368, over 4714967.70 frames. ], batch size: 77, lr: 3.50e-03, grad_scale: 32.0 2023-10-02 21:33:07,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:33:07,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:33:10,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:33:12,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:33:13,168 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.86 vs. limit=8.0 2023-10-02 21:33:13,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1026266.6666666666, ans=0.2 2023-10-02 21:33:14,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:33:19,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 21:33:20,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:33:22,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:33:24,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:33:25,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:33:25,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:33:25,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1026333.3333333334, ans=0.125 2023-10-02 21:33:28,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 21:33:30,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1026333.3333333334, ans=0.025 2023-10-02 21:33:31,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:33:31,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:33:31,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1026333.3333333334, ans=0.125 2023-10-02 21:33:32,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 21:33:34,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:33:36,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=1026400.0, ans=6.0 2023-10-02 21:33:37,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:33:37,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 21:33:37,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 21:33:39,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 21:33:40,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:33:40,511 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 21:33:40,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:33:41,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:33:43,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:33:43,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 21:33:44,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:33:46,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:33:50,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 21:33:51,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 21:33:51,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 21:33:56,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 21:33:56,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:33:59,524 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.02 vs. limit=15.0 2023-10-02 21:34:02,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:34:03,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:34:03,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 21:34:04,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:34:04,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 21:34:04,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:34:05,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:34:09,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:34:11,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:34:12,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:34:12,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:34:12,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:34:14,167 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.881e+02 2.113e+02 2.405e+02 3.374e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-02 21:34:15,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1026533.3333333334, ans=0.1 2023-10-02 21:34:16,037 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:34:17,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:34:18,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 21:34:20,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:34:20,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:34:20,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:34:22,026 INFO [train.py:1046] (3/4) Epoch 29, batch 5250, loss[loss=0.1613, simple_loss=0.2391, pruned_loss=0.04169, over 23257.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2429, pruned_loss=0.04365, over 4715288.90 frames. ], batch size: 119, lr: 3.50e-03, grad_scale: 32.0 2023-10-02 21:34:22,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:34:23,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:34:25,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:34:26,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:34:26,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:34:29,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:34:34,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:34:37,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:34:38,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:34:39,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:34:40,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1026666.6666666666, ans=0.125 2023-10-02 21:34:42,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 21:34:42,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:34:43,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:35:11,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1026800.0, ans=0.125 2023-10-02 21:35:17,288 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.66 vs. limit=15.0 2023-10-02 21:35:19,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.17 vs. limit=10.0 2023-10-02 21:35:20,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1026866.6666666666, ans=0.0 2023-10-02 21:35:20,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1026866.6666666666, ans=0.0 2023-10-02 21:35:27,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1026866.6666666666, ans=0.125 2023-10-02 21:35:30,826 INFO [train.py:1046] (3/4) Epoch 29, batch 5300, loss[loss=0.1301, simple_loss=0.1807, pruned_loss=0.0398, over 18966.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2421, pruned_loss=0.04325, over 4712241.12 frames. ], batch size: 388, lr: 3.49e-03, grad_scale: 32.0 2023-10-02 21:35:45,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:35:45,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 21:35:45,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 21:35:45,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:35:45,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:45,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:45,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:45,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:35:45,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:35:45,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:35:45,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:35:45,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:35:45,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 21:35:46,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 21:35:46,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 21:35:46,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 21:35:46,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 21:35:46,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 21:35:46,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:46,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:35:46,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:35:47,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:35:47,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:35:47,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:35:47,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:35:47,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:47,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:35:47,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:35:47,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:35:47,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:47,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:35:48,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 21:35:48,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:35:48,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:35:48,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 21:35:48,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 21:35:48,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:35:48,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:35:48,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 21:35:48,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 21:35:48,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:35:49,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:35:49,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:35:50,030 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 21:35:50,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 21:35:50,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:35:50,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:35:50,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 21:35:50,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 21:35:50,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 21:35:50,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:35:57,020 INFO [train.py:1046] (3/4) Epoch 30, batch 0, loss[loss=0.1709, simple_loss=0.2554, pruned_loss=0.04316, over 24566.00 frames. ], tot_loss[loss=0.1709, simple_loss=0.2554, pruned_loss=0.04316, over 24566.00 frames. ], batch size: 71, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:35:57,021 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 21:36:03,607 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.2093, 5.0816, 4.7908, 4.4032], device='cuda:3') 2023-10-02 21:36:08,957 INFO [train.py:1078] (3/4) Epoch 30, validation: loss=0.3201, simple_loss=0.2693, pruned_loss=0.1854, over 1125622.00 frames. 2023-10-02 21:36:08,957 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 21:36:10,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 21:36:11,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:36:14,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:36:20,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:36:20,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:36:22,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:23,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 21:36:24,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 21:36:26,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:27,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:31,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:36:31,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:36:33,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:36:33,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:36:33,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 21:36:36,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:36:41,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:36:41,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:36:42,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 21:36:46,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:36:46,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:36:47,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:36:50,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:36:54,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:36:58,472 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.69 vs. limit=10.0 2023-10-02 21:36:58,862 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 2.060e+02 2.425e+02 2.930e+02 5.326e+02, threshold=4.849e+02, percent-clipped=3.0 2023-10-02 21:37:00,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 21:37:05,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 21:37:05,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:37:05,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:37:06,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:37:06,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:37:09,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 21:37:10,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:37:11,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:37:17,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:37:18,583 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:37:19,841 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 21:37:21,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:37:22,490 INFO [train.py:1046] (3/4) Epoch 30, batch 50, loss[loss=0.1651, simple_loss=0.2346, pruned_loss=0.04781, over 22767.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2447, pruned_loss=0.04298, over 1076803.06 frames. ], batch size: 322, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:37:23,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:37:25,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:37:25,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 21:37:25,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1027346.6666666666, ans=0.1 2023-10-02 21:37:26,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:37:26,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:37:29,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:37:30,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:37:33,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:37:34,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 21:37:36,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:37:38,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1027413.3333333334, ans=0.125 2023-10-02 21:37:42,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:37:43,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 21:37:45,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 21:37:48,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:37:50,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:37:50,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:37:52,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:37:52,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:37:53,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 21:37:53,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:37:55,651 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.70 vs. limit=10.0 2023-10-02 21:37:56,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1027480.0, ans=0.035 2023-10-02 21:37:59,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1027480.0, ans=0.125 2023-10-02 21:37:59,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1027480.0, ans=0.125 2023-10-02 21:38:00,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:38:00,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:38:01,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:38:03,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 21:38:04,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:38:04,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:38:04,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 21:38:04,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1027546.6666666666, ans=0.125 2023-10-02 21:38:05,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:38:08,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 21:38:08,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1027546.6666666666, ans=0.2 2023-10-02 21:38:13,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:38:13,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:38:13,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1027546.6666666666, ans=0.0 2023-10-02 21:38:14,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.75 vs. limit=22.5 2023-10-02 21:38:15,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:38:18,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:38:18,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:38:20,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1027613.3333333334, ans=0.1 2023-10-02 21:38:21,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 21:38:21,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 21:38:23,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:38:23,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:38:24,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:38:24,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:38:26,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 21:38:26,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 21:38:27,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 21:38:30,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:30,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:38:31,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 21:38:31,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 21:38:31,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:33,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:38:34,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:38:34,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:38:36,014 INFO [train.py:1046] (3/4) Epoch 30, batch 100, loss[loss=0.1535, simple_loss=0.2381, pruned_loss=0.03438, over 24503.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.243, pruned_loss=0.04172, over 1898388.68 frames. ], batch size: 63, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:38:37,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:38:40,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:38:43,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:38:43,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1027680.0, ans=0.0 2023-10-02 21:38:44,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 21:38:44,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:38:47,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1027680.0, ans=10.0 2023-10-02 21:38:49,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:38:49,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:38:49,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:38:49,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:38:49,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:38:51,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 21:38:53,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:38:53,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:53,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:38:53,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:38:57,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 21:38:57,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1027746.6666666666, ans=0.125 2023-10-02 21:38:58,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:38:58,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:39:00,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:39:01,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:39:05,827 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 21:39:05,842 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 21:39:05,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:05,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:39:11,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:39:13,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:39:14,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:19,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:19,452 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 21:39:22,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 21:39:25,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:39:26,744 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.810e+02 1.965e+02 2.263e+02 3.377e+02, threshold=3.931e+02, percent-clipped=0.0 2023-10-02 21:39:26,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:39:29,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:32,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:33,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:39:35,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:39:37,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:38,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1027946.6666666666, ans=0.2 2023-10-02 21:39:39,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:39:40,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:40,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:39:40,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:39:40,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 21:39:42,670 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 21:39:42,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:42,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:39:42,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:42,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:44,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 21:39:44,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:39:45,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:39:45,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:45,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:39:47,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:47,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:39:48,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:39:50,455 INFO [train.py:1046] (3/4) Epoch 30, batch 150, loss[loss=0.1448, simple_loss=0.2278, pruned_loss=0.0309, over 24308.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2441, pruned_loss=0.04339, over 2522561.12 frames. ], batch size: 61, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:39:50,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:39:53,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:39:53,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:39:53,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:39:56,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:39:56,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:00,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:40:00,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:02,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1028013.3333333334, ans=0.125 2023-10-02 21:40:04,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 21:40:04,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 21:40:04,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 21:40:07,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:40:07,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:40:08,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:40:08,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:40:08,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:40:10,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:12,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:40:13,597 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 21:40:14,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:40:21,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:40:25,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:40:27,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 21:40:30,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:40:30,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:40:30,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:40:31,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:40:33,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:40:34,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:40:34,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:40:36,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 21:40:41,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:40:43,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:40:43,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:40:43,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:40:44,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1028213.3333333334, ans=0.125 2023-10-02 21:40:46,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:40:47,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 21:40:49,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:40:50,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:40:51,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:40:53,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:40:53,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 21:40:53,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:40:53,792 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 21:40:57,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:40:59,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:41:00,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:41:00,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1028280.0, ans=0.125 2023-10-02 21:41:00,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1028280.0, ans=0.125 2023-10-02 21:41:02,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 21:41:03,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:41:03,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1028346.6666666666, ans=0.07 2023-10-02 21:41:04,559 INFO [train.py:1046] (3/4) Epoch 30, batch 200, loss[loss=0.1615, simple_loss=0.2503, pruned_loss=0.03636, over 24343.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2446, pruned_loss=0.04361, over 3012111.15 frames. ], batch size: 77, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:41:04,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:06,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 21:41:08,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 21:41:10,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:11,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:41:11,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1028346.6666666666, ans=0.125 2023-10-02 21:41:16,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:41:17,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:41:17,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:23,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1028413.3333333334, ans=0.2 2023-10-02 21:41:25,503 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:41:35,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:41:37,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:41:37,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:41:37,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:41:39,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 21:41:39,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:41:41,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:41:42,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:41:43,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:41:43,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:41:45,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 21:41:46,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 21:41:46,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:41:51,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:41:53,897 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.828e+02 1.977e+02 2.173e+02 2.870e+02, threshold=3.954e+02, percent-clipped=0.0 2023-10-02 21:41:55,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:42:00,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:00,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:42:07,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:09,766 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.20 vs. limit=22.5 2023-10-02 21:42:10,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 21:42:10,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:42:11,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:42:11,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:42:13,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:42:14,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 21:42:14,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:42:15,818 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 21:42:17,148 INFO [train.py:1046] (3/4) Epoch 30, batch 250, loss[loss=0.1709, simple_loss=0.256, pruned_loss=0.04288, over 24441.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2439, pruned_loss=0.04392, over 3381096.94 frames. ], batch size: 69, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:42:17,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:19,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:42:19,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:19,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:42:22,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:42:23,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:42:24,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:42:28,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:42:38,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:42:41,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:42:41,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:42:47,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:42:48,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:42:49,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:42:49,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:42:51,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:42:51,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:42:51,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:42:54,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:42:56,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 21:42:57,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:42:59,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:42:59,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:42:59,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:43:01,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:43:03,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:43:03,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:43:04,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:43:05,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:43:05,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:43:10,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:43:10,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1028880.0, ans=0.125 2023-10-02 21:43:12,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:43:16,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:43:19,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:43:22,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:43:28,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 21:43:30,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:43:30,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 21:43:31,453 INFO [train.py:1046] (3/4) Epoch 30, batch 300, loss[loss=0.1641, simple_loss=0.2559, pruned_loss=0.0362, over 24657.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.242, pruned_loss=0.04294, over 3680333.79 frames. ], batch size: 73, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:43:31,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 21:43:31,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:43:33,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:43:34,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 21:43:37,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:43:39,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:43:42,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:43:43,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 21:43:44,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:43:44,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 21:43:45,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1029080.0, ans=0.025 2023-10-02 21:43:46,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 21:43:46,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:43:46,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1029080.0, ans=0.125 2023-10-02 21:43:49,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1029080.0, ans=0.125 2023-10-02 21:43:50,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:43:51,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1029080.0, ans=0.5 2023-10-02 21:43:53,627 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.26 vs. limit=15.0 2023-10-02 21:43:54,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:43:54,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 21:43:57,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 21:43:57,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:01,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:44:02,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:02,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 21:44:02,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:44:06,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:44:07,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:44:07,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:44:11,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 21:44:11,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 21:44:11,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:44:14,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:15,361 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.03 vs. limit=15.0 2023-10-02 21:44:15,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 21:44:15,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:44:21,388 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.890e+02 2.076e+02 2.319e+02 2.784e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-02 21:44:22,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:44:24,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:44:24,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 21:44:28,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:29,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:44:31,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:32,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1029280.0, ans=0.0 2023-10-02 21:44:33,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:44:33,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 21:44:35,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:44:35,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:44:37,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 21:44:38,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:44:38,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:38,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1029280.0, ans=0.0 2023-10-02 21:44:41,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:44:41,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:44:41,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:41,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1029280.0, ans=0.1 2023-10-02 21:44:45,537 INFO [train.py:1046] (3/4) Epoch 30, batch 350, loss[loss=0.1601, simple_loss=0.2297, pruned_loss=0.04523, over 23512.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2408, pruned_loss=0.04227, over 3912398.56 frames. ], batch size: 285, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:44:46,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:44:46,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 21:44:49,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:55,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:44:56,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:44:58,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:44:59,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 21:45:01,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:45:02,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 21:45:05,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:45:05,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 21:45:07,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:45:10,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 21:45:11,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:45:13,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:45:14,011 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.12 vs. limit=15.0 2023-10-02 21:45:14,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:45:14,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:16,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:16,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:45:16,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:45:17,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:45:18,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:45:18,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:45:23,861 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.60 vs. limit=15.0 2023-10-02 21:45:24,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:45:24,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:45:25,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:45:27,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:45:31,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 21:45:31,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:45:34,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1029546.6666666666, ans=0.025 2023-10-02 21:45:37,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1029546.6666666666, ans=0.125 2023-10-02 21:45:39,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:45:39,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:45:39,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:45:39,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 21:45:41,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:45:43,188 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 21:45:43,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 21:45:44,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:46,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:45:46,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 21:45:47,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:45:50,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 21:45:51,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:45:52,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:45:52,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:45:54,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:45:57,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:45:58,322 INFO [train.py:1046] (3/4) Epoch 30, batch 400, loss[loss=0.1486, simple_loss=0.2256, pruned_loss=0.03586, over 24623.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2413, pruned_loss=0.04202, over 4101366.34 frames. ], batch size: 60, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:45:58,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1029680.0, ans=0.125 2023-10-02 21:46:00,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:46:01,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 21:46:01,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:46:01,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:46:03,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:46:05,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:06,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:46:06,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:10,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 21:46:10,824 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.39 vs. limit=10.0 2023-10-02 21:46:11,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 21:46:11,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:46:12,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 21:46:14,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:18,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:46:18,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:46:19,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 21:46:19,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:46:19,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:46:19,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:46:20,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1029746.6666666666, ans=0.125 2023-10-02 21:46:21,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:46:22,789 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 21:46:22,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 21:46:27,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:46:29,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:46:29,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 21:46:31,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 21:46:35,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:46:37,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:46:44,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 21:46:46,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1029880.0, ans=0.0 2023-10-02 21:46:48,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:46:49,949 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.876e+02 2.050e+02 2.332e+02 4.522e+02, threshold=4.101e+02, percent-clipped=1.0 2023-10-02 21:46:50,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 21:46:52,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:46:55,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:46:55,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 21:46:55,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1029880.0, ans=0.0 2023-10-02 21:46:59,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:47:01,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 21:47:02,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:47:03,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:05,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 21:47:06,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 21:47:07,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 21:47:10,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:47:10,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:47:11,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 21:47:13,227 INFO [train.py:1046] (3/4) Epoch 30, batch 450, loss[loss=0.1657, simple_loss=0.2388, pruned_loss=0.04636, over 23710.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2421, pruned_loss=0.04233, over 4245139.27 frames. ], batch size: 212, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:47:15,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:47:15,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:47:15,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 21:47:16,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 21:47:16,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:47:18,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:47:18,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:47:18,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 21:47:19,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:47:20,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 21:47:23,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:47:25,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.02 vs. limit=15.0 2023-10-02 21:47:32,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:32,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1030080.0, ans=0.125 2023-10-02 21:47:33,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:47:35,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 21:47:35,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 21:47:38,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1030080.0, ans=0.0 2023-10-02 21:47:39,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:47:42,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:45,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:47:48,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:47:50,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:47:51,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 21:47:52,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 21:47:53,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 21:47:53,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:47:53,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1030146.6666666666, ans=0.125 2023-10-02 21:47:55,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:47:55,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:47:57,235 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 21:47:57,243 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 21:47:57,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:47:58,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:48:00,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 21:48:02,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:48:04,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 21:48:04,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 21:48:04,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 21:48:07,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:48:09,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:48:10,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:48:12,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 21:48:13,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1030280.0, ans=0.125 2023-10-02 21:48:14,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:48:15,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1030280.0, ans=0.0 2023-10-02 21:48:16,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 21:48:16,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 21:48:18,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 21:48:22,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:48:23,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:48:25,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:48:25,121 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-02 21:48:26,421 INFO [train.py:1046] (3/4) Epoch 30, batch 500, loss[loss=0.1566, simple_loss=0.2508, pruned_loss=0.03118, over 24458.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.243, pruned_loss=0.04273, over 4339844.42 frames. ], batch size: 69, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:48:29,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:48:30,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:48:30,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:48:30,888 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-02 21:48:32,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-02 21:48:32,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:48:32,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1030346.6666666666, ans=0.125 2023-10-02 21:48:35,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:48:35,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1030346.6666666666, ans=0.1 2023-10-02 21:48:40,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 21:48:41,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-02 21:48:44,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:48:44,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:48:44,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:48:53,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1030413.3333333334, ans=0.2 2023-10-02 21:48:54,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1030480.0, ans=0.0 2023-10-02 21:48:55,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:48:55,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-02 21:48:56,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-02 21:48:57,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:48:57,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-02 21:48:57,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 21:49:00,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:49:01,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:49:01,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:49:01,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:02,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-02 21:49:03,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1030480.0, ans=0.07 2023-10-02 21:49:07,017 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-02 21:49:08,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:49:10,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:12,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:12,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:12,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:49:15,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-02 21:49:16,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:49:18,449 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.804e+02 1.945e+02 2.150e+02 2.589e+02, threshold=3.890e+02, percent-clipped=0.0 2023-10-02 21:49:18,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:21,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:49:22,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1030546.6666666666, ans=0.1 2023-10-02 21:49:25,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:49:29,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:49:31,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-02 21:49:31,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:31,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:49:35,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-02 21:49:35,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-02 21:49:38,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:39,881 INFO [train.py:1046] (3/4) Epoch 30, batch 550, loss[loss=0.1649, simple_loss=0.2405, pruned_loss=0.04466, over 23794.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2432, pruned_loss=0.04277, over 4436184.08 frames. ], batch size: 212, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:49:42,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-02 21:49:44,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-02 21:49:44,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:49:44,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-02 21:49:44,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:49:44,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:49:46,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:46,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:47,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:49:48,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:49:50,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:49:51,965 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.92 vs. limit=6.0 2023-10-02 21:49:52,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-02 21:49:52,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:49:56,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:49:56,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:49:59,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:50:00,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:50:05,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-02 21:50:07,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-02 21:50:08,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:50:14,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:50:14,731 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 21:50:14,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1030813.3333333334, ans=0.125 2023-10-02 21:50:15,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:50:15,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:50:18,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:18,916 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-02 21:50:20,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:50:22,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 21:50:24,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 21:50:24,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 21:50:24,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:50:26,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:27,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-02 21:50:29,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-02 21:50:30,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:50:30,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:50:30,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:50:30,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:50:32,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:50:33,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-02 21:50:36,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:50:36,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:36,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1030880.0, ans=0.125 2023-10-02 21:50:38,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 21:50:38,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:50:40,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:50:40,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:50:42,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:50:44,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-02 21:50:44,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-02 21:50:50,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-02 21:50:54,247 INFO [train.py:1046] (3/4) Epoch 30, batch 600, loss[loss=0.1518, simple_loss=0.2324, pruned_loss=0.03562, over 24301.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2435, pruned_loss=0.04309, over 4497057.90 frames. ], batch size: 61, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:50:54,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-02 21:50:54,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:50:55,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:50:55,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:51:02,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:51:05,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 21:51:05,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-02 21:51:05,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1031013.3333333334, ans=0.1 2023-10-02 21:51:06,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-02 21:51:09,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:51:11,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:51:14,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-02 21:51:14,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:51:20,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-02 21:51:24,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:51:24,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:51:26,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:51:30,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:51:30,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:51:31,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:51:37,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:51:40,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1031213.3333333334, ans=0.2 2023-10-02 21:51:42,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:51:42,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:51:42,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:51:44,815 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.850e+02 2.032e+02 2.229e+02 3.048e+02, threshold=4.063e+02, percent-clipped=0.0 2023-10-02 21:51:47,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1031213.3333333334, ans=0.0 2023-10-02 21:51:49,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-02 21:51:55,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:51:55,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:51:58,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-02 21:51:59,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-02 21:52:01,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-02 21:52:02,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:52:02,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:52:06,845 INFO [train.py:1046] (3/4) Epoch 30, batch 650, loss[loss=0.1771, simple_loss=0.2459, pruned_loss=0.05412, over 23787.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.243, pruned_loss=0.04302, over 4549505.88 frames. ], batch size: 164, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:52:07,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 21:52:08,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-02 21:52:10,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:52:11,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:52:13,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:16,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-02 21:52:17,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:52:22,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 21:52:22,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:52:26,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:52:29,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-02 21:52:30,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:52:32,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:52:34,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:52:35,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 21:52:38,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:52:38,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:40,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:52:41,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:42,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 21:52:45,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 21:52:45,477 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-02 21:52:45,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:52:45,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:52:47,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:48,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:52:48,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:52:50,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-02 21:52:50,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1031546.6666666666, ans=0.125 2023-10-02 21:52:51,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-02 21:52:53,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:52:53,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-02 21:52:54,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-02 21:52:54,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:52:55,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1031546.6666666666, ans=0.0 2023-10-02 21:52:56,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 21:52:56,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-02 21:52:57,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-02 21:52:58,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:52:58,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:52:59,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:52:59,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:53:00,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1031546.6666666666, ans=0.025 2023-10-02 21:53:00,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1031546.6666666666, ans=0.07 2023-10-02 21:53:01,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:53:06,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:06,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:53:06,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1031613.3333333334, ans=0.1 2023-10-02 21:53:07,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:53:11,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:53:11,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 21:53:12,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:53:14,605 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.27 vs. limit=15.0 2023-10-02 21:53:19,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:53:19,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:53:19,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:53:19,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:53:21,522 INFO [train.py:1046] (3/4) Epoch 30, batch 700, loss[loss=0.1733, simple_loss=0.2446, pruned_loss=0.05094, over 18766.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2418, pruned_loss=0.043, over 4561116.13 frames. ], batch size: 41, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:53:24,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-02 21:53:24,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-02 21:53:27,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-02 21:53:28,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:29,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:53:31,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-02 21:53:34,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1031746.6666666666, ans=0.125 2023-10-02 21:53:35,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:53:35,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1031746.6666666666, ans=0.125 2023-10-02 21:53:39,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:53:41,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:43,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-02 21:53:44,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:53:46,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:53:49,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 21:53:49,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:53:50,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1031813.3333333334, ans=0.1 2023-10-02 21:53:52,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-02 21:53:53,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1031813.3333333334, ans=0.125 2023-10-02 21:53:54,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-02 21:53:58,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-02 21:53:58,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:53:59,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-02 21:54:01,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1031813.3333333334, ans=0.1 2023-10-02 21:54:03,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1031880.0, ans=0.125 2023-10-02 21:54:04,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:54:05,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-02 21:54:09,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:54:09,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:54:10,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-02 21:54:12,184 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.810e+02 2.003e+02 2.229e+02 3.158e+02, threshold=4.006e+02, percent-clipped=0.0 2023-10-02 21:54:13,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:54:15,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:54:19,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:54:22,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-02 21:54:22,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-02 21:54:24,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-02 21:54:26,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-02 21:54:27,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:54:29,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:54:30,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:54:33,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:54:33,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-02 21:54:33,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1032013.3333333334, ans=0.125 2023-10-02 21:54:33,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1032013.3333333334, ans=0.2 2023-10-02 21:54:34,472 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.38 vs. limit=15.0 2023-10-02 21:54:34,949 INFO [train.py:1046] (3/4) Epoch 30, batch 750, loss[loss=0.1537, simple_loss=0.2282, pruned_loss=0.03953, over 23677.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2406, pruned_loss=0.04279, over 4585479.11 frames. ], batch size: 149, lr: 3.43e-03, grad_scale: 16.0 2023-10-02 21:54:38,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-02 21:54:39,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-02 21:54:39,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-02 21:54:39,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1032013.3333333334, ans=0.1 2023-10-02 21:54:41,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-02 21:54:42,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-02 21:54:43,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:54:43,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-02 21:54:45,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:54:45,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:54:47,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:54:49,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1032080.0, ans=0.2 2023-10-02 21:54:50,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:54:50,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-02 21:54:50,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1032080.0, ans=0.0 2023-10-02 21:54:51,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:54:52,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:54:53,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 21:54:55,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:54:58,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:54:58,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:55:00,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-02 21:55:00,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1032080.0, ans=0.2 2023-10-02 21:55:01,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-02 21:55:02,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:55:03,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:55:04,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-02 21:55:06,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-02 21:55:06,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:55:07,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-02 21:55:07,487 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-02 21:55:08,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-02 21:55:08,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:55:08,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 21:55:10,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 21:55:16,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:55:18,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:18,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:55:20,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:55:22,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:55:22,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-02 21:55:24,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:55:24,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-02 21:55:25,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:55:28,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:55:28,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-02 21:55:29,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:34,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:55:35,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 21:55:37,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:55:38,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:55:43,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-02 21:55:43,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:55:44,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:55:47,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:55:47,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:55:48,929 INFO [train.py:1046] (3/4) Epoch 30, batch 800, loss[loss=0.1726, simple_loss=0.2431, pruned_loss=0.05108, over 23602.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.242, pruned_loss=0.0429, over 4621700.91 frames. ], batch size: 232, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:55:49,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:49,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-02 21:55:58,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:55:58,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:00,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:56:00,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:56:00,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:02,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:04,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:08,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:56:08,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:56:12,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-02 21:56:12,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:13,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:56:13,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:56:15,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:56:15,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-02 21:56:15,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:56:16,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-02 21:56:19,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:21,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:56:24,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:56:24,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:56:25,188 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.98 vs. limit=15.0 2023-10-02 21:56:27,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:27,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:31,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:56:31,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:56:31,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-02 21:56:32,831 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-02 21:56:32,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-02 21:56:32,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 21:56:32,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:56:33,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1032546.6666666666, ans=0.0 2023-10-02 21:56:35,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:56:35,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:56:40,117 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.899e+02 2.201e+02 2.718e+02 4.038e+02, threshold=4.403e+02, percent-clipped=2.0 2023-10-02 21:56:40,291 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-02 21:56:41,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-02 21:56:43,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-02 21:56:44,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 21:56:49,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 21:56:51,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:56:51,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-02 21:56:52,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-02 21:56:56,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-02 21:57:00,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:57:02,202 INFO [train.py:1046] (3/4) Epoch 30, batch 850, loss[loss=0.1509, simple_loss=0.2248, pruned_loss=0.03846, over 20184.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2429, pruned_loss=0.04335, over 4630025.02 frames. ], batch size: 44, lr: 3.43e-03, grad_scale: 32.0 2023-10-02 21:57:02,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:57:03,006 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.20 vs. limit=15.0 2023-10-02 21:57:03,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-02 21:57:04,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:57:05,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:57:05,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-02 21:57:06,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:57:07,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 21:57:08,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:09,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:57:11,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:57:12,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-02 21:57:12,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-02 21:57:12,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-02 21:57:15,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 21:57:15,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:57:19,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:19,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:57:19,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 21:57:24,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:57:24,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:57:24,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-02 21:57:25,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1032746.6666666666, ans=0.5 2023-10-02 21:57:26,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-02 21:57:28,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:57:29,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-02 21:57:33,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-02 21:57:33,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1032813.3333333334, ans=0.125 2023-10-02 21:57:35,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-02 21:57:35,257 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-02 21:57:35,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:57:37,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:57:37,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 21:57:38,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:39,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:39,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-02 21:57:43,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 21:57:43,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:57:46,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 21:57:46,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-02 21:57:47,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 21:57:49,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-02 21:57:49,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-02 21:57:53,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-02 21:57:53,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:57:54,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 21:57:55,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:57:55,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:57:56,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 21:57:59,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-02 21:58:00,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-02 21:58:02,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:58:02,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-02 21:58:05,919 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.73 vs. limit=12.0 2023-10-02 21:58:08,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1032946.6666666666, ans=0.125 2023-10-02 21:58:09,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-02 21:58:09,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:58:10,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-02 21:58:10,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:58:10,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:58:11,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1032946.6666666666, ans=0.1 2023-10-02 21:58:13,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-02 21:58:17,199 INFO [train.py:1046] (3/4) Epoch 30, batch 900, loss[loss=0.1793, simple_loss=0.2534, pruned_loss=0.05257, over 22740.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2437, pruned_loss=0.04403, over 4635802.40 frames. ], batch size: 322, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 21:58:22,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:58:24,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:58:24,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-02 21:58:27,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 21:58:27,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-02 21:58:28,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.69 vs. limit=5.0 2023-10-02 21:58:28,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-02 21:58:30,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-02 21:58:30,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:58:30,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 21:58:31,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-02 21:58:37,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1033080.0, ans=0.2 2023-10-02 21:58:41,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:58:41,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 21:58:41,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 21:58:45,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:58:45,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1033146.6666666666, ans=0.1 2023-10-02 21:58:48,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1033146.6666666666, ans=0.125 2023-10-02 21:58:49,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-02 21:58:54,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 21:58:58,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-02 21:58:58,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-02 21:58:58,479 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-02 21:58:59,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-02 21:59:05,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-02 21:59:05,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-02 21:59:06,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 21:59:11,419 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.769e+02 1.935e+02 2.161e+02 3.184e+02, threshold=3.870e+02, percent-clipped=0.0 2023-10-02 21:59:12,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:12,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:59:14,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-02 21:59:14,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 21:59:17,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-02 21:59:20,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-02 21:59:20,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:22,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-02 21:59:22,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:59:25,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-02 21:59:25,500 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-02 21:59:28,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-02 21:59:28,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-02 21:59:30,959 INFO [train.py:1046] (3/4) Epoch 30, batch 950, loss[loss=0.172, simple_loss=0.2605, pruned_loss=0.0418, over 24054.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2437, pruned_loss=0.04359, over 4660823.70 frames. ], batch size: 80, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 21:59:31,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:32,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1033346.6666666666, ans=0.125 2023-10-02 21:59:35,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-02 21:59:39,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:59:41,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:59:42,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:59:42,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 21:59:46,064 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-02 21:59:47,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 21:59:48,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 21:59:50,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-02 21:59:50,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 21:59:50,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-02 21:59:52,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-02 21:59:52,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1033413.3333333334, ans=0.2 2023-10-02 21:59:53,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:59:54,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-02 21:59:56,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:59:58,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-02 21:59:58,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-02 21:59:58,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 21:59:58,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-02 21:59:59,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 22:00:01,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:00:02,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:00:07,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:00:07,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:00:11,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-02 22:00:14,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 22:00:14,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:00:14,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:00:14,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1033546.6666666666, ans=0.0 2023-10-02 22:00:16,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:00:16,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:00:20,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1033546.6666666666, ans=0.125 2023-10-02 22:00:22,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-02 22:00:23,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:00:25,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1033546.6666666666, ans=0.125 2023-10-02 22:00:26,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:00:26,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:00:27,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-02 22:00:28,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:00:28,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:00:28,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-02 22:00:31,550 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.94 vs. limit=15.0 2023-10-02 22:00:32,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:00:33,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:00:35,561 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.53 vs. limit=15.0 2023-10-02 22:00:39,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:00:39,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-02 22:00:39,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1033613.3333333334, ans=0.0 2023-10-02 22:00:40,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-02 22:00:43,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:00:45,204 INFO [train.py:1046] (3/4) Epoch 30, batch 1000, loss[loss=0.1599, simple_loss=0.247, pruned_loss=0.03637, over 24504.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2427, pruned_loss=0.04331, over 4671614.87 frames. ], batch size: 66, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:00:48,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-02 22:00:48,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:00:49,068 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.09 vs. limit=22.5 2023-10-02 22:00:53,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:00:54,372 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.07 vs. limit=15.0 2023-10-02 22:00:55,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-02 22:00:55,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-02 22:00:59,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:00,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:01:01,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:01:02,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1033746.6666666666, ans=0.125 2023-10-02 22:01:03,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-02 22:01:06,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-02 22:01:07,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-02 22:01:07,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:01:09,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-02 22:01:10,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-02 22:01:11,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-02 22:01:12,401 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.98 vs. limit=12.0 2023-10-02 22:01:13,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:14,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:24,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:01:26,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:01:26,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:27,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:27,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-02 22:01:27,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:01:29,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:01:29,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1033880.0, ans=0.1 2023-10-02 22:01:30,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:01:31,778 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-02 22:01:33,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-02 22:01:33,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1033880.0, ans=0.1 2023-10-02 22:01:36,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-02 22:01:36,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-02 22:01:38,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:01:40,112 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.798e+02 1.994e+02 2.237e+02 2.934e+02, threshold=3.988e+02, percent-clipped=0.0 2023-10-02 22:01:43,824 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.45 vs. limit=15.0 2023-10-02 22:01:44,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:44,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:01:45,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:01:45,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:01:48,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-02 22:01:50,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:01:52,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-02 22:01:52,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-02 22:01:54,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:01:54,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:01:56,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:01:59,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:02:00,388 INFO [train.py:1046] (3/4) Epoch 30, batch 1050, loss[loss=0.151, simple_loss=0.2324, pruned_loss=0.03478, over 24545.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2417, pruned_loss=0.043, over 4673849.39 frames. ], batch size: 60, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:02:00,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:02:03,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:02:03,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:02:04,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 22:02:06,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:02:07,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:02:10,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:02:11,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:02:13,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:02:14,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:02:14,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:02:15,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:02:17,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-02 22:02:17,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:02:17,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-02 22:02:20,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:02:20,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-02 22:02:20,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:02:26,360 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.35 vs. limit=15.0 2023-10-02 22:02:28,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:02:28,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:02:28,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:02:31,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-02 22:02:31,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-02 22:02:32,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.78 vs. limit=15.0 2023-10-02 22:02:32,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:02:33,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-02 22:02:36,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-02 22:02:38,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:02:40,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 22:02:42,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:02:42,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:02:43,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:02:46,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:02:53,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-02 22:02:54,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-02 22:02:56,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-02 22:02:56,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:02:56,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:02:57,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-02 22:03:00,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:03:02,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:03:02,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:03:03,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:03:03,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:07,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:07,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-02 22:03:09,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:03:09,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-02 22:03:09,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-02 22:03:10,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:03:13,112 INFO [train.py:1046] (3/4) Epoch 30, batch 1100, loss[loss=0.1636, simple_loss=0.2382, pruned_loss=0.04454, over 23670.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2405, pruned_loss=0.0429, over 4662293.11 frames. ], batch size: 232, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:03:13,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:03:19,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:03:23,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:03:25,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:03:26,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:03:27,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-02 22:03:29,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:03:31,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-02 22:03:32,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:03:34,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1034413.3333333334, ans=0.2 2023-10-02 22:03:35,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:03:36,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-02 22:03:38,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:03:38,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:03:39,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:03:40,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:03:42,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1034480.0, ans=0.07 2023-10-02 22:03:43,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:03:47,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:03:51,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-02 22:03:52,432 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-02 22:03:52,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:55,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:03:57,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:03:57,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:03:57,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-02 22:03:59,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:04:00,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:04:00,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:04:00,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:04:00,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-02 22:04:06,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:04:07,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-02 22:04:08,728 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.832e+02 2.001e+02 2.230e+02 3.107e+02, threshold=4.001e+02, percent-clipped=0.0 2023-10-02 22:04:08,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:04:12,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:04:15,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-02 22:04:15,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:04:18,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:04:20,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:04:20,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:04:21,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-02 22:04:23,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:04:23,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:04:25,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-02 22:04:25,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:04:26,787 INFO [train.py:1046] (3/4) Epoch 30, batch 1150, loss[loss=0.1711, simple_loss=0.2432, pruned_loss=0.04952, over 22760.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2414, pruned_loss=0.04284, over 4689844.35 frames. ], batch size: 322, lr: 3.42e-03, grad_scale: 4.0 2023-10-02 22:04:26,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-02 22:04:28,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:04:28,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:04:29,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:04:34,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:04:35,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:04:37,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:04:37,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:04:38,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-02 22:04:39,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:04:40,441 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.67 vs. limit=10.0 2023-10-02 22:04:41,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-02 22:04:42,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:04:42,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:04:44,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1034746.6666666666, ans=0.0 2023-10-02 22:04:46,363 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.62 vs. limit=22.5 2023-10-02 22:04:48,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-02 22:04:49,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:04:51,804 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.36 vs. limit=22.5 2023-10-02 22:04:54,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:04:54,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:04:55,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-02 22:04:55,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:04:55,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:04:59,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-02 22:05:01,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:05:01,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1034813.3333333334, ans=0.125 2023-10-02 22:05:02,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:05:03,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.29 vs. limit=22.5 2023-10-02 22:05:04,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1034813.3333333334, ans=0.1 2023-10-02 22:05:12,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:05:16,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:05:17,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-02 22:05:17,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:19,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:23,564 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-02 22:05:25,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:33,463 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-02 22:05:36,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:05:37,225 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.23 vs. limit=12.0 2023-10-02 22:05:37,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:05:37,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:05:39,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:05:40,415 INFO [train.py:1046] (3/4) Epoch 30, batch 1200, loss[loss=0.1649, simple_loss=0.2411, pruned_loss=0.04428, over 23807.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2427, pruned_loss=0.04316, over 4694248.51 frames. ], batch size: 164, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:05:40,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:05:44,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:05:44,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:05:46,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:05:46,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:05:46,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:05:46,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1035013.3333333334, ans=0.125 2023-10-02 22:05:47,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:05:48,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:05:50,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:05:52,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:05:53,482 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-02 22:05:58,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-02 22:06:01,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:06:04,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:06:06,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:06:07,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:06:07,715 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-02 22:06:09,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:06:16,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:06:16,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:06:16,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-02 22:06:16,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1035146.6666666666, ans=0.0 2023-10-02 22:06:17,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:06:20,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-02 22:06:26,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-02 22:06:26,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:06:26,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:06:28,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:06:29,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:06:29,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:06:31,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:06:31,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:06:31,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-02 22:06:33,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:06:33,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:06:33,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:06:36,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:06:36,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:06:37,824 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.850e+02 2.052e+02 2.209e+02 3.387e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-02 22:06:40,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:06:42,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:06:44,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-02 22:06:47,679 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-02 22:06:50,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:06:52,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:06:52,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1035280.0, ans=0.0 2023-10-02 22:06:53,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:06:55,169 INFO [train.py:1046] (3/4) Epoch 30, batch 1250, loss[loss=0.1717, simple_loss=0.259, pruned_loss=0.04219, over 23738.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2443, pruned_loss=0.04373, over 4692822.54 frames. ], batch size: 85, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:06:55,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:06:56,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.60 vs. limit=12.0 2023-10-02 22:06:57,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-02 22:07:00,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:07:01,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:07:03,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-02 22:07:04,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:07:04,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:07:10,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:07:10,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:07:11,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:07:11,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:07:13,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1035413.3333333334, ans=0.05 2023-10-02 22:07:14,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:07:17,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1035413.3333333334, ans=0.125 2023-10-02 22:07:18,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:07:18,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:07:18,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:07:21,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:07:21,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:23,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:07:24,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:07:28,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1035480.0, ans=0.1 2023-10-02 22:07:30,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-02 22:07:31,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1035480.0, ans=0.125 2023-10-02 22:07:32,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:07:32,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:07:34,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-02 22:07:34,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:07:34,292 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-02 22:07:35,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:35,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:40,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:07:44,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:07:44,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:07:46,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-02 22:07:46,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-02 22:07:46,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-02 22:07:50,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:07:52,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-02 22:07:52,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:07:55,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-02 22:07:55,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:07:56,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-02 22:07:56,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:07:56,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:07:56,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:07:58,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:07:58,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1035613.3333333334, ans=0.1 2023-10-02 22:07:59,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-02 22:08:02,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:08:04,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:08:05,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:08:08,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:08:09,416 INFO [train.py:1046] (3/4) Epoch 30, batch 1300, loss[loss=0.1557, simple_loss=0.2384, pruned_loss=0.03651, over 24464.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2444, pruned_loss=0.04367, over 4714728.59 frames. ], batch size: 63, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:08:12,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:08:12,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-02 22:08:16,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:08:17,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:08:19,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:08:20,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:08:20,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:08:20,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1035680.0, ans=0.0 2023-10-02 22:08:21,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-02 22:08:25,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:08:26,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:08:28,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-02 22:08:32,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:08:34,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:08:36,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:08:38,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:08:38,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:08:39,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:08:40,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:08:41,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-02 22:08:45,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:08:45,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:08:47,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-02 22:08:47,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 22:08:50,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:08:53,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:08:53,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-02 22:08:53,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:08:55,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-02 22:08:56,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:09:00,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:09:00,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:09:04,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-02 22:09:05,531 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.946e+02 2.321e+02 2.829e+02 4.906e+02, threshold=4.642e+02, percent-clipped=3.0 2023-10-02 22:09:05,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-02 22:09:07,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-02 22:09:08,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1035946.6666666666, ans=0.0 2023-10-02 22:09:13,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:09:14,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-02 22:09:17,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:09:23,167 INFO [train.py:1046] (3/4) Epoch 30, batch 1350, loss[loss=0.1713, simple_loss=0.2424, pruned_loss=0.05006, over 23795.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2434, pruned_loss=0.04347, over 4709479.81 frames. ], batch size: 164, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:09:23,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-02 22:09:27,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:09:29,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:09:33,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:09:33,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:09:34,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:09:36,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:09:40,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:09:41,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-02 22:09:43,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:09:43,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:09:46,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-02 22:09:46,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:09:47,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:09:47,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-02 22:09:48,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-02 22:09:51,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-02 22:09:51,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:09:53,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-02 22:10:04,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:10:04,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1036146.6666666666, ans=0.125 2023-10-02 22:10:14,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:10:15,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:10:15,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-02 22:10:18,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:10:19,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-02 22:10:19,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:10:21,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:10:24,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:10:25,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-02 22:10:27,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:10:30,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-02 22:10:31,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-02 22:10:37,768 INFO [train.py:1046] (3/4) Epoch 30, batch 1400, loss[loss=0.1726, simple_loss=0.2455, pruned_loss=0.04989, over 23882.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2425, pruned_loss=0.04312, over 4704392.95 frames. ], batch size: 195, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:10:39,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-02 22:10:41,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:10:43,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:10:44,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:10:50,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-02 22:10:51,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-02 22:10:54,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1036413.3333333334, ans=0.125 2023-10-02 22:11:01,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:11:03,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:11:04,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:11:04,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:11:07,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:11:09,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-02 22:11:18,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:19,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:23,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.59 vs. limit=10.0 2023-10-02 22:11:23,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-02 22:11:25,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:11:26,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:11:26,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:11:26,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:11:28,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:11:28,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:11:29,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:11:31,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-02 22:11:32,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:11:33,815 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.837e+02 2.123e+02 2.555e+02 3.782e+02, threshold=4.246e+02, percent-clipped=0.0 2023-10-02 22:11:34,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1036546.6666666666, ans=0.0 2023-10-02 22:11:35,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:38,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:11:39,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1036613.3333333334, ans=10.0 2023-10-02 22:11:40,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1036613.3333333334, ans=0.125 2023-10-02 22:11:44,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-02 22:11:44,879 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:11:46,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 22:11:47,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:11:48,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1036613.3333333334, ans=0.1 2023-10-02 22:11:50,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 22:11:51,969 INFO [train.py:1046] (3/4) Epoch 30, batch 1450, loss[loss=0.1574, simple_loss=0.2437, pruned_loss=0.0355, over 24481.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2422, pruned_loss=0.04297, over 4708795.40 frames. ], batch size: 66, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:11:52,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:11:53,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:11:55,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:11:56,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:11:56,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:11:56,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-02 22:12:02,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:12:02,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:12:03,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:12:03,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-02 22:12:05,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:12:05,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-02 22:12:07,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:07,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:07,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-02 22:12:09,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:12:09,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:12:11,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 22:12:11,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:12,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:12:13,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:17,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:20,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:12:20,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:12:22,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:12:22,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:26,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:12:26,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:12:26,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:12:27,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:12:28,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-02 22:12:31,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:12:33,069 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-02 22:12:35,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:12:37,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:12:39,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:12:39,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-02 22:12:44,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:12:44,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-02 22:12:48,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-02 22:12:48,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:12:48,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1036880.0, ans=0.125 2023-10-02 22:12:52,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:12:53,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:12:54,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-02 22:12:55,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-02 22:12:57,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-02 22:12:58,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:12:59,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:13:00,389 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.92 vs. limit=15.0 2023-10-02 22:13:05,510 INFO [train.py:1046] (3/4) Epoch 30, batch 1500, loss[loss=0.1716, simple_loss=0.2416, pruned_loss=0.05074, over 23721.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2425, pruned_loss=0.04309, over 4703404.93 frames. ], batch size: 232, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:13:08,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-02 22:13:10,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:13:10,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:13:11,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:13:12,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:13:12,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:13:14,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-02 22:13:14,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:13:15,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:13:15,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:13:15,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:13:17,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:13:19,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:13:24,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:13:24,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-02 22:13:25,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:13:26,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:13:28,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:13:29,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-02 22:13:30,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1037080.0, ans=0.125 2023-10-02 22:13:32,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-02 22:13:35,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:13:35,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-02 22:13:38,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:13:41,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:13:41,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:13:41,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:13:42,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-02 22:13:43,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:13:43,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:13:44,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-02 22:13:45,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:13:51,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:13:51,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-02 22:13:57,324 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.34 vs. limit=15.0 2023-10-02 22:13:59,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:14:00,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:14:02,133 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.861e+02 2.140e+02 2.435e+02 4.119e+02, threshold=4.280e+02, percent-clipped=0.0 2023-10-02 22:14:04,979 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-02 22:14:05,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:05,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-02 22:14:06,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:14:07,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:14:09,036 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-02 22:14:09,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:14:09,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1037280.0, ans=0.1 2023-10-02 22:14:10,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1037280.0, ans=0.04949747468305833 2023-10-02 22:14:12,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-02 22:14:15,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:18,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:14:18,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:18,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:14:18,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:14:19,524 INFO [train.py:1046] (3/4) Epoch 30, batch 1550, loss[loss=0.1604, simple_loss=0.2442, pruned_loss=0.03829, over 24320.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2428, pruned_loss=0.04302, over 4711165.54 frames. ], batch size: 61, lr: 3.42e-03, grad_scale: 8.0 2023-10-02 22:14:19,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:14:19,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-02 22:14:21,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-02 22:14:21,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:14:22,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-02 22:14:24,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-02 22:14:26,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:14:27,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:14:27,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:14:27,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:14:27,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:14:29,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:14:31,823 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-02 22:14:31,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:14:31,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:14:33,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:14:35,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:14:35,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-02 22:14:37,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:14:37,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1037413.3333333334, ans=0.0 2023-10-02 22:14:38,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-02 22:14:38,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-02 22:14:38,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-02 22:14:38,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:14:38,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1037413.3333333334, ans=0.125 2023-10-02 22:14:40,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:14:40,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1037413.3333333334, ans=0.2 2023-10-02 22:14:44,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:14:46,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-02 22:14:46,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-02 22:14:52,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1037480.0, ans=0.125 2023-10-02 22:14:55,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:15:00,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:15:00,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:15:00,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:15:01,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-02 22:15:03,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1037546.6666666666, ans=0.125 2023-10-02 22:15:04,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:15:05,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:15:07,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:15:07,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1037546.6666666666, ans=0.1 2023-10-02 22:15:08,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:15:10,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:15:10,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-02 22:15:10,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:15:13,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:15:13,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:15:14,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-02 22:15:16,040 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-02 22:15:16,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.90 vs. limit=15.0 2023-10-02 22:15:18,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:15:22,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-02 22:15:28,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:15:30,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:15:30,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-02 22:15:32,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:15:32,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:15:32,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:15:32,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:15:34,336 INFO [train.py:1046] (3/4) Epoch 30, batch 1600, loss[loss=0.1617, simple_loss=0.2478, pruned_loss=0.03777, over 24618.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2441, pruned_loss=0.04344, over 4699030.83 frames. ], batch size: 68, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:15:34,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:15:37,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:15:37,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-02 22:15:39,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-02 22:15:41,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-02 22:15:44,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:15:45,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-02 22:15:47,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:15:47,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1037746.6666666666, ans=0.0 2023-10-02 22:15:49,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:15:54,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:15:57,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-02 22:16:00,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:16:00,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-02 22:16:00,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1037746.6666666666, ans=0.2 2023-10-02 22:16:02,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:02,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-02 22:16:07,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-02 22:16:14,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:16:16,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-02 22:16:17,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:16:17,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:16:17,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:16:19,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-02 22:16:24,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 22:16:25,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:16:27,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:27,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:27,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:16:28,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:16:30,397 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.857e+02 2.011e+02 2.284e+02 3.695e+02, threshold=4.022e+02, percent-clipped=0.0 2023-10-02 22:16:31,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:16:33,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:16:38,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:39,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:16:40,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-02 22:16:40,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:16:42,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-02 22:16:42,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1037946.6666666666, ans=0.5 2023-10-02 22:16:45,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1037946.6666666666, ans=0.2 2023-10-02 22:16:46,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:16:48,078 INFO [train.py:1046] (3/4) Epoch 30, batch 1650, loss[loss=0.1557, simple_loss=0.2201, pruned_loss=0.04567, over 23406.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.244, pruned_loss=0.04364, over 4701412.13 frames. ], batch size: 285, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:16:49,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:16:49,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:16:49,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-02 22:16:50,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-02 22:16:50,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-02 22:16:50,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-02 22:16:56,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:16:56,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:16:58,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:16:58,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:16:59,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:17:01,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-02 22:17:03,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:17:04,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:17:04,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:17:04,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:17:04,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-02 22:17:06,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-02 22:17:12,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:17:12,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1038080.0, ans=0.125 2023-10-02 22:17:14,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:17:20,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1038146.6666666666, ans=0.0 2023-10-02 22:17:22,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1038146.6666666666, ans=0.125 2023-10-02 22:17:23,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-02 22:17:23,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:25,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-02 22:17:28,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:17:30,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=1038146.6666666666, ans=0.5 2023-10-02 22:17:31,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:17:31,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:17:32,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:17:32,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:17:32,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:35,014 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=15.0 2023-10-02 22:17:36,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:17:36,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:37,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:17:38,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:17:38,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:17:40,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:17:44,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:17:44,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-02 22:17:45,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:17:46,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-02 22:17:47,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-02 22:17:47,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-02 22:17:47,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:17:48,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:17:49,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:17:50,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:17:50,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-02 22:17:53,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:17:56,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:17:56,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:17:59,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-02 22:18:01,894 INFO [train.py:1046] (3/4) Epoch 30, batch 1700, loss[loss=0.1446, simple_loss=0.2086, pruned_loss=0.0403, over 23354.00 frames. ], tot_loss[loss=0.1652, simple_loss=0.2437, pruned_loss=0.04336, over 4709064.47 frames. ], batch size: 285, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:18:03,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:18:03,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:18:03,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-02 22:18:05,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:18:05,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:18:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:18:07,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:18:07,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:18:07,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-02 22:18:11,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:18:19,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:18:22,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:18:27,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:18:27,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:18:28,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:18:29,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:18:31,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-02 22:18:32,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:18:32,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:34,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:18:34,610 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:18:35,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:18:39,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-02 22:18:39,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-02 22:18:39,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1038480.0, ans=0.0 2023-10-02 22:18:41,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:43,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-02 22:18:43,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:18:50,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:18:51,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:18:53,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:18:54,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:18:54,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-02 22:18:54,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:18:57,825 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.894e+02 2.111e+02 2.412e+02 3.601e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-02 22:18:57,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:57,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-02 22:18:58,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:18:58,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:18:59,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:18:59,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:19:00,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:19:00,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:19:02,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:02,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:19:02,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:06,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:19:08,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-02 22:19:10,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:13,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:19:14,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-02 22:19:15,698 INFO [train.py:1046] (3/4) Epoch 30, batch 1750, loss[loss=0.163, simple_loss=0.2577, pruned_loss=0.03413, over 24452.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.243, pruned_loss=0.04267, over 4722859.95 frames. ], batch size: 69, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:19:21,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:23,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:19:23,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:19:24,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-02 22:19:24,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:19:26,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:19:27,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:31,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-02 22:19:34,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:19:36,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-02 22:19:36,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:19:37,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:19:40,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 22:19:40,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-02 22:19:41,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1038746.6666666666, ans=0.125 2023-10-02 22:19:43,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:19:43,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-02 22:19:48,418 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.09 vs. limit=15.0 2023-10-02 22:19:51,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:19:52,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:19:52,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:55,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:19:55,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:19:58,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:19:58,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:20:01,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:20:01,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:20:01,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-02 22:20:02,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:20:06,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-02 22:20:06,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:20:08,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:20:08,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:20:13,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:20:13,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-02 22:20:15,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:20:16,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:20:19,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:20:22,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:20:24,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:20:25,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-02 22:20:25,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:20:27,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:20:27,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:20:27,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:20:27,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:20:28,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:20:30,048 INFO [train.py:1046] (3/4) Epoch 30, batch 1800, loss[loss=0.1587, simple_loss=0.2329, pruned_loss=0.04229, over 23269.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2407, pruned_loss=0.04296, over 4687458.23 frames. ], batch size: 119, lr: 3.42e-03, grad_scale: 16.0 2023-10-02 22:20:31,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:20:31,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:20:32,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:20:37,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:20:42,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:20:43,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:20:44,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:20:48,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:20:48,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:20:49,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:20:49,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1039080.0, ans=0.125 2023-10-02 22:20:51,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:20:52,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-02 22:20:53,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:20:56,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:02,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-02 22:21:03,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-02 22:21:03,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-02 22:21:03,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:21:06,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:21:06,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:21:06,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:21:12,812 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-02 22:21:12,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1039213.3333333334, ans=0.125 2023-10-02 22:21:14,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:21:16,648 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=9.13 vs. limit=12.0 2023-10-02 22:21:17,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:19,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-02 22:21:19,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-02 22:21:20,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:21:22,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:21:22,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:21:22,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1039213.3333333334, ans=0.1 2023-10-02 22:21:26,132 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.833e+02 1.992e+02 2.212e+02 2.839e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-02 22:21:26,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-02 22:21:31,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:21:33,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-02 22:21:33,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:21:33,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:21:34,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:21:35,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-02 22:21:38,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:21:38,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:21:40,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-02 22:21:40,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:21:43,975 INFO [train.py:1046] (3/4) Epoch 30, batch 1850, loss[loss=0.1423, simple_loss=0.2243, pruned_loss=0.03015, over 24574.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2416, pruned_loss=0.04307, over 4695583.15 frames. ], batch size: 60, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:21:44,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:21:44,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:21:44,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:46,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:21:46,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:21:46,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1039346.6666666666, ans=0.2 2023-10-02 22:21:49,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:21:49,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:21:49,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1039346.6666666666, ans=0.125 2023-10-02 22:21:52,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:21:52,811 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.18 vs. limit=15.0 2023-10-02 22:21:53,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:22:00,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:22:00,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-02 22:22:02,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-02 22:22:05,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-02 22:22:07,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:22:07,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-02 22:22:07,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 22:22:09,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1039413.3333333334, ans=0.2 2023-10-02 22:22:19,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:22:21,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-02 22:22:24,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:22:25,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:22:28,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-02 22:22:30,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:30,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:22:31,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:22:34,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:22:37,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:22:39,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1039546.6666666666, ans=0.0 2023-10-02 22:22:40,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:22:40,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:40,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1039546.6666666666, ans=0.125 2023-10-02 22:22:41,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 22:22:41,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:22:43,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:22:44,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:22:44,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1039613.3333333334, ans=0.1 2023-10-02 22:22:45,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-02 22:22:46,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:22:51,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:22:52,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:22:52,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-02 22:22:52,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-02 22:22:53,676 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-02 22:22:55,587 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-02 22:22:55,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:22:56,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:22:56,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:22:56,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:58,253 INFO [train.py:1046] (3/4) Epoch 30, batch 1900, loss[loss=0.158, simple_loss=0.2455, pruned_loss=0.03531, over 24652.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2421, pruned_loss=0.04309, over 4698209.73 frames. ], batch size: 68, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:22:58,305 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-02 22:22:58,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:22:58,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:22:59,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:23:01,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:23:01,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:23:01,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1039680.0, ans=0.125 2023-10-02 22:23:02,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-02 22:23:06,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:23:06,026 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-02 22:23:06,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:23:07,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:23:10,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:23:12,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:23:12,999 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-02 22:23:14,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-02 22:23:15,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:23:17,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:23:17,141 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-02 22:23:17,179 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-02 22:23:21,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-02 22:23:22,787 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.02 vs. limit=15.0 2023-10-02 22:23:23,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1039746.6666666666, ans=0.0 2023-10-02 22:23:24,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:23:28,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-02 22:23:30,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-02 22:23:32,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1039813.3333333334, ans=0.125 2023-10-02 22:23:39,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-02 22:23:41,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-02 22:23:41,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:23:41,207 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-02 22:23:41,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-02 22:23:42,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-02 22:23:42,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-02 22:23:42,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:23:45,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-02 22:23:50,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:23:51,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:23:51,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-02 22:23:54,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:23:56,207 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.917e+02 2.185e+02 2.667e+02 3.803e+02, threshold=4.369e+02, percent-clipped=0.0 2023-10-02 22:23:57,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-02 22:23:57,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:24:01,893 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.66 vs. limit=10.0 2023-10-02 22:24:05,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:24:05,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:24:05,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:24:07,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:24:07,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:24:08,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:24:08,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:24:14,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:24:14,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:24:15,630 INFO [train.py:1046] (3/4) Epoch 30, batch 1950, loss[loss=0.1514, simple_loss=0.2352, pruned_loss=0.0338, over 24519.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2435, pruned_loss=0.04386, over 4697356.58 frames. ], batch size: 66, lr: 3.41e-03, grad_scale: 8.0 2023-10-02 22:24:15,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:24:15,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:24:17,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:24:19,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:24:21,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:24:24,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:24:24,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:24,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:24:27,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-02 22:24:28,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 22:24:28,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:29,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:30,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1040080.0, ans=0.2 2023-10-02 22:24:32,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:24:32,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:24:32,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:35,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:24:37,942 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.34 vs. limit=15.0 2023-10-02 22:24:38,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:24:38,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:24:39,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1040080.0, ans=0.125 2023-10-02 22:24:40,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:24:40,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:40,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1040080.0, ans=0.125 2023-10-02 22:24:42,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:44,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:24:44,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:24:44,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:24:44,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-02 22:24:45,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:24:45,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:24:47,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:24:49,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:24:51,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:24:54,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:24:57,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:24:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:24:59,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-02 22:24:59,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:25:02,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:25:02,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1040213.3333333334, ans=0.125 2023-10-02 22:25:02,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1040213.3333333334, ans=0.0 2023-10-02 22:25:04,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:25:05,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:25:08,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1040213.3333333334, ans=0.0 2023-10-02 22:25:13,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:13,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:14,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1040280.0, ans=0.125 2023-10-02 22:25:16,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:18,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:25:20,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:25:22,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:25:22,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-02 22:25:22,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:25:23,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:25:25,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-02 22:25:26,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:25:29,860 INFO [train.py:1046] (3/4) Epoch 30, batch 2000, loss[loss=0.1615, simple_loss=0.2282, pruned_loss=0.04744, over 23753.00 frames. ], tot_loss[loss=0.1659, simple_loss=0.2443, pruned_loss=0.04378, over 4713347.99 frames. ], batch size: 212, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:25:30,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:25:31,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:25:31,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:25:33,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:25:36,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:25:36,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1040346.6666666666, ans=0.0 2023-10-02 22:25:37,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1040346.6666666666, ans=0.125 2023-10-02 22:25:40,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-02 22:25:40,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:25:43,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:25:43,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1040413.3333333334, ans=0.0 2023-10-02 22:25:44,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-02 22:25:46,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:25:47,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:25:49,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:25:51,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-02 22:25:52,256 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.99 vs. limit=15.0 2023-10-02 22:25:53,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:25:54,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:25:55,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:25:55,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-02 22:25:57,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:26:00,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-02 22:26:00,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:26:02,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:26:02,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1040480.0, ans=0.125 2023-10-02 22:26:05,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:26:05,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:05,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:26:06,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:26:08,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-02 22:26:09,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-02 22:26:11,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:26:11,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:15,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:16,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:26:16,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:26:18,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:26:18,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:26:19,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:19,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:26:19,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:21,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1040546.6666666666, ans=0.0 2023-10-02 22:26:22,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:25,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:26:26,527 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 2.011e+02 2.210e+02 2.489e+02 4.189e+02, threshold=4.420e+02, percent-clipped=0.0 2023-10-02 22:26:26,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-02 22:26:31,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:26:32,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:34,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:35,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:26:39,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:41,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:26:41,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:41,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:26:41,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:26:43,991 INFO [train.py:1046] (3/4) Epoch 30, batch 2050, loss[loss=0.179, simple_loss=0.2601, pruned_loss=0.04901, over 23418.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2427, pruned_loss=0.04332, over 4709679.97 frames. ], batch size: 105, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:26:44,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:26:45,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:48,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:26:49,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:52,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:26:53,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:26:53,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:26:55,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:26:58,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-02 22:26:58,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:26:59,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:26:59,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1040746.6666666666, ans=0.125 2023-10-02 22:27:00,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:27:01,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1040746.6666666666, ans=0.0 2023-10-02 22:27:10,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:27:11,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:27:13,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-02 22:27:15,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:27:15,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-02 22:27:15,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:27:18,086 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.89 vs. limit=15.0 2023-10-02 22:27:18,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:27:20,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:27:21,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:27:23,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:27:24,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:27:25,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:27:25,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:27:29,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:27:32,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:27:33,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:27:34,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:27:38,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:27:41,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1040946.6666666666, ans=0.07 2023-10-02 22:27:42,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:27:44,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-02 22:27:48,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:27:49,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:27:52,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:27:53,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-02 22:27:56,626 INFO [train.py:1046] (3/4) Epoch 30, batch 2100, loss[loss=0.1638, simple_loss=0.2471, pruned_loss=0.04025, over 24645.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2419, pruned_loss=0.04322, over 4712479.30 frames. ], batch size: 73, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:27:56,708 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-02 22:27:56,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:27:56,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:27:58,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:28:00,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:28:00,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-02 22:28:00,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-02 22:28:01,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:28:04,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:28:06,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:28:09,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:28:10,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:28:10,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-02 22:28:12,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:28:12,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-02 22:28:12,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-02 22:28:13,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:28:13,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:28:13,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-02 22:28:13,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 22:28:19,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-02 22:28:19,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:28:22,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:28:23,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:28:26,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:28:26,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-02 22:28:28,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:28:28,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-02 22:28:29,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-02 22:28:29,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:28:29,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-02 22:28:31,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-02 22:28:31,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-02 22:28:34,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:28:35,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:28:37,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1041146.6666666666, ans=0.0 2023-10-02 22:28:38,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:28:40,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:28:41,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:28:42,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:28:42,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-02 22:28:43,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:28:43,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:28:43,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:28:43,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-02 22:28:44,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-02 22:28:45,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1041213.3333333334, ans=0.0 2023-10-02 22:28:46,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-02 22:28:50,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:28:52,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:28:52,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-02 22:28:53,985 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.874e+02 2.097e+02 2.519e+02 4.862e+02, threshold=4.194e+02, percent-clipped=1.0 2023-10-02 22:28:57,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:29:00,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1041280.0, ans=0.1 2023-10-02 22:29:01,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:29:01,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:29:01,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:29:01,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-02 22:29:03,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:29:04,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:29:06,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:29:06,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:29:06,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:08,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-02 22:29:11,605 INFO [train.py:1046] (3/4) Epoch 30, batch 2150, loss[loss=0.1869, simple_loss=0.2713, pruned_loss=0.05131, over 23287.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2418, pruned_loss=0.04337, over 4718922.40 frames. ], batch size: 93, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:29:11,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-02 22:29:11,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:29:14,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:29:14,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:29:15,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:29:15,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:29:21,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-02 22:29:22,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:29:23,227 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.01 vs. limit=15.0 2023-10-02 22:29:24,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:26,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:29:26,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:26,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:29:28,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:29,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:29:29,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:29:32,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:33,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-02 22:29:38,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:29:39,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:29:40,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:42,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:29:42,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:29:43,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:29:44,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:29:44,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:29:45,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:29:46,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-02 22:29:48,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:29:48,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:48,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1041480.0, ans=0.125 2023-10-02 22:29:49,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:29:49,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:29:51,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:29:51,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1041480.0, ans=0.125 2023-10-02 22:29:53,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:29:53,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:29:56,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:29:56,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-02 22:29:56,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-02 22:29:59,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:29:59,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:00,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:30:00,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:30:02,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:02,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:02,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-02 22:30:05,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-02 22:30:05,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:30:05,210 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-02 22:30:05,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:05,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:30:07,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-02 22:30:07,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:30:07,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-02 22:30:07,751 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-02 22:30:07,751 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-02 22:30:07,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1041546.6666666666, ans=0.125 2023-10-02 22:30:09,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-02 22:30:10,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:10,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:30:10,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:30:12,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:13,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:30:14,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:14,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:22,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1041613.3333333334, ans=0.07 2023-10-02 22:30:23,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:30:23,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-02 22:30:24,671 INFO [train.py:1046] (3/4) Epoch 30, batch 2200, loss[loss=0.1481, simple_loss=0.2317, pruned_loss=0.03229, over 24657.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2418, pruned_loss=0.04326, over 4718615.78 frames. ], batch size: 65, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:30:27,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:30:30,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:31,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:30:31,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:30:34,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:30:38,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:30:38,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:30:38,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-02 22:30:42,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-02 22:30:44,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:30:46,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1041746.6666666666, ans=0.125 2023-10-02 22:30:49,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-02 22:30:52,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:30:54,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:30:54,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:30:57,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:30:57,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-02 22:30:59,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:31:01,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:31:03,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-02 22:31:05,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:31:07,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:31:10,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:31:12,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:31:14,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-02 22:31:15,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:31:15,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-02 22:31:17,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1041880.0, ans=0.015 2023-10-02 22:31:18,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:31:19,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:31:19,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:31:21,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:31:22,476 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.843e+02 2.023e+02 2.325e+02 3.252e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-02 22:31:22,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:31:22,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:31:22,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:31:24,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:31:25,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:31:28,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:31:30,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 22:31:30,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:31:35,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:31:35,496 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-02 22:31:38,059 INFO [train.py:1046] (3/4) Epoch 30, batch 2250, loss[loss=0.1633, simple_loss=0.2496, pruned_loss=0.03846, over 24483.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2418, pruned_loss=0.04259, over 4733735.54 frames. ], batch size: 66, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:31:38,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:31:38,178 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-02 22:31:39,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:31:40,901 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-02 22:31:42,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:31:42,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:31:44,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:31:46,174 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-02 22:31:47,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff3.min_abs, batch_count=1042013.3333333334, ans=0.2 2023-10-02 22:31:48,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:31:50,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:31:53,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1042080.0, ans=0.0 2023-10-02 22:31:55,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:31:56,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:31:59,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:31:59,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:31:59,861 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.86 vs. limit=15.0 2023-10-02 22:32:00,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:32:02,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-02 22:32:02,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:32:02,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:32:05,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-02 22:32:06,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:32:06,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:32:07,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:32:12,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:32:15,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:32:15,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:32:16,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-02 22:32:18,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:32:20,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:32:24,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:32:25,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:32:26,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:32:27,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:32:29,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:32:31,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:32:32,797 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:32:35,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:32:37,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-02 22:32:41,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:32:41,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:32:42,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:32:47,145 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.62 vs. limit=15.0 2023-10-02 22:32:47,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:32:49,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:32:49,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-02 22:32:50,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:32:50,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:32:51,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1042346.6666666666, ans=0.1 2023-10-02 22:32:52,154 INFO [train.py:1046] (3/4) Epoch 30, batch 2300, loss[loss=0.1621, simple_loss=0.2529, pruned_loss=0.03568, over 24550.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2429, pruned_loss=0.04328, over 4726616.24 frames. ], batch size: 71, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:32:53,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-02 22:32:55,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:32:56,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:32:59,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=10.55 vs. limit=15.0 2023-10-02 22:33:01,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:33:02,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:33:03,421 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-02 22:33:06,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:33:13,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:33:13,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:33:14,039 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.34 vs. limit=15.0 2023-10-02 22:33:15,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:33:15,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:33:15,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-02 22:33:16,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:33:19,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:33:19,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:33:24,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:33:27,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:33:28,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:33:32,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:33:32,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:33:32,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1042480.0, ans=0.125 2023-10-02 22:33:35,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:33:38,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:33:42,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:33:42,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:33:42,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:33:44,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-02 22:33:49,142 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.877e+02 2.187e+02 2.420e+02 3.762e+02, threshold=4.375e+02, percent-clipped=0.0 2023-10-02 22:33:49,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:33:49,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:33:49,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:33:49,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:33:49,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:33:50,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 22:33:50,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:33:50,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-02 22:33:50,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:33:52,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:33:52,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1042613.3333333334, ans=0.0 2023-10-02 22:33:53,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-02 22:33:58,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:34:02,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:34:04,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1042680.0, ans=0.125 2023-10-02 22:34:05,043 INFO [train.py:1046] (3/4) Epoch 30, batch 2350, loss[loss=0.1371, simple_loss=0.2147, pruned_loss=0.02975, over 24291.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2436, pruned_loss=0.04361, over 4734375.91 frames. ], batch size: 56, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:34:05,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:34:05,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:34:06,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:34:06,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:34:06,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:34:07,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:34:08,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-02 22:34:14,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1042680.0, ans=0.0 2023-10-02 22:34:16,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:34:16,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-02 22:34:18,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1042680.0, ans=0.125 2023-10-02 22:34:19,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-02 22:34:24,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:34:24,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1042746.6666666666, ans=15.0 2023-10-02 22:34:25,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:34:25,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:34:25,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:34:26,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:34:27,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1042746.6666666666, ans=0.1 2023-10-02 22:34:28,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-02 22:34:31,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:34:35,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-02 22:34:36,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:34:41,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:34:41,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:34:42,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:34:44,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-02 22:34:44,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:34:48,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:34:48,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:34:48,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:34:52,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:34:53,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-02 22:34:55,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:34:56,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:34:56,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:34:58,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-02 22:34:59,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:34:59,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1042880.0, ans=0.0 2023-10-02 22:35:02,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-02 22:35:02,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:35:06,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-02 22:35:11,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-02 22:35:12,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:35:12,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-02 22:35:12,702 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-02 22:35:12,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-02 22:35:15,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-02 22:35:17,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:35:17,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1042946.6666666666, ans=0.07 2023-10-02 22:35:19,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1043013.3333333334, ans=0.0 2023-10-02 22:35:20,529 INFO [train.py:1046] (3/4) Epoch 30, batch 2400, loss[loss=0.1649, simple_loss=0.2206, pruned_loss=0.05456, over 19749.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2436, pruned_loss=0.04346, over 4729078.60 frames. ], batch size: 388, lr: 3.41e-03, grad_scale: 32.0 2023-10-02 22:35:23,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:35:26,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:35:27,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:35:27,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-02 22:35:29,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-02 22:35:32,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1043013.3333333334, ans=0.0 2023-10-02 22:35:36,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 22:35:36,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:35:36,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1043080.0, ans=0.0 2023-10-02 22:35:38,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-02 22:35:38,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:35:39,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:35:39,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-02 22:35:41,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1043080.0, ans=0.125 2023-10-02 22:35:45,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:35:46,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-02 22:35:51,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:35:53,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1043146.6666666666, ans=0.1 2023-10-02 22:35:54,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-02 22:35:57,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:35:59,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:36:03,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:36:03,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-02 22:36:04,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:36:12,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:36:14,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:36:16,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:36:18,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:36:18,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-02 22:36:18,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:36:18,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:36:19,278 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.363e+02 1.815e+02 2.119e+02 2.399e+02 3.814e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-02 22:36:19,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:36:19,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:36:24,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1043280.0, ans=0.125 2023-10-02 22:36:25,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:36:25,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:36:25,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-02 22:36:26,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-02 22:36:29,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:36:29,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:36:29,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-02 22:36:29,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-02 22:36:31,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-02 22:36:31,508 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-02 22:36:31,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-02 22:36:32,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:36:33,214 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:36:34,345 INFO [train.py:1046] (3/4) Epoch 30, batch 2450, loss[loss=0.1635, simple_loss=0.2518, pruned_loss=0.03759, over 24294.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2425, pruned_loss=0.04291, over 4712641.56 frames. ], batch size: 74, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:36:34,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:36:34,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:36:35,864 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-02 22:36:37,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:36:37,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1043346.6666666666, ans=0.125 2023-10-02 22:36:38,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:36:41,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:36:41,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:36:45,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:36:45,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:36:46,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-02 22:36:52,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:36:52,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:36:55,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:36:55,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:36:55,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:36:55,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-02 22:36:59,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:37:01,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:37:02,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:37:04,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1043480.0, ans=0.125 2023-10-02 22:37:05,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:37:05,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:37:07,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:37:07,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:37:10,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-02 22:37:10,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:37:13,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1043480.0, ans=0.1 2023-10-02 22:37:17,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:37:19,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:37:20,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:37:20,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:37:20,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:37:21,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1043546.6666666666, ans=0.125 2023-10-02 22:37:22,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:37:22,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-02 22:37:25,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:37:26,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:37:29,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:37:29,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:37:34,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:37:34,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-02 22:37:36,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:37:36,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:37:36,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-02 22:37:37,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:37:37,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:37:37,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1043613.3333333334, ans=0.1 2023-10-02 22:37:41,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:37:43,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:37:44,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:37:46,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.14 vs. limit=15.0 2023-10-02 22:37:46,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1043613.3333333334, ans=0.015 2023-10-02 22:37:47,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-02 22:37:47,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:37:49,179 INFO [train.py:1046] (3/4) Epoch 30, batch 2500, loss[loss=0.1468, simple_loss=0.1944, pruned_loss=0.04962, over 19383.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2415, pruned_loss=0.0428, over 4703741.20 frames. ], batch size: 388, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:37:49,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1043680.0, ans=0.125 2023-10-02 22:37:49,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1043680.0, ans=0.125 2023-10-02 22:37:55,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:38:03,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:38:05,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:38:05,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:38:05,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-02 22:38:12,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:38:12,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:38:13,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-02 22:38:14,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 22:38:15,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-02 22:38:15,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:16,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:38:16,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-02 22:38:18,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:18,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-02 22:38:18,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:38:23,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1043813.3333333334, ans=0.125 2023-10-02 22:38:24,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:38:25,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:38:28,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:38:28,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-02 22:38:28,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:38:30,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:34,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:38:36,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1043880.0, ans=0.125 2023-10-02 22:38:38,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:38:40,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1043880.0, ans=0.0 2023-10-02 22:38:41,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:38:46,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:38:48,328 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.895e+02 2.033e+02 2.318e+02 3.238e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-02 22:38:49,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-02 22:38:49,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:38:49,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:38:51,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:38:51,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:38:53,819 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-02 22:38:53,820 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-02 22:38:53,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-02 22:38:57,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:38:58,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-02 22:38:58,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-02 22:39:00,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:39:00,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-02 22:39:00,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1043946.6666666666, ans=0.125 2023-10-02 22:39:02,813 INFO [train.py:1046] (3/4) Epoch 30, batch 2550, loss[loss=0.1629, simple_loss=0.2495, pruned_loss=0.03813, over 24487.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2413, pruned_loss=0.04243, over 4719660.19 frames. ], batch size: 66, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:39:04,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-02 22:39:06,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:39:07,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:39:09,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:39:11,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:39:11,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-02 22:39:13,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:39:18,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-02 22:39:18,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:39:21,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:39:23,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:39:23,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 22:39:25,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:39:25,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:39:25,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:39:26,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:39:26,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-02 22:39:28,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-02 22:39:28,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:39:28,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-02 22:39:35,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1044146.6666666666, ans=0.125 2023-10-02 22:39:40,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:39:43,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:39:44,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:39:44,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:39:46,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:39:52,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:39:56,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 22:39:56,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:39:56,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:39:56,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:39:57,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:40:02,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:40:02,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:40:05,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:40:05,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-02 22:40:05,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:40:05,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:40:07,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:40:08,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:40:11,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:13,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1044280.0, ans=0.05 2023-10-02 22:40:13,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1044280.0, ans=0.0 2023-10-02 22:40:16,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:40:17,449 INFO [train.py:1046] (3/4) Epoch 30, batch 2600, loss[loss=0.17, simple_loss=0.2417, pruned_loss=0.04909, over 23725.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2421, pruned_loss=0.04291, over 4715661.20 frames. ], batch size: 212, lr: 3.41e-03, grad_scale: 16.0 2023-10-02 22:40:19,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:22,081 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-02 22:40:23,477 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-02 22:40:23,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:40:23,531 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-02 22:40:24,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-02 22:40:24,895 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-02 22:40:27,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:40:29,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-02 22:40:29,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-02 22:40:31,115 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-02 22:40:32,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:40:33,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-02 22:40:36,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-02 22:40:37,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:40:38,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1044413.3333333334, ans=0.125 2023-10-02 22:40:39,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-02 22:40:39,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1044413.3333333334, ans=0.125 2023-10-02 22:40:41,284 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-02 22:40:41,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-02 22:40:46,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1044480.0, ans=0.125 2023-10-02 22:40:47,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:40:47,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:47,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:40:47,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-02 22:40:49,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:40:50,281 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.38 vs. limit=22.5 2023-10-02 22:40:51,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1044480.0, ans=0.1 2023-10-02 22:40:52,259 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-02 22:40:59,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:40:59,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:00,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-02 22:41:00,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:41:00,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:41:02,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-02 22:41:04,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:41:04,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:41:07,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:41:12,268 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-02 22:41:12,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:41:12,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:41:16,858 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.854e+02 2.030e+02 2.260e+02 3.001e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-02 22:41:17,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:41:20,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:41:20,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-02 22:41:20,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:41:23,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:41:23,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:41:24,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1044613.3333333334, ans=0.125 2023-10-02 22:41:28,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-02 22:41:28,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1044613.3333333334, ans=0.025 2023-10-02 22:41:29,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:31,752 INFO [train.py:1046] (3/4) Epoch 30, batch 2650, loss[loss=0.1626, simple_loss=0.2337, pruned_loss=0.04573, over 23890.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2425, pruned_loss=0.0429, over 4711605.45 frames. ], batch size: 195, lr: 3.41e-03, grad_scale: 8.0 2023-10-02 22:41:31,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:41:36,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-02 22:41:36,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:37,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:41:37,485 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-02 22:41:37,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:41:37,738 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:41:40,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:41:42,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 22:41:42,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:41:44,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:41:46,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-02 22:41:46,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:41:47,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:41:49,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-02 22:41:51,671 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-02 22:41:54,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:41:56,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1044746.6666666666, ans=0.1 2023-10-02 22:41:57,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-02 22:41:57,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:41:57,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-02 22:42:00,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:01,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:42:01,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:01,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:07,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-02 22:42:07,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-02 22:42:07,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1044813.3333333334, ans=0.0 2023-10-02 22:42:11,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:42:16,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-02 22:42:16,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:17,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:17,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:42:18,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:42:18,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:42:20,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:42:22,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:42:23,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:42:23,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:42:25,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:42:25,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1044880.0, ans=0.1 2023-10-02 22:42:26,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:28,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:42:29,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:29,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:42:29,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:42:31,824 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.21 vs. limit=12.0 2023-10-02 22:42:33,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:33,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:42:33,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:35,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-02 22:42:38,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:42:40,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:42,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:42:43,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:44,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-02 22:42:44,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:46,080 INFO [train.py:1046] (3/4) Epoch 30, batch 2700, loss[loss=0.1518, simple_loss=0.2364, pruned_loss=0.03356, over 24451.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.243, pruned_loss=0.04294, over 4705679.68 frames. ], batch size: 63, lr: 3.41e-03, grad_scale: 8.0 2023-10-02 22:42:47,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:42:47,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-02 22:42:50,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:42:52,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 22:42:53,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:42:53,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:53,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:42:55,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:42:56,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:42:57,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:42:57,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-02 22:42:58,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-02 22:42:58,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:42:58,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1045013.3333333334, ans=0.125 2023-10-02 22:43:01,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:43:01,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:43:02,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:43:05,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:43:06,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-02 22:43:06,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:43:11,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:43:11,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:43:16,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1045146.6666666666, ans=0.125 2023-10-02 22:43:18,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:43:18,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:43:18,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:43:18,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:43:21,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:43:26,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:43:26,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:43:26,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:43:29,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:43:29,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:43:30,379 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.82 vs. limit=6.0 2023-10-02 22:43:33,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1045213.3333333334, ans=0.1 2023-10-02 22:43:35,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1045213.3333333334, ans=0.0 2023-10-02 22:43:36,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:43:36,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:43:41,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:43:41,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:43:45,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:43:45,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:43:46,892 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.911e+02 2.096e+02 2.404e+02 3.352e+02, threshold=4.192e+02, percent-clipped=0.0 2023-10-02 22:43:46,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:43:48,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:43:49,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:43:49,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:43:52,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:43:54,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:43:54,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:43:57,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-02 22:43:59,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:44:00,639 INFO [train.py:1046] (3/4) Epoch 30, batch 2750, loss[loss=0.1673, simple_loss=0.2521, pruned_loss=0.0412, over 23595.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2435, pruned_loss=0.04353, over 4692292.35 frames. ], batch size: 85, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:44:00,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:44:00,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-02 22:44:00,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1045346.6666666666, ans=0.1 2023-10-02 22:44:03,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-02 22:44:03,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:44:05,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:05,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:44:08,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:09,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-02 22:44:09,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:11,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:44:12,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:44:12,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:44:12,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:12,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-02 22:44:12,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:44:12,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:44:15,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1045413.3333333334, ans=0.125 2023-10-02 22:44:18,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-02 22:44:21,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:44:21,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:21,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:44:22,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-02 22:44:22,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:44:25,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:44:26,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:26,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:30,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 22:44:30,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 22:44:30,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:44:32,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:32,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:44:34,645 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.60 vs. limit=15.0 2023-10-02 22:44:39,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:44:41,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:44:42,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:44:46,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:44:46,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:44:47,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:44:52,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:44:52,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:44:52,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-02 22:44:56,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:44:58,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-02 22:45:01,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-02 22:45:05,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:45:05,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-02 22:45:07,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:45:08,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:45:09,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-02 22:45:09,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:45:12,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-02 22:45:12,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:45:12,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:45:14,073 INFO [train.py:1046] (3/4) Epoch 30, batch 2800, loss[loss=0.1546, simple_loss=0.222, pruned_loss=0.04358, over 23826.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2417, pruned_loss=0.04344, over 4681101.28 frames. ], batch size: 212, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:45:14,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-02 22:45:14,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:45:14,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:45:15,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:45:15,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1045680.0, ans=0.0 2023-10-02 22:45:17,079 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-02 22:45:17,080 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-02 22:45:21,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:45:23,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:45:23,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:45:28,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:45:29,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-02 22:45:31,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-02 22:45:33,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-02 22:45:34,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:45:34,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:45:34,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:45:38,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:45:40,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:45:40,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:45:41,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:45:48,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:45:51,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:45:54,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:45:54,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:45:55,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:45:59,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:45:59,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-02 22:46:01,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:46:02,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:46:02,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:46:06,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:46:06,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:46:09,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:46:11,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1045880.0, ans=0.2 2023-10-02 22:46:12,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:46:13,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:46:13,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:46:13,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 22:46:13,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:46:14,908 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.953e+02 2.195e+02 2.519e+02 3.830e+02, threshold=4.390e+02, percent-clipped=0.0 2023-10-02 22:46:14,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:46:15,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-02 22:46:15,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:46:17,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:46:17,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:46:19,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-02 22:46:19,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1045946.6666666666, ans=0.2 2023-10-02 22:46:20,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:46:20,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:46:20,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:46:22,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-02 22:46:28,635 INFO [train.py:1046] (3/4) Epoch 30, batch 2850, loss[loss=0.1545, simple_loss=0.2367, pruned_loss=0.03616, over 24657.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2409, pruned_loss=0.04307, over 4682593.23 frames. ], batch size: 65, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:46:28,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:46:30,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:46:30,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:46:32,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:46:35,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:46:35,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:46:35,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:46:38,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:46:39,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:46:42,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:46:42,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-02 22:46:46,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1046080.0, ans=0.0 2023-10-02 22:46:49,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-02 22:46:49,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:46:50,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-02 22:46:50,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:46:53,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-02 22:46:55,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-02 22:46:56,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:07,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:47:09,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:47:09,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:47:09,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 22:47:10,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 22:47:10,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-02 22:47:11,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:47:12,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-02 22:47:14,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:47:14,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:47:16,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:47:17,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:18,331 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.50 vs. limit=22.5 2023-10-02 22:47:18,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:47:18,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:47:20,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:47:24,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:47:24,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:47:25,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:27,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:47:29,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:47:34,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:47:36,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-02 22:47:36,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-02 22:47:36,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1046280.0, ans=0.1 2023-10-02 22:47:39,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 22:47:40,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:47:40,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-02 22:47:40,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:47:41,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:47:41,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:47:41,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:47:41,960 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-02 22:47:41,992 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-02 22:47:41,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:47:42,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1046346.6666666666, ans=0.0 2023-10-02 22:47:43,298 INFO [train.py:1046] (3/4) Epoch 30, batch 2900, loss[loss=0.1704, simple_loss=0.2453, pruned_loss=0.04772, over 23776.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2409, pruned_loss=0.04275, over 4686121.54 frames. ], batch size: 179, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:47:43,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:47:47,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:47:47,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:47:47,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:47:47,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1046346.6666666666, ans=0.2 2023-10-02 22:47:48,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-02 22:47:54,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:47:54,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-02 22:47:55,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-02 22:47:57,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-02 22:47:57,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:47:58,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:47:59,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:48:03,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 22:48:05,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:48:06,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:48:07,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-02 22:48:07,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:48:12,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:48:13,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-02 22:48:13,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-02 22:48:16,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:48:16,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-02 22:48:16,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:48:18,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:48:18,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-02 22:48:20,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:48:22,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:48:24,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:48:26,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:48:27,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-02 22:48:27,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1046546.6666666666, ans=0.2 2023-10-02 22:48:29,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-02 22:48:29,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:48:33,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:48:36,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-02 22:48:37,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 22:48:43,297 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.80 vs. limit=15.0 2023-10-02 22:48:43,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:48:45,104 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.984e+02 2.334e+02 2.809e+02 4.390e+02, threshold=4.669e+02, percent-clipped=1.0 2023-10-02 22:48:50,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:48:50,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-02 22:48:52,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-02 22:48:54,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:48:54,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-02 22:48:55,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:48:55,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:48:56,368 INFO [train.py:1046] (3/4) Epoch 30, batch 2950, loss[loss=0.1667, simple_loss=0.2565, pruned_loss=0.03844, over 24384.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2415, pruned_loss=0.04242, over 4695133.11 frames. ], batch size: 77, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:48:56,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1046680.0, ans=0.0 2023-10-02 22:49:02,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:49:03,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-02 22:49:05,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:49:05,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:49:07,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:49:07,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:49:10,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-02 22:49:11,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-02 22:49:11,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:49:11,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:49:17,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.18 vs. limit=22.5 2023-10-02 22:49:17,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:49:19,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:49:20,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:49:21,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:49:24,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:49:24,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:49:24,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:49:26,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:49:26,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:49:28,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-02 22:49:34,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1046813.3333333334, ans=0.125 2023-10-02 22:49:35,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-02 22:49:35,194 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-02 22:49:35,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:49:37,965 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-02 22:49:38,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1046813.3333333334, ans=0.05 2023-10-02 22:49:39,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-02 22:49:39,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:49:39,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1046880.0, ans=0.125 2023-10-02 22:49:41,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:49:41,266 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-02 22:49:41,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-02 22:49:43,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-02 22:49:45,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:49:45,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-02 22:49:47,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:49:48,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:49:48,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:49:50,024 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-02 22:49:50,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:49:51,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-02 22:49:55,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:49:57,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:49:57,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-02 22:49:57,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:49:58,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-02 22:50:01,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:50:02,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:50:02,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:50:06,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:50:06,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 22:50:06,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:50:07,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:50:07,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:50:07,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:50:08,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:50:09,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:50:10,835 INFO [train.py:1046] (3/4) Epoch 30, batch 3000, loss[loss=0.1684, simple_loss=0.2549, pruned_loss=0.04095, over 23945.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2423, pruned_loss=0.04275, over 4703810.02 frames. ], batch size: 80, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:50:10,835 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 22:50:18,242 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([5.3513, 4.7906, 4.2844, 4.8640], device='cuda:3') 2023-10-02 22:50:22,556 INFO [train.py:1078] (3/4) Epoch 30, validation: loss=0.3782, simple_loss=0.2831, pruned_loss=0.2366, over 1125622.00 frames. 2023-10-02 22:50:22,557 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 22:50:22,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:50:22,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-02 22:50:24,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:50:25,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:50:25,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:50:30,158 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-02 22:50:31,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-02 22:50:33,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:50:33,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:50:33,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-02 22:50:34,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:50:41,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:50:46,375 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.59 vs. limit=6.0 2023-10-02 22:50:48,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:50:53,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1047146.6666666666, ans=0.125 2023-10-02 22:50:55,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-02 22:50:56,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:50:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:50:59,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:51:01,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:51:02,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:51:02,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-02 22:51:06,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-02 22:51:07,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:51:07,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:51:10,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:51:10,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:51:12,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:12,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:51:14,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1047213.3333333334, ans=0.0 2023-10-02 22:51:16,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 22:51:17,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:51:17,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:51:19,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:51:20,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-02 22:51:20,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-02 22:51:22,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:51:22,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:51:24,944 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.821e+02 2.047e+02 2.427e+02 4.890e+02, threshold=4.095e+02, percent-clipped=1.0 2023-10-02 22:51:26,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:26,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:27,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-02 22:51:29,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-02 22:51:29,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:51:30,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-02 22:51:30,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:51:31,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-02 22:51:33,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:51:35,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 22:51:35,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-02 22:51:35,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-02 22:51:35,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 22:51:35,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:51:37,318 INFO [train.py:1046] (3/4) Epoch 30, batch 3050, loss[loss=0.1393, simple_loss=0.2213, pruned_loss=0.02862, over 24460.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2434, pruned_loss=0.04303, over 4695565.34 frames. ], batch size: 58, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:51:38,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:51:38,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-02 22:51:38,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:51:38,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:51:41,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-02 22:51:42,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1047346.6666666666, ans=0.04949747468305833 2023-10-02 22:51:44,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:51:47,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:51:47,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:51:50,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:51:52,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-02 22:51:58,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-02 22:51:58,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-02 22:51:58,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:51:58,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1047413.3333333334, ans=0.0 2023-10-02 22:52:02,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-02 22:52:06,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:52:06,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:52:06,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:52:11,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:52:11,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-02 22:52:12,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:52:12,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:52:12,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:52:13,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:52:15,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:18,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:52:18,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-02 22:52:19,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:52:19,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 22:52:20,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:52:21,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 22:52:22,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:52:23,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:29,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:52:29,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:33,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1047613.3333333334, ans=0.125 2023-10-02 22:52:34,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:34,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:52:34,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:52:38,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:52:38,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 22:52:38,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-02 22:52:39,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-02 22:52:41,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:52:42,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:52:43,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-02 22:52:44,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:49,794 INFO [train.py:1046] (3/4) Epoch 30, batch 3100, loss[loss=0.1522, simple_loss=0.2299, pruned_loss=0.0373, over 24595.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2428, pruned_loss=0.0433, over 4693028.90 frames. ], batch size: 60, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:52:49,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:52:51,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 22:52:52,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 22:52:52,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1047680.0, ans=0.025 2023-10-02 22:52:54,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-02 22:52:56,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-02 22:52:58,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-02 22:53:00,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 22:53:00,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1047680.0, ans=0.125 2023-10-02 22:53:02,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1047746.6666666666, ans=0.05 2023-10-02 22:53:04,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:53:04,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:05,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:53:10,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:15,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-02 22:53:19,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 22:53:19,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:20,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:53:20,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:53:22,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-02 22:53:22,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:53:22,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-02 22:53:22,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:53:23,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:25,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-02 22:53:26,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:53:30,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:53:32,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-02 22:53:32,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-02 22:53:34,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:34,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:53:36,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:53:36,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:36,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:53:37,045 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 22:53:38,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1047880.0, ans=0.125 2023-10-02 22:53:38,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1047880.0, ans=0.125 2023-10-02 22:53:39,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:53:39,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:53:41,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:53:41,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:53:41,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:41,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 22:53:45,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:53:46,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-02 22:53:48,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:53:48,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-02 22:53:49,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:53:49,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:53:49,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-02 22:53:50,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1047946.6666666666, ans=0.1 2023-10-02 22:53:51,051 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.847e+02 2.082e+02 2.389e+02 3.344e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-02 22:53:52,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1047946.6666666666, ans=0.125 2023-10-02 22:53:56,022 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.02 vs. limit=12.0 2023-10-02 22:54:01,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1048013.3333333334, ans=0.1 2023-10-02 22:54:02,411 INFO [train.py:1046] (3/4) Epoch 30, batch 3150, loss[loss=0.1684, simple_loss=0.2363, pruned_loss=0.05022, over 23792.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.242, pruned_loss=0.04267, over 4689585.37 frames. ], batch size: 164, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:54:02,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-02 22:54:04,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:04,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:54:07,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:54:07,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:54:07,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-02 22:54:09,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:09,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-02 22:54:10,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-02 22:54:12,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:54:14,211 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-02 22:54:18,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-02 22:54:18,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:54:19,628 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-02 22:54:19,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-02 22:54:22,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-02 22:54:22,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-02 22:54:22,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-02 22:54:22,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:54:22,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:54:23,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:54:25,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-02 22:54:26,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:27,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:54:29,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:54:32,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-02 22:54:35,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-02 22:54:36,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:54:39,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-02 22:54:39,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:54:39,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-02 22:54:43,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-02 22:54:43,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:54:44,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 22:54:44,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 22:54:44,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:54:44,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 22:54:46,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-02 22:54:46,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-02 22:54:48,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-02 22:54:48,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 22:54:49,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:54:50,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:54:50,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:54:51,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-02 22:54:53,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:54:54,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-02 22:54:54,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:54:56,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-02 22:54:57,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-02 22:54:58,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:54:58,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:55:01,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-02 22:55:01,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 22:55:02,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:55:04,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:55:06,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:08,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:55:09,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1048280.0, ans=0.0 2023-10-02 22:55:10,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1048280.0, ans=0.1 2023-10-02 22:55:11,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:55:11,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:14,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-02 22:55:17,443 INFO [train.py:1046] (3/4) Epoch 30, batch 3200, loss[loss=0.1546, simple_loss=0.2232, pruned_loss=0.04295, over 22891.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2408, pruned_loss=0.04252, over 4680685.55 frames. ], batch size: 322, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:55:18,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:55:18,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-02 22:55:22,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:24,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:55:24,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-02 22:55:25,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:55:29,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:55:34,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:55:38,852 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.68 vs. limit=6.0 2023-10-02 22:55:42,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-02 22:55:47,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1048480.0, ans=0.125 2023-10-02 22:55:52,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-02 22:55:52,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:55:54,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1048480.0, ans=0.125 2023-10-02 22:55:55,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-02 22:55:57,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 22:55:59,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:56:01,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 22:56:01,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:56:06,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-02 22:56:07,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-02 22:56:10,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-02 22:56:14,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-02 22:56:14,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-02 22:56:18,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1048613.3333333333, ans=0.125 2023-10-02 22:56:19,391 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.819e+02 2.030e+02 2.429e+02 3.151e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-02 22:56:19,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:56:21,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 22:56:21,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:56:21,415 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-02 22:56:21,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 22:56:24,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:56:26,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-02 22:56:26,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-02 22:56:28,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-02 22:56:30,793 INFO [train.py:1046] (3/4) Epoch 30, batch 3250, loss[loss=0.1784, simple_loss=0.2616, pruned_loss=0.04759, over 24285.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2408, pruned_loss=0.0423, over 4691081.73 frames. ], batch size: 77, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 22:56:30,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-02 22:56:32,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 22:56:34,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:56:34,287 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-02 22:56:35,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:56:35,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:56:37,119 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-02 22:56:40,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1048680.0, ans=0.0 2023-10-02 22:56:41,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 22:56:43,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:56:50,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:56:50,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-02 22:56:52,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:56:53,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:56:53,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:56:55,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:56:55,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 22:56:58,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:56:58,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-02 22:56:59,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:56:59,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:56:59,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:57:00,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:57:02,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:02,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 22:57:05,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:57:05,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:57:06,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:57:06,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:57:06,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:57:11,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-02 22:57:12,506 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=12.0 2023-10-02 22:57:13,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:57:13,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 22:57:14,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:57:14,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-02 22:57:21,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:57:23,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1048880.0, ans=0.125 2023-10-02 22:57:26,104 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.03 vs. limit=15.0 2023-10-02 22:57:26,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:57:26,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:26,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-02 22:57:26,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-02 22:57:26,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 22:57:27,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:30,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-02 22:57:30,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-02 22:57:30,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:57:32,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:57:34,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:57:34,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-02 22:57:35,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:57:38,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:57:38,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:57:40,687 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.62 vs. limit=10.0 2023-10-02 22:57:41,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-02 22:57:41,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:57:44,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-02 22:57:44,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-02 22:57:45,605 INFO [train.py:1046] (3/4) Epoch 30, batch 3300, loss[loss=0.1732, simple_loss=0.2624, pruned_loss=0.042, over 24457.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2419, pruned_loss=0.04251, over 4700274.18 frames. ], batch size: 69, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:57:47,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 22:57:47,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-02 22:57:48,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-02 22:57:49,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-02 22:57:49,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:57:55,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-02 22:57:56,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:57:56,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:57:59,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 22:57:59,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 22:58:01,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:03,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:58:06,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-02 22:58:08,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:58:09,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:10,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:10,652 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-02 22:58:11,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:58:13,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 22:58:15,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 22:58:15,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:58:15,213 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-02 22:58:19,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:58:19,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-02 22:58:20,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:20,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-02 22:58:22,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-02 22:58:22,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:24,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-02 22:58:26,889 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-02 22:58:27,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-02 22:58:27,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:58:30,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-02 22:58:32,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:58:35,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-02 22:58:36,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:58:38,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:58:40,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:40,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:58:40,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-02 22:58:41,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 22:58:41,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:43,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-02 22:58:45,105 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-02 22:58:46,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-02 22:58:46,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1049280.0, ans=0.125 2023-10-02 22:58:46,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1049280.0, ans=0.125 2023-10-02 22:58:47,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-02 22:58:49,140 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.469e+02 1.888e+02 2.006e+02 2.224e+02 2.991e+02, threshold=4.012e+02, percent-clipped=0.0 2023-10-02 22:58:49,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:58:49,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:58:49,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 22:58:49,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:58:50,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 22:58:50,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:58:50,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-02 22:58:52,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:58:54,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 22:58:56,392 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.39 vs. limit=22.5 2023-10-02 22:58:58,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-02 22:58:58,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:00,262 INFO [train.py:1046] (3/4) Epoch 30, batch 3350, loss[loss=0.1587, simple_loss=0.2539, pruned_loss=0.03177, over 24329.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2429, pruned_loss=0.04271, over 4700539.07 frames. ], batch size: 74, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 22:59:00,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:00,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1049346.6666666667, ans=0.125 2023-10-02 22:59:01,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 22:59:01,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-02 22:59:03,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:59:03,735 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.34 vs. limit=15.0 2023-10-02 22:59:04,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 22:59:04,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:08,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-02 22:59:10,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:12,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-02 22:59:14,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:14,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-02 22:59:16,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:59:17,120 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.82 vs. limit=15.0 2023-10-02 22:59:18,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 22:59:19,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-02 22:59:19,762 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-02 22:59:21,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 22:59:21,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1049413.3333333333, ans=0.125 2023-10-02 22:59:24,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-02 22:59:24,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-02 22:59:24,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 22:59:24,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 22:59:25,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:59:25,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-02 22:59:25,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:25,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 22:59:29,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:31,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:31,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:31,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 22:59:34,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:59:36,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:36,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:59:37,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1049480.0, ans=0.1 2023-10-02 22:59:40,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 22:59:41,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1049480.0, ans=0.125 2023-10-02 22:59:42,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 22:59:42,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1049480.0, ans=0.125 2023-10-02 22:59:43,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-02 22:59:43,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:59:44,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-02 22:59:46,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-02 22:59:48,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 22:59:48,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-02 22:59:48,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-02 22:59:49,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-02 22:59:51,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 22:59:52,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:00:00,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:00:00,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-02 23:00:00,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1049613.3333333333, ans=0.125 2023-10-02 23:00:01,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:00:01,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1049613.3333333333, ans=0.1 2023-10-02 23:00:03,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:00:03,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1049613.3333333333, ans=0.125 2023-10-02 23:00:04,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:00:10,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:00:10,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1049613.3333333333, ans=0.0 2023-10-02 23:00:11,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-02 23:00:12,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:00:12,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1049613.3333333333, ans=0.125 2023-10-02 23:00:13,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:00:14,781 INFO [train.py:1046] (3/4) Epoch 30, batch 3400, loss[loss=0.1635, simple_loss=0.2419, pruned_loss=0.0425, over 23464.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2431, pruned_loss=0.04294, over 4708482.65 frames. ], batch size: 119, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:00:14,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:00:14,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-02 23:00:16,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:00:16,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-02 23:00:17,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:00:17,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:00:18,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:00:20,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:00:20,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-02 23:00:20,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1049680.0, ans=0.125 2023-10-02 23:00:26,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-02 23:00:26,842 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-02 23:00:26,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:00:31,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:00:31,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:00:32,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:00:34,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:00:37,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1049746.6666666667, ans=0.125 2023-10-02 23:00:38,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:00:39,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-02 23:00:40,315 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.22 vs. limit=6.0 2023-10-02 23:00:44,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:00:46,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:00:47,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:00:48,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:00:53,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:00:54,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1049813.3333333333, ans=0.0 2023-10-02 23:00:55,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-02 23:01:01,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:01:01,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:01:03,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-02 23:01:03,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:01:04,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:01:04,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:01:05,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:01:07,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:01:11,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:01:11,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:01:13,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1049946.6666666667, ans=0.05 2023-10-02 23:01:17,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:01:19,052 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 1.969e+02 2.278e+02 2.618e+02 4.115e+02, threshold=4.556e+02, percent-clipped=1.0 2023-10-02 23:01:19,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-02 23:01:22,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1049946.6666666667, ans=0.1 2023-10-02 23:01:25,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:01:29,868 INFO [train.py:1046] (3/4) Epoch 30, batch 3450, loss[loss=0.1574, simple_loss=0.2372, pruned_loss=0.03878, over 23325.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2426, pruned_loss=0.04259, over 4714974.92 frames. ], batch size: 119, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:01:29,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-02 23:01:33,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-02 23:01:33,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:01:34,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:01:34,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-02 23:01:35,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:01:38,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:01:44,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:01:46,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:01:47,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:01:47,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:01:49,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:01:53,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-02 23:01:59,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-02 23:01:59,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:01:59,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:02:02,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:02:07,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-02 23:02:07,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:02:10,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1050146.6666666667, ans=0.125 2023-10-02 23:02:11,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:02:11,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:02:11,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1050146.6666666667, ans=0.125 2023-10-02 23:02:11,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1050146.6666666667, ans=0.0 2023-10-02 23:02:12,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:02:15,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:02:15,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1050213.3333333333, ans=0.2 2023-10-02 23:02:17,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-02 23:02:17,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:02:17,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1050213.3333333333, ans=0.2 2023-10-02 23:02:17,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1050213.3333333333, ans=0.125 2023-10-02 23:02:19,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:02:21,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:02:24,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-02 23:02:27,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:02:28,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1050280.0, ans=0.125 2023-10-02 23:02:28,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1050280.0, ans=0.2 2023-10-02 23:02:32,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:02:32,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:02:33,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1050280.0, ans=0.0 2023-10-02 23:02:35,889 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.74 vs. limit=10.0 2023-10-02 23:02:36,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:02:41,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:02:41,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:02:41,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:02:43,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:02:44,604 INFO [train.py:1046] (3/4) Epoch 30, batch 3500, loss[loss=0.1546, simple_loss=0.2288, pruned_loss=0.04015, over 23679.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2415, pruned_loss=0.0424, over 4710456.82 frames. ], batch size: 149, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:02:47,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:02:48,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:02:50,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-02 23:02:53,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:02:55,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1050346.6666666667, ans=0.1 2023-10-02 23:02:55,467 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.74 vs. limit=10.0 2023-10-02 23:02:57,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:02:59,034 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=15.0 2023-10-02 23:02:59,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:02:59,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-02 23:03:02,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:03:02,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:03:03,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:03:03,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:03:03,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:03:05,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:05,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:03:05,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-02 23:03:08,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:10,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:03:10,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:03:13,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:14,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-02 23:03:14,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:03:18,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:03:19,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:03:21,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:22,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:03:24,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:03:24,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1050480.0, ans=0.2 2023-10-02 23:03:26,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-02 23:03:27,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-02 23:03:27,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-02 23:03:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:03:30,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:30,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:03:30,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:03:33,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 23:03:33,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:03:37,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:03:37,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-02 23:03:37,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-02 23:03:37,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:03:39,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1050546.6666666667, ans=0.0 2023-10-02 23:03:39,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1050546.6666666667, ans=0.95 2023-10-02 23:03:41,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:03:42,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:03:43,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:46,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-02 23:03:46,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:03:47,844 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.822e+02 2.025e+02 2.259e+02 3.457e+02, threshold=4.050e+02, percent-clipped=0.0 2023-10-02 23:03:48,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:03:48,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-02 23:03:51,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-02 23:03:54,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:03:56,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:03:56,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:03:57,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:03:58,755 INFO [train.py:1046] (3/4) Epoch 30, batch 3550, loss[loss=0.1516, simple_loss=0.234, pruned_loss=0.03461, over 24487.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2407, pruned_loss=0.04255, over 4701698.83 frames. ], batch size: 66, lr: 3.40e-03, grad_scale: 8.0 2023-10-02 23:03:59,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1050680.0, ans=0.0 2023-10-02 23:04:00,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:04:08,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:04:10,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 23:04:13,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:04:13,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:04:15,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:04:16,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:04:16,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:04:19,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:04:19,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:04:20,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:04:20,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-02 23:04:21,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:04:24,964 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.09 vs. limit=15.0 2023-10-02 23:04:26,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:04:27,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:04:27,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1050813.3333333333, ans=0.0 2023-10-02 23:04:29,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:04:29,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:04:29,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:04:29,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-02 23:04:29,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:04:31,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:04:32,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-02 23:04:34,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1050813.3333333333, ans=0.0 2023-10-02 23:04:38,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:04:38,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:04:40,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:04:42,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-02 23:04:42,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:04:43,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-02 23:04:44,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:04:47,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:04:47,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:04:50,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-02 23:04:50,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:04:56,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:04:58,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-02 23:04:58,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:04:58,932 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.98 vs. limit=22.5 2023-10-02 23:05:01,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:05:02,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-02 23:05:02,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1050946.6666666667, ans=0.125 2023-10-02 23:05:05,051 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.27 vs. limit=15.0 2023-10-02 23:05:06,100 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.69 vs. limit=12.0 2023-10-02 23:05:08,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-02 23:05:08,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:05:09,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:05:09,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1050946.6666666667, ans=0.0 2023-10-02 23:05:09,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1050946.6666666667, ans=0.1 2023-10-02 23:05:11,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:05:12,511 INFO [train.py:1046] (3/4) Epoch 30, batch 3600, loss[loss=0.1604, simple_loss=0.2547, pruned_loss=0.03302, over 24415.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2407, pruned_loss=0.04249, over 4707395.33 frames. ], batch size: 69, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 23:05:12,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:05:12,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:05:16,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1051013.3333333333, ans=0.125 2023-10-02 23:05:17,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:05:17,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:19,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:05:19,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:05:20,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:20,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-02 23:05:22,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:05:23,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1051013.3333333333, ans=0.1 2023-10-02 23:05:24,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:26,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:05:29,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:05:30,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:05:30,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:05:30,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-02 23:05:32,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:05:33,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:05:35,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:05:35,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1051080.0, ans=0.1 2023-10-02 23:05:36,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:05:37,365 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.97 vs. limit=10.0 2023-10-02 23:05:39,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:05:41,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:05:42,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-02 23:05:49,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:05:49,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1051146.6666666667, ans=0.0 2023-10-02 23:05:51,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:05:53,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-02 23:05:56,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:06:00,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:06:03,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:06:06,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1051213.3333333333, ans=0.0 2023-10-02 23:06:07,596 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:06:08,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:06:08,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:06:08,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-02 23:06:10,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-02 23:06:12,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-02 23:06:12,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1051280.0, ans=0.125 2023-10-02 23:06:13,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:06:15,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:06:15,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1051280.0, ans=0.125 2023-10-02 23:06:16,666 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.870e+02 2.077e+02 2.507e+02 3.555e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-02 23:06:16,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-02 23:06:16,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:06:18,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:06:18,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:06:20,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-02 23:06:21,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-02 23:06:21,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1051280.0, ans=0.0 2023-10-02 23:06:25,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:06:25,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-02 23:06:25,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1051346.6666666667, ans=0.125 2023-10-02 23:06:27,011 INFO [train.py:1046] (3/4) Epoch 30, batch 3650, loss[loss=0.1463, simple_loss=0.2279, pruned_loss=0.03239, over 24419.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2411, pruned_loss=0.04229, over 4700523.72 frames. ], batch size: 58, lr: 3.40e-03, grad_scale: 16.0 2023-10-02 23:06:30,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-02 23:06:32,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:06:34,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-02 23:06:36,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-02 23:06:39,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:06:39,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:06:39,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:06:42,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-02 23:06:42,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:06:43,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-02 23:06:43,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:06:45,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:06:45,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-02 23:06:47,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 23:06:47,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:06:47,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:06:50,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:06:51,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-02 23:06:53,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-02 23:06:54,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:06:54,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1051413.3333333333, ans=0.0 2023-10-02 23:06:55,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-02 23:06:59,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:06:59,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:07:00,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1051480.0, ans=0.125 2023-10-02 23:07:00,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1051480.0, ans=0.05 2023-10-02 23:07:04,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:07:05,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1051480.0, ans=0.125 2023-10-02 23:07:07,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:07:07,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:07:08,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:07:09,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1051480.0, ans=0.5 2023-10-02 23:07:10,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:07:11,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:07:14,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:07:15,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:15,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:07:17,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:07:19,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:07:21,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:07:25,499 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-02 23:07:28,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:07:28,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:07:30,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:07:31,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:07:32,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:07:34,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:35,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-02 23:07:35,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:07:38,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:07:39,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:07:41,131 INFO [train.py:1046] (3/4) Epoch 30, batch 3700, loss[loss=0.1644, simple_loss=0.24, pruned_loss=0.0444, over 23812.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.242, pruned_loss=0.04248, over 4704467.77 frames. ], batch size: 179, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:07:41,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:07:42,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:42,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-02 23:07:42,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:07:42,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1051680.0, ans=0.0 2023-10-02 23:07:43,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:07:43,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:07:47,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:07:51,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:07:51,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:07:51,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:07:53,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:07:53,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 23:07:55,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:07:58,090 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-02 23:08:04,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:08:04,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 23:08:05,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:08:06,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-02 23:08:06,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:08:09,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:10,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-02 23:08:11,881 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.16 vs. limit=22.5 2023-10-02 23:08:12,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:13,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:08:16,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:16,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:08:17,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.18 vs. limit=15.0 2023-10-02 23:08:18,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:08:23,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:08:23,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-02 23:08:23,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:08:25,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-02 23:08:26,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1051880.0, ans=0.125 2023-10-02 23:08:29,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:08:30,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:08:32,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:08:33,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-02 23:08:34,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:08:35,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-02 23:08:35,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:08:35,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:08:38,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:08:39,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-02 23:08:39,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-02 23:08:40,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:08:40,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:08:42,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:08:42,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:08:44,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1051946.6666666667, ans=0.125 2023-10-02 23:08:44,988 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.850e+02 2.059e+02 2.330e+02 3.629e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-02 23:08:45,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:08:45,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1051946.6666666667, ans=0.1 2023-10-02 23:08:46,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:08:47,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:08:50,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-02 23:08:50,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 23:08:51,125 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.88 vs. limit=22.5 2023-10-02 23:08:53,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:08:53,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-02 23:08:55,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:08:56,511 INFO [train.py:1046] (3/4) Epoch 30, batch 3750, loss[loss=0.172, simple_loss=0.2415, pruned_loss=0.05124, over 23682.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2431, pruned_loss=0.04298, over 4708448.19 frames. ], batch size: 232, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:08:56,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:08:59,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:09:00,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:09:03,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:09:08,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:09:09,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:09:10,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:09:13,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1052080.0, ans=0.125 2023-10-02 23:09:15,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:09:16,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-02 23:09:17,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:09:19,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:09:19,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:09:21,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-02 23:09:25,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-02 23:09:28,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:09:29,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:09:31,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:09:37,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:09:39,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-02 23:09:40,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-02 23:09:42,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1052213.3333333333, ans=0.2 2023-10-02 23:09:44,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:09:46,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:09:47,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:09:49,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:09:52,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1052213.3333333333, ans=0.0 2023-10-02 23:09:55,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-02 23:09:56,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:09:58,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:09:59,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:10:01,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:10:09,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:10:09,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1052346.6666666667, ans=0.0 2023-10-02 23:10:10,386 INFO [train.py:1046] (3/4) Epoch 30, batch 3800, loss[loss=0.1693, simple_loss=0.2495, pruned_loss=0.04455, over 23364.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2428, pruned_loss=0.04251, over 4717176.57 frames. ], batch size: 105, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:10:13,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:10:13,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 23:10:14,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-02 23:10:14,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:10:17,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:10:20,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-02 23:10:22,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 23:10:22,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:10:22,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:10:25,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:10:25,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:10:25,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1052413.3333333333, ans=0.125 2023-10-02 23:10:26,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:10:28,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-02 23:10:29,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-02 23:10:31,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:10:34,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:10:36,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:10:37,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:10:38,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-02 23:10:38,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:10:39,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1052480.0, ans=0.1 2023-10-02 23:10:40,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:10:40,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1052480.0, ans=0.1 2023-10-02 23:10:41,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:10:47,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 23:10:47,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-02 23:10:48,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1052480.0, ans=0.015 2023-10-02 23:10:50,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:10:56,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:11:03,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:11:04,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-02 23:11:08,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-02 23:11:08,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:11:09,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:11:09,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:13,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-02 23:11:15,489 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.837e+02 2.037e+02 2.244e+02 3.093e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-02 23:11:16,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-02 23:11:16,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-02 23:11:16,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:16,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:11:21,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:11:23,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:11:23,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1052680.0, ans=0.0 2023-10-02 23:11:24,327 INFO [train.py:1046] (3/4) Epoch 30, batch 3850, loss[loss=0.1627, simple_loss=0.254, pruned_loss=0.03572, over 24650.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2417, pruned_loss=0.04238, over 4720801.34 frames. ], batch size: 68, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:11:30,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:11:30,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-02 23:11:31,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:11:32,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:32,989 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.14 vs. limit=15.0 2023-10-02 23:11:34,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:11:36,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:11:39,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:11:40,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-02 23:11:46,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1052746.6666666667, ans=0.0 2023-10-02 23:11:47,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:11:48,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:11:48,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1052746.6666666667, ans=0.1 2023-10-02 23:11:50,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1052746.6666666667, ans=0.0 2023-10-02 23:11:51,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:11:51,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:11:55,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:11:55,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:11:56,540 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.79 vs. limit=15.0 2023-10-02 23:11:57,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:11:57,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:11:58,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:11:59,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:01,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:01,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:12:01,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-02 23:12:01,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-02 23:12:03,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:12:03,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:05,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:05,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:07,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-02 23:12:07,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.18 vs. limit=6.0 2023-10-02 23:12:09,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-02 23:12:12,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:14,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-02 23:12:16,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-02 23:12:19,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:21,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:12:25,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:25,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-02 23:12:28,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-02 23:12:28,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1052946.6666666667, ans=0.125 2023-10-02 23:12:29,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:29,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:32,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1052946.6666666667, ans=0.125 2023-10-02 23:12:33,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:12:33,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:12:33,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1052946.6666666667, ans=0.0 2023-10-02 23:12:34,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:36,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:36,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:12:36,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-02 23:12:37,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:12:37,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1053013.3333333333, ans=0.1 2023-10-02 23:12:39,442 INFO [train.py:1046] (3/4) Epoch 30, batch 3900, loss[loss=0.1637, simple_loss=0.2449, pruned_loss=0.04122, over 24654.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2409, pruned_loss=0.04241, over 4706906.15 frames. ], batch size: 73, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:12:39,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-02 23:12:39,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:39,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:40,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:12:41,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:44,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:12:45,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:12:45,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:12:46,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:12:46,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-02 23:12:48,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:51,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:12:51,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:12:51,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:12:52,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:12:55,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:12:55,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:12:56,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:12:58,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-02 23:12:58,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:12:59,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-02 23:13:01,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:13:01,800 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.93 vs. limit=15.0 2023-10-02 23:13:03,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-02 23:13:04,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-02 23:13:05,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.00 vs. limit=22.5 2023-10-02 23:13:08,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:13:08,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:13:10,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:13:10,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:13:13,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:13:16,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:13:18,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:13:18,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:13:18,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:13:21,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1053146.6666666667, ans=0.125 2023-10-02 23:13:23,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:13:24,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:13:31,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:13:32,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.69 vs. limit=15.0 2023-10-02 23:13:33,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1053213.3333333333, ans=0.1 2023-10-02 23:13:34,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:13:36,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1053213.3333333333, ans=0.0 2023-10-02 23:13:44,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:13:45,997 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.451e+02 1.921e+02 2.166e+02 2.503e+02 3.662e+02, threshold=4.332e+02, percent-clipped=0.0 2023-10-02 23:13:48,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:13:48,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-02 23:13:48,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-02 23:13:48,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:13:50,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-02 23:13:52,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:13:53,567 INFO [train.py:1046] (3/4) Epoch 30, batch 3950, loss[loss=0.1792, simple_loss=0.2518, pruned_loss=0.0533, over 23798.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2408, pruned_loss=0.0424, over 4709661.49 frames. ], batch size: 179, lr: 3.39e-03, grad_scale: 4.0 2023-10-02 23:13:53,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-02 23:13:59,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:14:00,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-02 23:14:01,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:14:03,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:14:05,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:14:08,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1053413.3333333333, ans=0.07 2023-10-02 23:14:10,922 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-02 23:14:10,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:14:11,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-02 23:14:12,343 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-02 23:14:12,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:14:13,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:14:13,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:14:13,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:14:18,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-02 23:14:18,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1053413.3333333333, ans=0.2 2023-10-02 23:14:19,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:14:19,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:14:19,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:14:21,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:14:22,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:14:32,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:14:32,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:14:37,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-02 23:14:43,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-02 23:14:43,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-02 23:14:45,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:14:45,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:14:51,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:14:51,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:14:53,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:14:53,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:14:53,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-02 23:14:58,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:14:58,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:15:03,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-02 23:15:06,716 INFO [train.py:1046] (3/4) Epoch 30, batch 4000, loss[loss=0.1629, simple_loss=0.2546, pruned_loss=0.03556, over 24668.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2416, pruned_loss=0.04296, over 4695425.31 frames. ], batch size: 73, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:15:11,386 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.44 vs. limit=15.0 2023-10-02 23:15:12,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:15:18,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:15:24,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:15:25,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:15:25,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:15:25,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-02 23:15:26,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:15:27,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-02 23:15:27,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:15:28,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-02 23:15:29,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:15:32,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:15:32,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:15:32,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:15:32,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:15:32,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-02 23:15:33,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:15:35,251 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-02 23:15:36,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:15:38,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:15:41,700 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-02 23:15:41,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:15:41,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:15:47,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-02 23:15:49,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:15:51,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:15:52,015 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-02 23:15:54,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:15:54,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-02 23:15:54,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:15:57,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:15:58,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:16:00,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:16:00,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:16:00,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:16:01,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-02 23:16:01,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:16:03,335 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-02 23:16:07,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:16:12,363 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.926e+02 2.122e+02 2.388e+02 3.312e+02, threshold=4.243e+02, percent-clipped=0.0 2023-10-02 23:16:12,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-02 23:16:15,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:16:15,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:16:16,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:16:18,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:16:20,286 INFO [train.py:1046] (3/4) Epoch 30, batch 4050, loss[loss=0.1733, simple_loss=0.2638, pruned_loss=0.04146, over 24422.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.242, pruned_loss=0.04264, over 4689944.24 frames. ], batch size: 69, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:16:23,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:16:25,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:16:25,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-02 23:16:27,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1054013.3333333333, ans=0.1 2023-10-02 23:16:27,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.32 vs. limit=22.5 2023-10-02 23:16:28,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:16:28,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:16:30,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:16:31,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:16:31,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:16:34,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:16:37,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:16:37,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-02 23:16:39,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:16:39,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:16:41,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:16:43,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:16:45,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1054080.0, ans=0.125 2023-10-02 23:16:46,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-02 23:16:49,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-02 23:16:49,352 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-02 23:16:51,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:16:52,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1054146.6666666667, ans=0.0 2023-10-02 23:16:58,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-02 23:16:59,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:17:02,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:17:05,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:17:06,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:17:06,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:17:09,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:17:13,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-02 23:17:13,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 23:17:15,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:17:15,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1054213.3333333333, ans=0.035 2023-10-02 23:17:17,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-02 23:17:20,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:17:26,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1054280.0, ans=0.0 2023-10-02 23:17:28,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-02 23:17:28,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:17:28,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:17:29,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-02 23:17:30,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-02 23:17:30,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:17:33,675 INFO [train.py:1046] (3/4) Epoch 30, batch 4100, loss[loss=0.1672, simple_loss=0.256, pruned_loss=0.03917, over 24692.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2428, pruned_loss=0.04318, over 4696917.37 frames. ], batch size: 73, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:17:33,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:17:35,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:35,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:17:41,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-02 23:17:42,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-02 23:17:46,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-02 23:17:47,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-02 23:17:47,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:17:47,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:48,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:48,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:17:49,533 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-02 23:17:52,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:17:53,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:17:53,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:17:54,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:17:58,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:17:59,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:17:59,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:17:59,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-02 23:17:59,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:17:59,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:18:00,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:18:00,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:18:02,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-02 23:18:05,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:18:06,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-02 23:18:08,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:18:11,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:18:11,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-02 23:18:11,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:18:11,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:18:12,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:18:12,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1054480.0, ans=0.125 2023-10-02 23:18:13,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-02 23:18:14,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:18:16,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:18:18,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-02 23:18:20,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:18:20,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:18:23,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:18:26,976 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.36 vs. limit=6.0 2023-10-02 23:18:28,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:18:32,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:18:33,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:18:39,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:18:39,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:18:41,143 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.912e+02 2.226e+02 2.556e+02 3.636e+02, threshold=4.451e+02, percent-clipped=0.0 2023-10-02 23:18:43,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:18:46,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:18:48,258 INFO [train.py:1046] (3/4) Epoch 30, batch 4150, loss[loss=0.1489, simple_loss=0.2336, pruned_loss=0.03214, over 24333.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2428, pruned_loss=0.04334, over 4694456.85 frames. ], batch size: 61, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:18:51,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:18:51,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:18:53,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:18:53,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:18:54,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-02 23:18:54,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:18:54,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1054680.0, ans=0.125 2023-10-02 23:18:55,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-02 23:18:57,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-02 23:18:57,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-02 23:18:58,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:19:00,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1054680.0, ans=0.0 2023-10-02 23:19:03,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:19:03,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:19:06,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:19:06,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:19:08,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:19:10,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:19:10,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:19:11,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:19:17,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:19:19,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:19:21,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-02 23:19:24,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-02 23:19:24,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:19:25,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-02 23:19:25,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:19:25,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:19:28,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:29,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:19:31,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_na.min_abs, batch_count=1054880.0, ans=0.02 2023-10-02 23:19:34,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-02 23:19:37,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:19:39,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:19:39,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-02 23:19:39,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:19:42,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-02 23:19:43,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:19:43,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:19:43,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1054880.0, ans=0.125 2023-10-02 23:19:45,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:46,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-02 23:19:46,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:19:46,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-02 23:19:49,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:19:51,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-02 23:19:51,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:51,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:19:51,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:19:52,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-02 23:19:53,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:19:54,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-02 23:19:54,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:19:57,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:19:57,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-02 23:19:57,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:19:57,863 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.26 vs. limit=15.0 2023-10-02 23:19:58,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1054946.6666666667, ans=0.1 2023-10-02 23:20:02,505 INFO [train.py:1046] (3/4) Epoch 30, batch 4200, loss[loss=0.1465, simple_loss=0.2048, pruned_loss=0.04408, over 22575.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2416, pruned_loss=0.04316, over 4687797.90 frames. ], batch size: 322, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:20:02,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:20:04,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-02 23:20:05,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:20:07,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:20:10,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:20:10,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:20:10,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:20:12,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-02 23:20:15,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-02 23:20:17,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:18,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:20:21,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:20:21,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1055080.0, ans=0.125 2023-10-02 23:20:24,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:20:26,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:20:26,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:27,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-02 23:20:27,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:20:29,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:29,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:20:29,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:20:30,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:20:30,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1055146.6666666667, ans=0.125 2023-10-02 23:20:33,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-02 23:20:33,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:20:33,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1055146.6666666667, ans=0.125 2023-10-02 23:20:38,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-02 23:20:39,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:20:42,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:20:42,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:20:44,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:20:44,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-02 23:20:45,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:20:45,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:20:48,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1055213.3333333333, ans=0.0 2023-10-02 23:20:51,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:20:51,716 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:20:53,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:20:58,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:21:01,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-02 23:21:03,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:21:07,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 23:21:09,207 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.983e+02 2.287e+02 2.650e+02 3.812e+02, threshold=4.574e+02, percent-clipped=0.0 2023-10-02 23:21:09,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:11,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-02 23:21:16,273 INFO [train.py:1046] (3/4) Epoch 30, batch 4250, loss[loss=0.1486, simple_loss=0.2259, pruned_loss=0.03569, over 24469.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2397, pruned_loss=0.04276, over 4677240.56 frames. ], batch size: 58, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:21:16,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:21:19,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:21:19,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:21:22,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:26,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:21:28,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-02 23:21:28,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:21:29,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:31,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:21:37,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:37,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:39,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:21:39,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:21:41,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:42,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:42,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:42,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1055413.3333333333, ans=0.0 2023-10-02 23:21:45,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:21:47,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:21:48,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-02 23:21:52,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-02 23:21:52,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:52,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:21:52,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:21:55,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:21:55,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:21:55,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:21:56,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1055480.0, ans=0.1 2023-10-02 23:21:57,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:21:57,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:22:03,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:22:04,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:22:05,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-02 23:22:05,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:22:06,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1055546.6666666667, ans=0.0 2023-10-02 23:22:07,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-02 23:22:08,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1055546.6666666667, ans=0.125 2023-10-02 23:22:09,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:22:11,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:22:12,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:22:12,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:22:13,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1055546.6666666667, ans=0.125 2023-10-02 23:22:15,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-02 23:22:16,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:22:18,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:22:22,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:22:25,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:22:26,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:22:27,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:22:29,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:22:30,520 INFO [train.py:1046] (3/4) Epoch 30, batch 4300, loss[loss=0.1762, simple_loss=0.2476, pruned_loss=0.05243, over 23758.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2394, pruned_loss=0.04252, over 4666469.64 frames. ], batch size: 212, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:22:30,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:22:31,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:22:31,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-02 23:22:33,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:22:33,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1055680.0, ans=0.0 2023-10-02 23:22:38,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:22:38,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:22:42,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:22:49,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:22:49,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-02 23:22:51,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:22:54,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:22:54,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:22:54,641 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-02 23:22:58,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:23:00,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:23:02,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-02 23:23:02,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:23:02,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-02 23:23:05,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 23:23:06,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:23:09,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:23:09,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:23:09,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:23:12,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:23:12,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:23:12,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-02 23:23:14,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-02 23:23:16,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:23:20,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:20,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 23:23:20,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:20,938 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.96 vs. limit=15.0 2023-10-02 23:23:22,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:23:22,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-02 23:23:22,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-02 23:23:23,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-02 23:23:23,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:23:23,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-02 23:23:24,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-02 23:23:28,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:23:28,629 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-02 23:23:29,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:23:32,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:23:32,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:23:35,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-02 23:23:36,593 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.805e+02 2.042e+02 2.369e+02 3.407e+02, threshold=4.083e+02, percent-clipped=0.0 2023-10-02 23:23:36,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:23:36,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:36,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:23:38,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:23:38,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:23:39,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:23:39,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1055946.6666666667, ans=0.0 2023-10-02 23:23:42,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:23:42,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:23:42,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:23:44,285 INFO [train.py:1046] (3/4) Epoch 30, batch 4350, loss[loss=0.1719, simple_loss=0.2413, pruned_loss=0.0513, over 23743.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2402, pruned_loss=0.04216, over 4686889.87 frames. ], batch size: 179, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:23:50,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-02 23:23:50,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:23:56,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:23:57,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:23:59,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:23:59,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:24:05,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:24:08,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:24:09,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:24:09,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:24:12,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:24:13,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:24:15,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:24:20,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-02 23:24:21,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:24:23,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:26,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:26,402 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:24:26,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1056146.6666666667, ans=0.125 2023-10-02 23:24:27,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1056213.3333333333, ans=10.0 2023-10-02 23:24:28,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-02 23:24:32,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1056213.3333333333, ans=0.0 2023-10-02 23:24:33,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:24:34,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:24:38,988 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-02 23:24:40,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:24:41,205 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=25.34 vs. limit=22.5 2023-10-02 23:24:41,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:24:41,876 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-02 23:24:41,938 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-02 23:24:41,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:24:43,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:24:43,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:24:43,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1056280.0, ans=0.2 2023-10-02 23:24:44,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:24:46,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:24:46,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:24:50,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-02 23:24:51,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:51,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:24:51,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:24:51,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-02 23:24:53,113 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-02 23:24:53,126 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-02 23:24:53,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-02 23:24:55,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:24:55,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:24:55,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:24:57,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:24:57,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1056346.6666666667, ans=0.0 2023-10-02 23:24:58,422 INFO [train.py:1046] (3/4) Epoch 30, batch 4400, loss[loss=0.1745, simple_loss=0.2475, pruned_loss=0.05072, over 23771.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2411, pruned_loss=0.04251, over 4688328.22 frames. ], batch size: 212, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:24:59,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-02 23:25:01,286 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-02 23:25:01,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:04,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:25:04,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:06,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:25:07,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-02 23:25:07,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-02 23:25:07,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-02 23:25:07,571 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-02 23:25:07,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1056346.6666666667, ans=0.125 2023-10-02 23:25:08,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:25:08,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:25:11,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-02 23:25:13,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:14,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:14,765 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-02 23:25:17,084 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.51 vs. limit=15.0 2023-10-02 23:25:19,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:25:19,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-02 23:25:19,742 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-02 23:25:22,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-02 23:25:24,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-02 23:25:24,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-02 23:25:24,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:25,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:25:26,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:25:27,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:25:28,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-02 23:25:28,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-02 23:25:29,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:25:32,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:25:32,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:25:34,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:34,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:25:34,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-02 23:25:35,776 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-02 23:25:35,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1056480.0, ans=0.0 2023-10-02 23:25:40,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:25:44,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1056546.6666666667, ans=0.125 2023-10-02 23:25:45,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:25:48,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-02 23:25:50,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1056546.6666666667, ans=0.1 2023-10-02 23:25:53,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:25:56,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:25:58,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:25:58,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-02 23:25:58,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:25:58,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:25:58,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:25:59,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:26:02,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-02 23:26:05,411 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.891e+02 2.096e+02 2.482e+02 3.996e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-02 23:26:05,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-02 23:26:06,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-02 23:26:06,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:26:06,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-02 23:26:08,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:26:10,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:26:12,192 INFO [train.py:1046] (3/4) Epoch 30, batch 4450, loss[loss=0.1764, simple_loss=0.2584, pruned_loss=0.04717, over 23986.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2423, pruned_loss=0.04305, over 4689584.44 frames. ], batch size: 86, lr: 3.39e-03, grad_scale: 16.0 2023-10-02 23:26:13,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-02 23:26:16,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:26:18,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:26:18,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:26:27,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:26:27,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:26:31,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:26:33,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:26:35,733 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.33 vs. limit=15.0 2023-10-02 23:26:36,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:26:36,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:26:37,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-02 23:26:37,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:26:39,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:26:39,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:26:39,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:26:40,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:26:47,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:26:47,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:26:47,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:26:49,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:26:50,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:26:52,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1056813.3333333333, ans=0.035 2023-10-02 23:26:54,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 23:26:55,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-02 23:26:55,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-02 23:26:55,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:26:57,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:26:59,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-02 23:27:01,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:27:05,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:27:05,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-02 23:27:06,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:27:06,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:27:06,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:27:06,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:27:09,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:27:10,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:27:10,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-02 23:27:11,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1056946.6666666667, ans=0.125 2023-10-02 23:27:12,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:27:14,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:27:16,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:27:17,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:27:17,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 23:27:19,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:27:22,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-02 23:27:24,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:27:27,512 INFO [train.py:1046] (3/4) Epoch 30, batch 4500, loss[loss=0.1783, simple_loss=0.2645, pruned_loss=0.04605, over 24007.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2427, pruned_loss=0.04308, over 4691558.15 frames. ], batch size: 80, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:27:31,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:27:31,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1057013.3333333333, ans=0.0 2023-10-02 23:27:32,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-02 23:27:32,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-02 23:27:34,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:27:40,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:27:40,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:27:40,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=1057080.0, ans=0.2 2023-10-02 23:27:41,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:27:41,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:27:42,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:27:42,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:27:52,939 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.61 vs. limit=10.0 2023-10-02 23:27:53,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:27:53,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:27:55,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1057146.6666666667, ans=0.2 2023-10-02 23:27:56,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:27:56,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:27:58,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:28:04,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:28:07,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:28:10,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:28:13,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:28:14,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-02 23:28:14,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:28:16,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:28:17,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:28:17,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:28:20,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:28:21,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-02 23:28:21,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:28:21,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:28:26,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:28:26,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:28:28,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:28:31,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:28:31,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:28:32,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-02 23:28:35,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-02 23:28:35,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-02 23:28:37,058 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.840e+02 1.982e+02 2.300e+02 3.400e+02, threshold=3.964e+02, percent-clipped=0.0 2023-10-02 23:28:40,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-02 23:28:41,540 INFO [train.py:1046] (3/4) Epoch 30, batch 4550, loss[loss=0.1345, simple_loss=0.2074, pruned_loss=0.0308, over 24425.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.242, pruned_loss=0.04263, over 4696404.33 frames. ], batch size: 58, lr: 3.39e-03, grad_scale: 8.0 2023-10-02 23:28:43,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-02 23:28:44,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:28:47,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:28:48,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:28:51,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:28:51,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1057346.6666666667, ans=0.0 2023-10-02 23:28:51,919 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.57 vs. limit=15.0 2023-10-02 23:28:53,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:28:56,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:28:58,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:28:58,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:28:58,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:01,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:29:01,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:29:04,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:29:06,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-02 23:29:07,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-02 23:29:07,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:29:07,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1057413.3333333333, ans=0.125 2023-10-02 23:29:08,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-02 23:29:12,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-02 23:29:13,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1057480.0, ans=0.0 2023-10-02 23:29:14,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:29:16,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-02 23:29:17,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1057480.0, ans=0.125 2023-10-02 23:29:18,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:29:21,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:22,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:22,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:29:25,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-02 23:29:27,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:29:30,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:30,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:29:30,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:29:32,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-02 23:29:33,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-02 23:29:33,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:29:34,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-02 23:29:36,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-02 23:29:36,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:29:39,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:29:39,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:29:39,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:40,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:29:42,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:29:42,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-02 23:29:42,972 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.31 vs. limit=15.0 2023-10-02 23:29:45,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:29:45,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 23:29:46,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-02 23:29:46,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:29:46,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-02 23:29:49,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:29:49,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:29:52,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:29:52,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:29:52,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:29:53,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:29:54,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:29:55,746 INFO [train.py:1046] (3/4) Epoch 30, batch 4600, loss[loss=0.1595, simple_loss=0.2445, pruned_loss=0.03727, over 24039.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2411, pruned_loss=0.04242, over 4696452.41 frames. ], batch size: 80, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:29:59,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:29:59,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:30:01,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:30:01,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:30:01,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:03,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-02 23:30:06,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:30:08,860 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.01 vs. limit=15.0 2023-10-02 23:30:09,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:30:09,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:11,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:19,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-02 23:30:19,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:23,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:27,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1057813.3333333333, ans=0.1 2023-10-02 23:30:28,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:30:28,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:33,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-02 23:30:33,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:30:33,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:30:39,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:39,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:30:39,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1057880.0, ans=0.0 2023-10-02 23:30:42,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:30:43,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-02 23:30:45,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-02 23:30:49,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:50,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:30:53,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:53,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-02 23:30:53,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:30:54,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-02 23:30:55,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:55,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:30:57,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:30:58,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:30:58,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:30:58,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-02 23:31:00,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-02 23:31:00,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-02 23:31:00,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:31:01,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:31:01,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:31:03,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:31:05,783 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.399e+02 1.837e+02 2.032e+02 2.322e+02 3.938e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-02 23:31:07,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1057946.6666666667, ans=0.2 2023-10-02 23:31:09,553 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:31:10,364 INFO [train.py:1046] (3/4) Epoch 30, batch 4650, loss[loss=0.1687, simple_loss=0.2195, pruned_loss=0.0589, over 19308.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2405, pruned_loss=0.04222, over 4698979.72 frames. ], batch size: 388, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:31:13,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:31:14,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:31:16,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:31:16,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:31:16,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:31:16,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:31:17,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:31:20,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-02 23:31:22,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:31:26,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-02 23:31:26,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:31:26,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1058080.0, ans=0.1 2023-10-02 23:31:28,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-02 23:31:28,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:31:28,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-02 23:31:28,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-02 23:31:29,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:31:29,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:31:32,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:31:33,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:31:33,742 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-02 23:31:36,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:31:37,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-02 23:31:41,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:31:41,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:31:41,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-02 23:31:42,273 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.29 vs. limit=15.0 2023-10-02 23:31:44,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:31:46,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:31:49,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:31:56,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:31:59,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:31:59,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:32:00,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:32:01,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-02 23:32:01,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-02 23:32:03,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-02 23:32:03,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-02 23:32:03,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:32:10,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:32:10,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:32:10,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-02 23:32:11,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:12,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:32:12,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:32:15,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:32:15,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1058280.0, ans=0.125 2023-10-02 23:32:16,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:32:16,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:32:16,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1058280.0, ans=0.2 2023-10-02 23:32:17,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:32:20,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1058280.0, ans=0.125 2023-10-02 23:32:21,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:32:23,212 INFO [train.py:1046] (3/4) Epoch 30, batch 4700, loss[loss=0.1844, simple_loss=0.2653, pruned_loss=0.0517, over 23365.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2413, pruned_loss=0.04195, over 4714982.62 frames. ], batch size: 93, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:32:23,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:32:23,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:32:23,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-02 23:32:24,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:32:26,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-02 23:32:30,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1058346.6666666667, ans=0.125 2023-10-02 23:32:34,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:35,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:32:35,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:32:36,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:32:38,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-02 23:32:42,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-02 23:32:42,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1058413.3333333333, ans=0.0 2023-10-02 23:32:43,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-02 23:32:44,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1058413.3333333333, ans=0.0 2023-10-02 23:32:45,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:45,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:32:45,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:32:49,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:32:55,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:32:57,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-02 23:32:59,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:33:05,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-02 23:33:06,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:33:08,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:11,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-02 23:33:12,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:33:15,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1058546.6666666667, ans=0.125 2023-10-02 23:33:17,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:33:18,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-02 23:33:20,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:20,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:33:22,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:33:24,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:33:24,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-02 23:33:24,177 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-02 23:33:24,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1058613.3333333333, ans=0.125 2023-10-02 23:33:27,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:33:28,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:28,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:28,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-02 23:33:28,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:33:32,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-02 23:33:34,176 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.856e+02 2.084e+02 2.252e+02 3.247e+02, threshold=4.168e+02, percent-clipped=0.0 2023-10-02 23:33:35,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:33:37,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:33:38,567 INFO [train.py:1046] (3/4) Epoch 30, batch 4750, loss[loss=0.1525, simple_loss=0.24, pruned_loss=0.03252, over 24497.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2431, pruned_loss=0.04253, over 4718570.35 frames. ], batch size: 66, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:33:40,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:33:41,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:33:42,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-02 23:33:42,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:33:46,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-02 23:33:48,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:33:48,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:33:50,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:33:54,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-02 23:33:58,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:33:59,501 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.84 vs. limit=10.0 2023-10-02 23:34:02,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-02 23:34:02,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:34:05,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:34:05,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:34:06,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:34:06,545 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-02 23:34:06,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-02 23:34:11,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-02 23:34:13,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:34:16,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:34:19,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:34:19,428 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-02 23:34:19,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:34:23,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:34:26,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:34:27,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-02 23:34:27,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-02 23:34:27,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:34:29,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:34:29,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:34:31,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:34:31,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-02 23:34:35,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-02 23:34:36,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:34:39,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:34:39,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-02 23:34:39,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:34:40,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:34:42,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:34:42,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:34:43,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-02 23:34:45,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:34:45,254 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:34:46,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-02 23:34:46,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-02 23:34:46,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1058946.6666666667, ans=0.125 2023-10-02 23:34:47,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-02 23:34:47,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1058946.6666666667, ans=0.1 2023-10-02 23:34:51,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:34:51,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:34:52,582 INFO [train.py:1046] (3/4) Epoch 30, batch 4800, loss[loss=0.1635, simple_loss=0.2312, pruned_loss=0.04785, over 23767.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2434, pruned_loss=0.04248, over 4725685.80 frames. ], batch size: 179, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:34:52,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-02 23:35:00,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:00,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:03,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1059013.3333333333, ans=0.125 2023-10-02 23:35:05,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:35:06,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:35:08,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:08,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-02 23:35:09,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:35:09,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:35:10,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:35:15,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:16,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:35:16,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:35:19,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:35:19,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-02 23:35:19,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:20,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:35:23,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:35:26,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:26,678 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:35:27,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:35:27,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:35:29,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-02 23:35:29,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:30,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-02 23:35:30,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-02 23:35:32,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:32,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:35:34,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:35:34,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:35:34,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:35:36,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:35:36,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:35:40,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:35:43,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:44,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:35:46,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1059213.3333333333, ans=0.125 2023-10-02 23:35:47,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-02 23:35:48,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:35:48,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:48,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:35:50,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:35:50,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1059280.0, ans=0.125 2023-10-02 23:35:54,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:35:54,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:35:54,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:35:56,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:35:56,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:35:56,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:36:00,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:36:00,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:36:01,344 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.838e+02 2.029e+02 2.262e+02 3.175e+02, threshold=4.059e+02, percent-clipped=0.0 2023-10-02 23:36:01,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:36:03,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-02 23:36:04,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-02 23:36:04,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:36:04,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:36:06,110 INFO [train.py:1046] (3/4) Epoch 30, batch 4850, loss[loss=0.148, simple_loss=0.224, pruned_loss=0.03603, over 24472.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2435, pruned_loss=0.04255, over 4722213.20 frames. ], batch size: 58, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:36:06,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:36:06,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:36:09,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:36:16,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-02 23:36:17,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:36:21,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:36:23,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:36:23,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:36:27,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:36:28,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:36:29,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:36:29,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-02 23:36:33,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:36:36,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:36:36,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-02 23:36:37,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:36:37,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-02 23:36:40,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:36:40,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:36:46,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:36:46,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-02 23:36:46,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-02 23:36:48,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:36:49,944 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.93 vs. limit=15.0 2023-10-02 23:36:54,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:36:56,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-02 23:36:56,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:36:56,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:36:58,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:36:59,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-02 23:36:59,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:37:02,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-02 23:37:02,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:37:03,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:37:05,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-02 23:37:12,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:37:17,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:37:19,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:37:20,375 INFO [train.py:1046] (3/4) Epoch 30, batch 4900, loss[loss=0.1708, simple_loss=0.2528, pruned_loss=0.04439, over 23339.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2424, pruned_loss=0.04236, over 4729249.37 frames. ], batch size: 93, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:37:22,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1059680.0, ans=0.125 2023-10-02 23:37:23,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-02 23:37:23,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:37:28,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:37:28,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:37:28,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:37:32,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-02 23:37:32,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1059680.0, ans=0.2 2023-10-02 23:37:36,287 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.80 vs. limit=6.0 2023-10-02 23:37:38,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-02 23:37:42,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-02 23:37:42,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-02 23:37:42,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:37:42,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:37:44,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:37:44,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:37:44,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:37:44,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-02 23:37:47,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-02 23:37:48,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:37:49,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:37:50,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:37:51,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:37:53,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:37:53,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1059813.3333333333, ans=0.0 2023-10-02 23:37:55,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:37:55,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-02 23:37:57,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:37:58,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:37:58,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-02 23:37:58,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-02 23:38:00,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-02 23:38:00,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1059813.3333333333, ans=0.0 2023-10-02 23:38:03,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:38:04,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:38:04,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:38:04,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:38:06,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-02 23:38:06,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:38:08,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-02 23:38:11,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:38:12,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:38:13,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:38:18,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-02 23:38:19,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:38:20,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-02 23:38:20,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-02 23:38:27,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:38:28,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:38:30,096 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.877e+02 2.022e+02 2.304e+02 3.994e+02, threshold=4.045e+02, percent-clipped=0.0 2023-10-02 23:38:30,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-02 23:38:30,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:38:30,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:38:31,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:38:33,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1060013.3333333333, ans=0.125 2023-10-02 23:38:34,467 INFO [train.py:1046] (3/4) Epoch 30, batch 4950, loss[loss=0.1745, simple_loss=0.2636, pruned_loss=0.04269, over 24451.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2403, pruned_loss=0.04187, over 4713168.65 frames. ], batch size: 69, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:38:34,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:38:34,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:38:34,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:38:35,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-02 23:38:36,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1060013.3333333333, ans=0.125 2023-10-02 23:38:37,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:38:42,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:38:42,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-02 23:38:45,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-02 23:38:45,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-02 23:38:45,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:38:46,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-02 23:38:46,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:38:46,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:38:46,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:38:46,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1060013.3333333333, ans=0.125 2023-10-02 23:38:48,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:38:49,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:38:51,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:38:51,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:38:52,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:38:54,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:38:54,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:38:56,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-02 23:38:57,543 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.75 vs. limit=6.0 2023-10-02 23:39:02,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:39:04,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:39:05,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:39:07,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:07,419 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:39:08,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:39:08,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-02 23:39:10,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-02 23:39:13,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:13,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1060146.6666666667, ans=0.2 2023-10-02 23:39:15,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:39:15,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:39:16,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:39:16,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:39:17,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-02 23:39:21,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:39:22,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:39:23,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:39:26,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:39:26,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:27,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-02 23:39:28,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:39:29,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:39:32,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:39:34,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:39:34,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:39:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:35,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:39:35,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:39:38,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:39:40,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:39:40,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:39:41,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-02 23:39:44,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:39:47,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1060346.6666666667, ans=0.125 2023-10-02 23:39:48,820 INFO [train.py:1046] (3/4) Epoch 30, batch 5000, loss[loss=0.1726, simple_loss=0.2471, pruned_loss=0.04904, over 23755.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2396, pruned_loss=0.04171, over 4713512.22 frames. ], batch size: 179, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:39:50,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-02 23:39:50,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:39:57,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:39:57,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:39:59,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-02 23:40:01,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-02 23:40:04,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:40:04,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-02 23:40:05,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:40:05,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:40:06,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-02 23:40:06,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:40:06,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:40:08,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-02 23:40:08,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:40:09,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:40:09,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-02 23:40:10,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-02 23:40:12,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:40:14,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-02 23:40:14,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:40:14,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:14,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:40:14,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-02 23:40:14,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-02 23:40:18,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-02 23:40:18,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:40:18,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:19,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-02 23:40:19,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:40:21,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:21,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:40:22,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-02 23:40:25,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-02 23:40:25,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:40:28,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:40:31,153 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-02 23:40:34,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:40:34,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:40:34,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:40:37,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-02 23:40:37,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:40:38,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:40:38,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:40:39,015 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:40:40,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-02 23:40:41,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:40:45,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:40:46,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:40:50,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-02 23:40:52,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1060613.3333333333, ans=0.04949747468305833 2023-10-02 23:40:54,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1060613.3333333333, ans=0.1 2023-10-02 23:40:55,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:40:58,138 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.837e+02 2.079e+02 2.443e+02 4.073e+02, threshold=4.157e+02, percent-clipped=1.0 2023-10-02 23:41:02,689 INFO [train.py:1046] (3/4) Epoch 30, batch 5050, loss[loss=0.1925, simple_loss=0.2477, pruned_loss=0.06864, over 19234.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2404, pruned_loss=0.04211, over 4706664.82 frames. ], batch size: 388, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:41:04,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:41:04,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:05,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:41:05,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:41:05,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:41:06,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:41:06,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:11,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:11,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-02 23:41:12,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:41:15,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:41:17,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:41:17,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-02 23:41:18,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:41:18,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:41:20,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:41:21,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:41:22,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1060746.6666666667, ans=0.0 2023-10-02 23:41:23,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:41:32,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-02 23:41:32,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:41:34,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:41:34,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-02 23:41:34,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:41:34,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:41:35,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:41:37,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:41:37,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-02 23:41:37,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-02 23:41:38,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:41:39,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:41:44,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:41:44,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-02 23:41:47,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:41:48,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-02 23:41:48,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:41:48,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:41:49,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1060880.0, ans=0.0 2023-10-02 23:41:50,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:41:51,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:41:51,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:41:52,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1060880.0, ans=0.1 2023-10-02 23:41:55,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:41:55,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:41:56,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:41:56,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:41:56,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-02 23:41:57,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:41:59,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:42:05,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:42:05,269 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-02 23:42:05,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:42:06,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:42:06,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:08,009 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-02 23:42:09,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:42:09,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-02 23:42:09,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:12,767 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.01 vs. limit=12.0 2023-10-02 23:42:13,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:42:13,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:13,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-02 23:42:15,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-02 23:42:15,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1061013.3333333333, ans=0.125 2023-10-02 23:42:16,885 INFO [train.py:1046] (3/4) Epoch 30, batch 5100, loss[loss=0.1658, simple_loss=0.2433, pruned_loss=0.04408, over 23604.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2417, pruned_loss=0.04267, over 4705200.61 frames. ], batch size: 256, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:42:18,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:42:18,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:42:18,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:42:19,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.70 vs. limit=15.0 2023-10-02 23:42:19,891 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-02 23:42:21,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:42:21,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1061013.3333333333, ans=0.1 2023-10-02 23:42:25,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-02 23:42:25,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-02 23:42:27,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:42:27,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:42:30,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:42:31,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-02 23:42:31,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-02 23:42:36,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:42:37,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:42:40,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:42:43,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-02 23:42:44,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:42:45,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:42:47,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-02 23:42:50,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:42:52,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:42:52,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-02 23:42:53,646 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-02 23:42:53,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1061146.6666666667, ans=0.2 2023-10-02 23:42:54,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:42:55,797 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.79 vs. limit=15.0 2023-10-02 23:42:56,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-02 23:42:56,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-02 23:42:58,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:43:07,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:43:08,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-02 23:43:08,798 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-02 23:43:08,805 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-02 23:43:11,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-02 23:43:11,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:43:15,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-02 23:43:17,120 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:43:18,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-02 23:43:20,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-02 23:43:20,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-02 23:43:23,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-02 23:43:25,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:43:25,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-02 23:43:25,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1061280.0, ans=0.125 2023-10-02 23:43:28,612 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.446e+02 1.879e+02 2.162e+02 2.700e+02 3.768e+02, threshold=4.325e+02, percent-clipped=0.0 2023-10-02 23:43:31,375 INFO [train.py:1046] (3/4) Epoch 30, batch 5150, loss[loss=0.164, simple_loss=0.2495, pruned_loss=0.03924, over 24555.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2431, pruned_loss=0.04292, over 4708432.19 frames. ], batch size: 71, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:43:32,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:43:32,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:43:32,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:43:34,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:43:34,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:43:36,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:43:36,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-02 23:43:36,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-02 23:43:36,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-02 23:43:36,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:43:36,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-02 23:43:38,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:43:39,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 23:43:40,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:43:41,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:43:44,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:43:44,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-02 23:43:46,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:43:47,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:43:49,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-02 23:43:49,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:43:49,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:43:50,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:43:50,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:43:50,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-02 23:43:52,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:43:52,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:43:53,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-02 23:43:56,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-02 23:43:56,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:44:02,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:44:05,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-02 23:44:07,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:44:15,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:44:16,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:44:19,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:44:21,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:44:24,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-02 23:44:26,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:44:27,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:44:27,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-02 23:44:32,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:44:34,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:44:34,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-02 23:44:38,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:44:41,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:44:43,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:44:43,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:44:45,159 INFO [train.py:1046] (3/4) Epoch 30, batch 5200, loss[loss=0.1342, simple_loss=0.2078, pruned_loss=0.03031, over 24597.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2441, pruned_loss=0.04379, over 4693216.88 frames. ], batch size: 60, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:44:45,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:44:45,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:44:45,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:44:45,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:44:48,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1061680.0, ans=0.0 2023-10-02 23:44:49,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:44:50,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:44:52,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:44:53,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1061680.0, ans=0.125 2023-10-02 23:44:57,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-02 23:44:57,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1061680.0, ans=0.1 2023-10-02 23:44:58,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:45:00,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:45:01,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:45:03,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:45:03,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:45:03,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-02 23:45:06,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:45:06,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:45:09,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-02 23:45:12,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:45:13,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:45:13,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-02 23:45:13,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-02 23:45:16,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-02 23:45:16,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1061813.3333333333, ans=0.1 2023-10-02 23:45:17,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:45:17,487 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-02 23:45:17,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:45:18,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:45:18,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:45:20,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-02 23:45:20,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:45:23,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:45:26,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-02 23:45:26,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-02 23:45:26,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-02 23:45:32,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-02 23:45:33,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:45:34,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1061880.0, ans=0.125 2023-10-02 23:45:38,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:45:38,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:45:40,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-02 23:45:41,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:45:41,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-02 23:45:41,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:45:41,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:45:45,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:45:45,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:45:49,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:45:51,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:45:51,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:45:51,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1061946.6666666667, ans=0.0 2023-10-02 23:45:55,888 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.943e+02 2.112e+02 2.505e+02 3.885e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-02 23:45:56,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:45:57,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-02 23:45:57,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:45:57,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:45:59,156 INFO [train.py:1046] (3/4) Epoch 30, batch 5250, loss[loss=0.1431, simple_loss=0.2014, pruned_loss=0.04235, over 19429.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2427, pruned_loss=0.0433, over 4695967.33 frames. ], batch size: 388, lr: 3.38e-03, grad_scale: 16.0 2023-10-02 23:45:59,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:46:00,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-02 23:46:00,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:46:04,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:46:05,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:46:07,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:46:08,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:46:11,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:46:14,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:46:16,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:46:17,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1062080.0, ans=0.05 2023-10-02 23:46:19,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:46:19,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-02 23:46:19,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:46:21,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:46:22,593 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:46:35,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1062146.6666666667, ans=0.125 2023-10-02 23:46:44,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1062213.3333333333, ans=0.1 2023-10-02 23:46:44,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1062213.3333333333, ans=0.125 2023-10-02 23:47:08,002 INFO [train.py:1046] (3/4) Epoch 30, batch 5300, loss[loss=0.1585, simple_loss=0.2225, pruned_loss=0.04731, over 23665.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2411, pruned_loss=0.04317, over 4687454.92 frames. ], batch size: 232, lr: 3.38e-03, grad_scale: 8.0 2023-10-02 23:47:10,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1062346.6666666667, ans=0.1 2023-10-02 23:47:14,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1062346.6666666667, ans=0.0 2023-10-02 23:47:15,046 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.16 vs. limit=15.0 2023-10-02 23:47:17,994 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.40 vs. limit=12.0 2023-10-02 23:47:20,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1062413.3333333333, ans=0.0 2023-10-02 23:47:21,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:47:21,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-02 23:47:21,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-02 23:47:21,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:47:22,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:22,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:22,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:22,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:47:22,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:47:22,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:47:22,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:47:23,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:47:23,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-02 23:47:23,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-02 23:47:23,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-02 23:47:23,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-02 23:47:23,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-02 23:47:23,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-02 23:47:23,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:23,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:47:23,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:47:23,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:47:24,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:47:24,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:47:24,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:47:24,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:24,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:47:24,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:47:24,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:47:24,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:24,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:47:25,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-02 23:47:25,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:47:25,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:47:25,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-02 23:47:25,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-02 23:47:25,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:47:25,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:47:25,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-02 23:47:26,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-02 23:47:26,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:47:26,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:47:26,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:47:26,603 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-02 23:47:26,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-02 23:47:26,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:47:26,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:47:26,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-02 23:47:26,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-02 23:47:26,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-02 23:47:27,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:47:31,468 INFO [train.py:1046] (3/4) Epoch 31, batch 0, loss[loss=0.1634, simple_loss=0.25, pruned_loss=0.0384, over 24544.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.25, pruned_loss=0.0384, over 24544.00 frames. ], batch size: 71, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:47:31,469 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-02 23:47:41,557 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.6116, 3.1465, 2.5951, 2.7972, 2.6558, 3.0879, 2.3148, 2.9898], device='cuda:3') 2023-10-02 23:47:41,819 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([3.8502, 3.4766, 3.5535, 3.6923], device='cuda:3') 2023-10-02 23:47:43,364 INFO [train.py:1078] (3/4) Epoch 31, validation: loss=0.3244, simple_loss=0.2676, pruned_loss=0.1906, over 1125622.00 frames. 2023-10-02 23:47:43,364 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-02 23:47:43,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-02 23:47:44,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:47:47,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:47:48,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1062426.6666666667, ans=0.125 2023-10-02 23:47:52,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:47:52,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:47:53,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:47:53,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-02 23:47:55,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-02 23:47:58,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:47:58,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:48:01,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:48:01,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:48:02,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:48:02,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:48:05,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-02 23:48:06,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:48:15,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:48:15,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:48:18,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-02 23:48:22,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:48:22,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:48:24,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:48:27,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1062626.6666666667, ans=0.0 2023-10-02 23:48:29,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:48:30,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1062626.6666666667, ans=0.1 2023-10-02 23:48:33,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:48:37,458 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.906e+02 2.085e+02 2.438e+02 4.411e+02, threshold=4.170e+02, percent-clipped=1.0 2023-10-02 23:48:37,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-02 23:48:40,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-02 23:48:40,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:48:40,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:48:42,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:48:42,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:48:44,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-02 23:48:46,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:48:46,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1062693.3333333333, ans=0.0 2023-10-02 23:48:47,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:48:50,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:48:55,875 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-02 23:48:56,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1062760.0, ans=0.0 2023-10-02 23:48:57,123 INFO [train.py:1046] (3/4) Epoch 31, batch 50, loss[loss=0.1769, simple_loss=0.25, pruned_loss=0.05195, over 23765.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2439, pruned_loss=0.04286, over 1071755.90 frames. ], batch size: 179, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:48:57,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:48:57,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1062760.0, ans=0.0 2023-10-02 23:48:59,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1062760.0, ans=0.05 2023-10-02 23:49:00,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:49:03,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:49:03,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-02 23:49:04,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-02 23:49:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:49:06,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:07,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:10,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:49:12,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-02 23:49:12,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:49:14,741 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:49:19,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:49:20,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-02 23:49:21,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-02 23:49:23,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:49:25,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:49:25,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:49:27,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:49:27,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:49:28,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-02 23:49:28,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:49:34,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1062893.3333333333, ans=0.125 2023-10-02 23:49:36,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:49:37,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:49:37,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:49:37,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1062893.3333333333, ans=0.0 2023-10-02 23:49:38,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-02 23:49:40,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-02 23:49:41,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:49:41,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-02 23:49:41,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:49:44,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-02 23:49:50,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:49:50,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:49:51,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:54,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:49:54,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:49:56,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-02 23:49:56,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-02 23:49:58,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:49:58,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:49:59,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:49:59,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:49:59,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-02 23:49:59,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1063026.6666666667, ans=0.0 2023-10-02 23:50:00,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-02 23:50:02,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-02 23:50:03,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:03,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:50:04,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1063026.6666666667, ans=0.125 2023-10-02 23:50:05,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-02 23:50:05,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-02 23:50:05,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:06,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:50:08,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:50:08,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:50:10,743 INFO [train.py:1046] (3/4) Epoch 31, batch 100, loss[loss=0.2101, simple_loss=0.2712, pruned_loss=0.07452, over 19400.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2457, pruned_loss=0.04334, over 1875031.05 frames. ], batch size: 388, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:50:10,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:50:14,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:50:18,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:50:19,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-02 23:50:19,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:50:23,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-02 23:50:25,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:50:25,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:50:25,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:50:25,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:50:25,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-02 23:50:28,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-02 23:50:28,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:30,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:50:30,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:50:33,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-02 23:50:33,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:34,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:50:34,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1063160.0, ans=0.2 2023-10-02 23:50:36,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:50:39,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:50:42,220 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-02 23:50:42,233 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-02 23:50:42,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1063226.6666666667, ans=0.125 2023-10-02 23:50:43,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:50:43,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:50:48,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-02 23:50:51,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:50:53,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:50:55,655 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.54 vs. limit=12.0 2023-10-02 23:50:56,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:50:57,786 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-02 23:50:59,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-02 23:51:04,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:51:06,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:51:07,428 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.861e+02 2.149e+02 2.468e+02 3.325e+02, threshold=4.297e+02, percent-clipped=0.0 2023-10-02 23:51:07,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:09,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1063360.0, ans=0.07 2023-10-02 23:51:10,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:11,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:51:13,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:51:15,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:17,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:51:18,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:18,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:51:20,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:21,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-02 23:51:21,487 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-02 23:51:21,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:21,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:51:21,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1063360.0, ans=0.0 2023-10-02 23:51:22,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:22,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:51:23,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-02 23:51:23,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:51:23,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-02 23:51:24,293 INFO [train.py:1046] (3/4) Epoch 31, batch 150, loss[loss=0.1513, simple_loss=0.2378, pruned_loss=0.03241, over 24625.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2449, pruned_loss=0.04291, over 2508778.24 frames. ], batch size: 68, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:51:24,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:24,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:51:26,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:51:26,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:51:26,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:51:28,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:51:32,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:51:32,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:51:32,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:35,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:51:35,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:37,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1063426.6666666667, ans=0.1 2023-10-02 23:51:38,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:51:39,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:41,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1063493.3333333333, ans=0.125 2023-10-02 23:51:43,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-02 23:51:44,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-02 23:51:44,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-02 23:51:44,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1063493.3333333333, ans=0.125 2023-10-02 23:51:47,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:51:47,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:51:48,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:51:49,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:51:49,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:51:49,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:49,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:51:51,361 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-02 23:51:54,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:52:00,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:52:03,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-02 23:52:05,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-02 23:52:07,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:52:07,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:52:07,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:52:11,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:52:11,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:52:12,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:52:13,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:14,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-02 23:52:17,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1063626.6666666667, ans=0.125 2023-10-02 23:52:18,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:18,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:52:19,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:52:19,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:52:20,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:22,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-02 23:52:23,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-02 23:52:23,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:52:25,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:52:28,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:52:28,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-02 23:52:28,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:52:28,494 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-02 23:52:33,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:52:36,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1063693.3333333333, ans=0.0 2023-10-02 23:52:37,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:52:37,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:52:39,132 INFO [train.py:1046] (3/4) Epoch 31, batch 200, loss[loss=0.1433, simple_loss=0.223, pruned_loss=0.03179, over 24365.00 frames. ], tot_loss[loss=0.1668, simple_loss=0.2459, pruned_loss=0.04385, over 2996117.42 frames. ], batch size: 56, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:52:40,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-02 23:52:41,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:52:41,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:52:44,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-02 23:52:46,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-02 23:52:47,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:52:47,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:52:51,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:52:51,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:52:51,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:11,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:53:11,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:53:14,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-02 23:53:14,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:53:15,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-02 23:53:15,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:53:17,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:17,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:53:18,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:53:18,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:53:20,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-02 23:53:20,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-02 23:53:21,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:25,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:53:31,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:53:34,996 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.946e+02 2.118e+02 2.313e+02 3.373e+02, threshold=4.235e+02, percent-clipped=0.0 2023-10-02 23:53:38,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:38,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:53:44,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:45,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-02 23:53:47,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:47,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:53:48,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:53:48,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-02 23:53:50,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-02 23:53:51,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:53:51,367 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-02 23:53:52,681 INFO [train.py:1046] (3/4) Epoch 31, batch 250, loss[loss=0.1671, simple_loss=0.2543, pruned_loss=0.03998, over 24011.00 frames. ], tot_loss[loss=0.1662, simple_loss=0.2453, pruned_loss=0.04355, over 3375619.69 frames. ], batch size: 86, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:53:52,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:54,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-02 23:53:55,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:53:55,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:53:56,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:53:57,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:54:00,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:54:03,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:54:05,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1064093.3333333333, ans=0.2 2023-10-02 23:54:06,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1064160.0, ans=0.1 2023-10-02 23:54:16,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:54:16,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1064160.0, ans=0.125 2023-10-02 23:54:17,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:54:17,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:54:24,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-02 23:54:24,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-02 23:54:25,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1064226.6666666667, ans=0.2 2023-10-02 23:54:26,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:54:26,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:54:27,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-02 23:54:27,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-02 23:54:28,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:54:32,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:54:32,618 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.69 vs. limit=10.0 2023-10-02 23:54:35,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-02 23:54:35,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:54:37,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-02 23:54:37,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-02 23:54:38,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:54:39,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:54:39,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:54:39,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-02 23:54:42,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:54:44,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-02 23:54:44,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:54:48,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-02 23:54:50,835 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.60 vs. limit=15.0 2023-10-02 23:54:51,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:54:54,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-02 23:54:57,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1064360.0, ans=0.125 2023-10-02 23:54:58,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:55:00,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:55:05,078 INFO [train.py:1046] (3/4) Epoch 31, batch 300, loss[loss=0.1575, simple_loss=0.227, pruned_loss=0.04402, over 23792.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2427, pruned_loss=0.04265, over 3675137.21 frames. ], batch size: 179, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:55:05,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-02 23:55:07,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:55:07,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-02 23:55:09,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-02 23:55:10,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-02 23:55:10,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:55:10,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-02 23:55:15,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:55:15,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1064426.6666666667, ans=0.125 2023-10-02 23:55:17,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:55:20,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-02 23:55:20,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-02 23:55:21,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:55:22,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-02 23:55:23,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-02 23:55:23,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:55:27,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-02 23:55:27,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.84 vs. limit=15.0 2023-10-02 23:55:30,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-02 23:55:31,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-02 23:55:33,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-02 23:55:34,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:55:37,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:55:39,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:55:39,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-02 23:55:39,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-02 23:55:42,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:55:42,819 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.23 vs. limit=22.5 2023-10-02 23:55:45,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:55:45,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:55:45,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1064560.0, ans=0.125 2023-10-02 23:55:48,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-02 23:55:48,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-02 23:55:49,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:55:50,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1064626.6666666667, ans=0.0 2023-10-02 23:55:52,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:55:52,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-02 23:55:53,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:55:56,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:55:56,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1064626.6666666667, ans=0.2 2023-10-02 23:55:59,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:55:59,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-02 23:56:02,024 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.918e+02 2.245e+02 2.640e+02 3.587e+02, threshold=4.490e+02, percent-clipped=0.0 2023-10-02 23:56:03,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:56:03,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-02 23:56:05,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:56:08,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:56:08,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-02 23:56:08,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-02 23:56:08,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:56:11,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-02 23:56:12,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:56:12,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:14,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:56:16,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:56:16,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:20,426 INFO [train.py:1046] (3/4) Epoch 31, batch 350, loss[loss=0.1553, simple_loss=0.2241, pruned_loss=0.04327, over 23707.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2413, pruned_loss=0.04204, over 3915910.24 frames. ], batch size: 232, lr: 3.32e-03, grad_scale: 8.0 2023-10-02 23:56:20,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:56:20,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-02 23:56:22,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1064760.0, ans=0.125 2023-10-02 23:56:24,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:30,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:56:31,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:56:31,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:34,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-02 23:56:36,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:56:37,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-02 23:56:40,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:40,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-02 23:56:41,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:56:45,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-02 23:56:45,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1064826.6666666667, ans=0.125 2023-10-02 23:56:47,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:56:47,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-02 23:56:48,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:56:50,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:56:50,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:56:50,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:56:50,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:56:51,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-02 23:56:53,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:56:53,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:56:59,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:56:59,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-02 23:57:00,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:57:01,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:05,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-02 23:57:07,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:57:07,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1064960.0, ans=0.07 2023-10-02 23:57:08,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1064960.0, ans=0.125 2023-10-02 23:57:11,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:11,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:57:11,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:57:13,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-02 23:57:16,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:17,939 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-02 23:57:18,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-02 23:57:18,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:57:22,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-02 23:57:22,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-02 23:57:24,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:25,351 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:57:27,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-02 23:57:27,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:57:29,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:29,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:57:30,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-02 23:57:33,329 INFO [train.py:1046] (3/4) Epoch 31, batch 400, loss[loss=0.1583, simple_loss=0.2295, pruned_loss=0.04353, over 22733.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2409, pruned_loss=0.0422, over 4075070.19 frames. ], batch size: 322, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:57:33,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:57:34,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-02 23:57:36,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-02 23:57:36,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:37,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:57:40,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:57:40,892 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-02 23:57:41,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:43,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:57:45,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:46,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-02 23:57:48,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-02 23:57:48,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:57:49,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-02 23:57:51,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:53,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-02 23:57:53,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:53,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-02 23:57:53,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:57:53,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-02 23:57:53,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:57:55,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:57:58,179 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-02 23:57:58,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-02 23:58:03,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:58:03,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:58:05,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-02 23:58:06,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-02 23:58:09,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-02 23:58:10,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1065226.6666666667, ans=0.5 2023-10-02 23:58:11,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:58:19,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-02 23:58:22,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-02 23:58:23,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-02 23:58:26,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:58:26,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-02 23:58:28,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-02 23:58:29,308 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.935e+02 2.133e+02 2.545e+02 3.728e+02, threshold=4.266e+02, percent-clipped=0.0 2023-10-02 23:58:29,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1065293.3333333333, ans=0.1 2023-10-02 23:58:30,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-02 23:58:33,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-02 23:58:34,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:58:34,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=1065360.0, ans=0.5 2023-10-02 23:58:37,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:58:39,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-02 23:58:40,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-02 23:58:42,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-02 23:58:42,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.83 vs. limit=15.0 2023-10-02 23:58:44,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-02 23:58:44,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:58:46,828 INFO [train.py:1046] (3/4) Epoch 31, batch 450, loss[loss=0.142, simple_loss=0.2194, pruned_loss=0.0323, over 24463.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2413, pruned_loss=0.0426, over 4222414.83 frames. ], batch size: 58, lr: 3.32e-03, grad_scale: 16.0 2023-10-02 23:58:47,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-02 23:58:48,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-02 23:58:48,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-02 23:58:50,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-02 23:58:50,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-02 23:58:50,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1065426.6666666667, ans=0.2 2023-10-02 23:58:51,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-02 23:58:53,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:58:53,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:58:53,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-02 23:58:54,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-02 23:58:55,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-02 23:58:57,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-02 23:59:05,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:59:05,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:59:07,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-02 23:59:07,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-02 23:59:10,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-02 23:59:13,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:59:15,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:59:19,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:59:19,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-02 23:59:22,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-02 23:59:22,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-02 23:59:24,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-02 23:59:25,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-02 23:59:26,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-02 23:59:26,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-02 23:59:28,226 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-02 23:59:28,244 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-02 23:59:28,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-02 23:59:29,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-02 23:59:32,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-02 23:59:32,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-02 23:59:34,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-02 23:59:34,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-02 23:59:34,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-02 23:59:36,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:59:38,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-02 23:59:40,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-02 23:59:42,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-02 23:59:43,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1065626.6666666667, ans=0.1 2023-10-02 23:59:46,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-02 23:59:46,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-02 23:59:49,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-02 23:59:50,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-02 23:59:55,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-02 23:59:56,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-02 23:59:57,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1065693.3333333333, ans=0.1 2023-10-02 23:59:58,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-02 23:59:58,275 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 00:00:00,935 INFO [train.py:1046] (3/4) Epoch 31, batch 500, loss[loss=0.1799, simple_loss=0.2677, pruned_loss=0.0461, over 23810.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.243, pruned_loss=0.04343, over 4324445.37 frames. ], batch size: 85, lr: 3.32e-03, grad_scale: 8.0 2023-10-03 00:00:01,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:00:02,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:00:02,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:00:02,407 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 00:00:03,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1065760.0, ans=0.0 2023-10-03 00:00:05,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 00:00:05,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:00:06,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:00:11,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:00:11,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:00:14,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:00:14,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:00:14,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:17,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1065826.6666666667, ans=0.2 2023-10-03 00:00:21,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1065826.6666666667, ans=0.0 2023-10-03 00:00:27,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:00:27,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:00:27,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:00:27,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:00:28,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 00:00:28,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:00:32,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:00:32,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:00:32,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:00:32,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:00:34,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 00:00:37,106 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 00:00:38,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:00:39,307 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.39 vs. limit=15.0 2023-10-03 00:00:40,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:41,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:41,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:00:41,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:00:44,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 00:00:47,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:00:49,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:00:54,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:00:57,866 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.909e+02 2.161e+02 2.446e+02 3.681e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-03 00:00:58,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:01:03,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:01:06,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 00:01:06,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:01:06,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:01:09,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 00:01:10,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:01:10,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1066026.6666666667, ans=0.125 2023-10-03 00:01:11,392 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.68 vs. limit=15.0 2023-10-03 00:01:11,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:01:13,792 INFO [train.py:1046] (3/4) Epoch 31, batch 550, loss[loss=0.1692, simple_loss=0.243, pruned_loss=0.04774, over 23545.00 frames. ], tot_loss[loss=0.1657, simple_loss=0.244, pruned_loss=0.04368, over 4412149.81 frames. ], batch size: 256, lr: 3.32e-03, grad_scale: 8.0 2023-10-03 00:01:16,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 00:01:18,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 00:01:18,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:01:19,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 00:01:20,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:01:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:01:20,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:21,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:22,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:01:24,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:01:25,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:01:25,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 00:01:25,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:01:29,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:01:30,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:31,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=11.87 vs. limit=15.0 2023-10-03 00:01:31,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:01:33,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:37,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 00:01:37,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 00:01:37,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:01:44,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:01:45,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:01:45,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:01:48,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:01:48,720 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 00:01:50,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:01:51,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:01:55,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:01:56,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:01:56,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:01:58,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:01:59,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 00:02:01,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 00:02:01,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1066293.3333333333, ans=0.0 2023-10-03 00:02:02,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:02:02,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:02:02,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:02:02,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:02:07,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:02:07,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:02:09,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:02:11,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:11,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 00:02:12,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:02:14,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:02:14,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:02:16,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:18,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:02:18,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 00:02:25,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 00:02:26,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 00:02:28,025 INFO [train.py:1046] (3/4) Epoch 31, batch 600, loss[loss=0.2158, simple_loss=0.2849, pruned_loss=0.07336, over 19521.00 frames. ], tot_loss[loss=0.1661, simple_loss=0.2444, pruned_loss=0.04388, over 4470605.87 frames. ], batch size: 388, lr: 3.32e-03, grad_scale: 8.0 2023-10-03 00:02:29,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:02:29,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:02:29,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:02:29,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1066426.6666666667, ans=0.125 2023-10-03 00:02:35,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:02:39,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:02:39,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 00:02:42,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:02:43,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:02:45,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:49,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 00:02:49,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:02:53,019 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.22 vs. limit=10.0 2023-10-03 00:02:55,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 00:02:57,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:02:57,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:02:59,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:03:01,610 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.34 vs. limit=22.5 2023-10-03 00:03:04,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:03:04,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:03:05,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:03:11,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:03:13,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1066626.6666666667, ans=0.0 2023-10-03 00:03:15,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:03:15,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:03:15,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:03:20,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 00:03:25,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1066626.6666666667, ans=0.025 2023-10-03 00:03:27,629 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.826e+02 1.991e+02 2.219e+02 3.636e+02, threshold=3.983e+02, percent-clipped=0.0 2023-10-03 00:03:27,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:03:27,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:03:30,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 00:03:32,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:03:33,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 00:03:34,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:03:34,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:03:36,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1066693.3333333333, ans=0.125 2023-10-03 00:03:36,831 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.66 vs. limit=15.0 2023-10-03 00:03:40,315 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.66 vs. limit=15.0 2023-10-03 00:03:42,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 00:03:43,533 INFO [train.py:1046] (3/4) Epoch 31, batch 650, loss[loss=0.1536, simple_loss=0.2248, pruned_loss=0.04125, over 24450.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2446, pruned_loss=0.04372, over 4528824.27 frames. ], batch size: 58, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:03:43,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 00:03:45,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:03:48,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:03:50,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:03:53,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 00:03:53,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:03:57,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:03:57,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:03:57,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1066826.6666666667, ans=0.125 2023-10-03 00:04:01,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:03,328 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:04:06,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 00:04:08,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:04:08,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:04:12,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:04:12,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 00:04:15,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:15,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:16,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:04:18,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:19,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:04:21,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:04:21,361 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 00:04:22,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:22,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:04:25,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:25,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:04:25,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:04:26,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:04:28,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 00:04:28,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:04:29,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:04:31,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:04:31,540 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.62 vs. limit=22.5 2023-10-03 00:04:32,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:04:33,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:04:34,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 00:04:35,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 00:04:35,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:35,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:04:36,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:04:36,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:04:36,907 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.10 vs. limit=15.0 2023-10-03 00:04:37,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:04:45,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:04:46,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:04:46,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:04:50,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:04:50,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:04:50,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:04:56,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1067093.3333333333, ans=0.0 2023-10-03 00:04:57,405 INFO [train.py:1046] (3/4) Epoch 31, batch 700, loss[loss=0.1455, simple_loss=0.1956, pruned_loss=0.04765, over 19019.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2424, pruned_loss=0.04326, over 4566069.81 frames. ], batch size: 388, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:04:57,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:04:57,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:04:58,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:04:58,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:05:00,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1067093.3333333333, ans=0.125 2023-10-03 00:05:01,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1067093.3333333333, ans=0.125 2023-10-03 00:05:02,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 00:05:04,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 00:05:07,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 00:05:07,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:05:08,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:05:08,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 00:05:13,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:05:15,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:05:16,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:05:19,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:05:19,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1067160.0, ans=0.04949747468305833 2023-10-03 00:05:21,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:05:22,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:05:25,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 00:05:25,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:05:26,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 00:05:29,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 00:05:33,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:05:34,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:05:35,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:05:39,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:05:40,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 00:05:46,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1067293.3333333333, ans=0.0 2023-10-03 00:05:47,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:05:47,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:05:47,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 00:05:49,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:05:50,634 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.64 vs. limit=12.0 2023-10-03 00:05:51,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:05:54,705 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.944e+02 2.173e+02 2.571e+02 3.380e+02, threshold=4.347e+02, percent-clipped=0.0 2023-10-03 00:05:54,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:06:00,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:06:00,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 00:06:03,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 00:06:03,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 00:06:05,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:06:07,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:06:07,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1067360.0, ans=0.025 2023-10-03 00:06:08,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:06:09,788 INFO [train.py:1046] (3/4) Epoch 31, batch 750, loss[loss=0.1844, simple_loss=0.2656, pruned_loss=0.05165, over 23378.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2414, pruned_loss=0.04291, over 4591824.47 frames. ], batch size: 93, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:06:11,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:06:12,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 00:06:16,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 00:06:16,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 00:06:17,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 00:06:17,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 00:06:17,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1067426.6666666667, ans=0.0 2023-10-03 00:06:19,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 00:06:19,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:06:20,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 00:06:20,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:06:22,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:06:24,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:06:25,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:06:25,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:06:26,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:06:28,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:06:29,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:06:31,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:06:31,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1067493.3333333333, ans=0.125 2023-10-03 00:06:33,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:06:33,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:06:33,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 00:06:35,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:06:36,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:06:38,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:06:38,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1067560.0, ans=0.0 2023-10-03 00:06:39,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:06:40,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 00:06:40,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:06:42,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 00:06:42,425 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 00:06:43,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 00:06:43,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:06:43,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:06:47,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:06:53,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:06:55,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:06:55,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:06:58,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:07:00,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:00,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 00:07:00,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:07:02,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 00:07:02,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:07:02,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1067626.6666666667, ans=0.2 2023-10-03 00:07:05,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:07:05,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 00:07:05,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:07:05,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1067626.6666666667, ans=0.125 2023-10-03 00:07:05,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1067626.6666666667, ans=0.0 2023-10-03 00:07:09,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1067693.3333333333, ans=0.125 2023-10-03 00:07:10,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:12,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:07:12,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:14,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:07:17,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1067693.3333333333, ans=0.05 2023-10-03 00:07:18,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 00:07:18,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:07:18,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:07:20,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:07:20,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:07:23,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:07:25,401 INFO [train.py:1046] (3/4) Epoch 31, batch 800, loss[loss=0.1911, simple_loss=0.2461, pruned_loss=0.06804, over 19508.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2417, pruned_loss=0.04336, over 4588346.88 frames. ], batch size: 388, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:07:25,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:07:31,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:07:31,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:32,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:07:32,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:07:35,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:35,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:36,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:40,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:41,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:07:44,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 00:07:45,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:46,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:07:46,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:07:47,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1067826.6666666667, ans=0.0 2023-10-03 00:07:48,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:07:48,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 00:07:48,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:48,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 00:07:52,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:07:56,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:07:57,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:07:57,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:07:59,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:07:59,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:08:03,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:08:05,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:08:05,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 00:08:06,445 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 00:08:07,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 00:08:07,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:08:07,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:08:09,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:08:09,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:08:13,913 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 00:08:15,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 00:08:15,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:08:18,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:08:22,714 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 2.007e+02 2.299e+02 2.659e+02 4.036e+02, threshold=4.599e+02, percent-clipped=0.0 2023-10-03 00:08:24,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:08:27,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:08:29,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 00:08:29,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:08:32,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 00:08:37,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:08:39,173 INFO [train.py:1046] (3/4) Epoch 31, batch 850, loss[loss=0.1667, simple_loss=0.2368, pruned_loss=0.04831, over 23546.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2423, pruned_loss=0.04342, over 4622729.15 frames. ], batch size: 134, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:08:39,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:08:40,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 00:08:40,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:08:41,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:08:43,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 00:08:43,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:08:45,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:08:46,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:08:48,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:08:48,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:08:49,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 00:08:51,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 00:08:51,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 00:08:53,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:08:53,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:08:54,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:08:54,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:08:55,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:09:00,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:09:00,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:09:01,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 00:09:03,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 00:09:09,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:09:09,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 00:09:13,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 00:09:13,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 00:09:15,113 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 00:09:16,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:09:16,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:09:16,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:09:18,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:09:19,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:09:19,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 00:09:22,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:09:23,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:09:23,917 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.87 vs. limit=22.5 2023-10-03 00:09:24,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:09:24,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:09:25,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:09:27,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:09:27,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 00:09:28,323 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.63 vs. limit=15.0 2023-10-03 00:09:32,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:09:32,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:09:33,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:09:33,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:09:33,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:09:37,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:09:39,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:09:41,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:09:41,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:09:43,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:09:50,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:09:52,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:09:52,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 00:09:52,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:09:53,679 INFO [train.py:1046] (3/4) Epoch 31, batch 900, loss[loss=0.1427, simple_loss=0.2205, pruned_loss=0.03245, over 21574.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2429, pruned_loss=0.04296, over 4658821.73 frames. ], batch size: 47, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:09:53,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:09:55,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 00:10:03,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:10:04,182 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.60 vs. limit=22.5 2023-10-03 00:10:06,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:10:07,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 00:10:10,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:10:10,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 00:10:11,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 00:10:12,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:10:12,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:10:13,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:10:13,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:10:21,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:10:21,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:10:21,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:10:26,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:10:30,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 00:10:31,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1068560.0, ans=0.2 2023-10-03 00:10:32,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:10:37,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:10:37,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:10:38,429 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 00:10:38,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 00:10:42,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:10:42,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:10:44,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:10:49,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1068626.6666666667, ans=0.1 2023-10-03 00:10:49,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1068626.6666666667, ans=0.125 2023-10-03 00:10:50,572 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.883e+02 2.040e+02 2.401e+02 4.175e+02, threshold=4.079e+02, percent-clipped=0.0 2023-10-03 00:10:50,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:10:50,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:10:53,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 00:10:53,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:10:56,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 00:10:58,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:10:58,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:11:00,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:11:00,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:06,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 00:11:06,965 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 00:11:08,235 INFO [train.py:1046] (3/4) Epoch 31, batch 950, loss[loss=0.1658, simple_loss=0.2571, pruned_loss=0.03729, over 24670.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2427, pruned_loss=0.04231, over 4674658.92 frames. ], batch size: 73, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:11:08,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 00:11:08,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 00:11:11,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:11:13,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 00:11:16,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:11:18,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:11:19,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:11:19,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:11:22,368 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 00:11:25,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:11:25,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:11:27,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:11:28,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:11:28,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 00:11:30,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 00:11:30,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:32,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 00:11:32,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:11:35,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:35,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:11:36,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:11:36,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 00:11:39,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 00:11:39,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1068893.3333333333, ans=0.1 2023-10-03 00:11:40,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:11:42,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:11:47,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:11:47,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:11:50,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 00:11:54,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 00:11:54,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:11:55,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:11:56,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:11:56,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:12:00,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 00:12:01,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:12:03,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:12:04,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:12:04,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 00:12:04,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:12:04,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:12:05,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 00:12:06,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1069026.6666666667, ans=0.1 2023-10-03 00:12:08,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:12:11,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:12:13,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1069026.6666666667, ans=0.0 2023-10-03 00:12:14,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1069026.6666666667, ans=0.05 2023-10-03 00:12:15,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:12:16,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 00:12:16,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 00:12:17,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.98 vs. limit=22.5 2023-10-03 00:12:20,956 INFO [train.py:1046] (3/4) Epoch 31, batch 1000, loss[loss=0.1558, simple_loss=0.2503, pruned_loss=0.03068, over 24312.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2413, pruned_loss=0.04249, over 4666100.60 frames. ], batch size: 74, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:12:20,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:12:24,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 00:12:24,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:12:28,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:12:30,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 00:12:30,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 00:12:36,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:12:36,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:12:37,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:12:39,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 00:12:42,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 00:12:42,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 00:12:42,912 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.20 vs. limit=15.0 2023-10-03 00:12:43,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:12:45,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 00:12:46,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 00:12:46,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 00:12:48,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:12:48,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:12:53,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1069226.6666666667, ans=0.1 2023-10-03 00:12:58,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:12:59,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:13:01,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:01,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:13:02,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 00:13:02,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:13:04,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:13:04,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:13:06,111 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 00:13:07,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 00:13:09,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 00:13:11,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 00:13:13,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:13:17,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1069293.3333333333, ans=0.125 2023-10-03 00:13:18,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:20,105 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.823e+02 1.971e+02 2.145e+02 3.229e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-03 00:13:20,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:13:20,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:21,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:13:24,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 00:13:26,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:13:26,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 00:13:28,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 00:13:28,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:13:28,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:13:31,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:13:32,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:13:35,572 INFO [train.py:1046] (3/4) Epoch 31, batch 1050, loss[loss=0.1625, simple_loss=0.2518, pruned_loss=0.03661, over 24343.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2402, pruned_loss=0.04194, over 4677366.22 frames. ], batch size: 74, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:13:35,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:13:37,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:13:38,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:13:41,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 00:13:42,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:13:44,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:13:47,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:13:48,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:13:49,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:13:51,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:13:51,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1069493.3333333333, ans=0.125 2023-10-03 00:13:52,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:13:52,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:13:53,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 00:13:54,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1069493.3333333333, ans=0.125 2023-10-03 00:13:55,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:13:55,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 00:13:57,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:13:57,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 00:13:58,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:14:04,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:14:04,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:14:04,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:14:07,503 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.38 vs. limit=22.5 2023-10-03 00:14:08,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 00:14:08,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 00:14:08,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:14:08,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1069560.0, ans=0.2 2023-10-03 00:14:10,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 00:14:12,884 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.97 vs. limit=22.5 2023-10-03 00:14:13,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 00:14:13,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:14:17,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 00:14:19,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:14:19,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1069626.6666666667, ans=0.125 2023-10-03 00:14:20,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:14:20,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:14:24,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1069626.6666666667, ans=0.1 2023-10-03 00:14:26,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:14:30,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 00:14:30,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 00:14:31,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 00:14:31,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:14:31,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:14:34,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 00:14:37,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:14:39,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1069693.3333333333, ans=0.0 2023-10-03 00:14:40,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:14:40,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:14:40,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:14:42,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:14:44,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:14:45,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 00:14:46,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:14:46,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 00:14:46,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 00:14:47,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:14:49,218 INFO [train.py:1046] (3/4) Epoch 31, batch 1100, loss[loss=0.1796, simple_loss=0.2641, pruned_loss=0.04759, over 24338.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.24, pruned_loss=0.04207, over 4681139.88 frames. ], batch size: 77, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:14:50,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:14:56,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:14:59,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:15:00,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:15:01,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:15:01,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 00:15:02,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:15:04,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:15:04,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1069826.6666666667, ans=0.125 2023-10-03 00:15:05,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1069826.6666666667, ans=0.0 2023-10-03 00:15:06,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:15:10,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:15:10,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 00:15:11,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 00:15:12,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:15:12,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:15:15,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:15:17,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:15:21,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:15:21,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1069893.3333333333, ans=0.125 2023-10-03 00:15:24,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 00:15:26,061 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 00:15:26,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:29,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:29,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:15:30,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:15:31,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1069893.3333333333, ans=0.125 2023-10-03 00:15:32,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 00:15:32,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:15:32,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:15:32,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:15:33,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:33,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 00:15:41,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:15:41,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 00:15:44,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:15:48,045 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.833e+02 1.997e+02 2.295e+02 4.959e+02, threshold=3.994e+02, percent-clipped=1.0 2023-10-03 00:15:48,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:15:52,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 00:15:52,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 00:15:53,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:15:55,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:15:55,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:15:56,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 00:15:57,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:15:57,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:15:58,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1070026.6666666667, ans=0.125 2023-10-03 00:15:59,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 00:15:59,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:15:59,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 00:16:01,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:16:01,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:16:01,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:16:02,546 INFO [train.py:1046] (3/4) Epoch 31, batch 1150, loss[loss=0.1516, simple_loss=0.2294, pruned_loss=0.03689, over 23581.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2407, pruned_loss=0.04232, over 4689447.97 frames. ], batch size: 149, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:16:06,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:16:08,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:16:10,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:16:10,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:16:10,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 00:16:10,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1070093.3333333333, ans=0.125 2023-10-03 00:16:12,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:16:14,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 00:16:16,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:16:17,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:16:19,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff2.min_abs, batch_count=1070160.0, ans=0.1 2023-10-03 00:16:23,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 00:16:25,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:16:27,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:16:29,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:16:31,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 00:16:31,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:16:31,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:16:33,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 00:16:35,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:16:36,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:16:46,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:16:51,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:16:52,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 00:16:52,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:16:54,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:17:00,287 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 00:17:03,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:17:03,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1070360.0, ans=0.0 2023-10-03 00:17:07,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1070360.0, ans=0.125 2023-10-03 00:17:10,754 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 00:17:13,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:17:14,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:17:14,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:17:16,189 INFO [train.py:1046] (3/4) Epoch 31, batch 1200, loss[loss=0.169, simple_loss=0.2421, pruned_loss=0.04794, over 23700.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2416, pruned_loss=0.04247, over 4697895.93 frames. ], batch size: 232, lr: 3.31e-03, grad_scale: 16.0 2023-10-03 00:17:16,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:17:18,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:17:23,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:17:23,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:17:26,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:17:26,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:17:26,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:17:27,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:17:30,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:17:32,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:17:32,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:17:32,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1070493.3333333333, ans=0.125 2023-10-03 00:17:34,017 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 00:17:36,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 00:17:38,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:17:39,508 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.62 vs. limit=22.5 2023-10-03 00:17:42,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:17:44,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:17:45,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:17:45,006 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 00:17:47,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:17:53,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:17:53,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:17:53,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 00:17:55,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:17:57,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 00:18:04,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 00:18:04,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:18:05,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:18:06,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1070626.6666666667, ans=0.125 2023-10-03 00:18:07,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:18:07,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:18:08,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:18:08,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:18:08,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:18:08,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 00:18:10,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:18:10,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:18:10,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:18:13,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:18:13,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:18:16,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 00:18:17,598 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.961e+02 2.154e+02 2.388e+02 3.166e+02, threshold=4.308e+02, percent-clipped=0.0 2023-10-03 00:18:17,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:18:21,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 00:18:24,028 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 00:18:24,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:18:26,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:18:28,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:18:29,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:18:31,674 INFO [train.py:1046] (3/4) Epoch 31, batch 1250, loss[loss=0.1649, simple_loss=0.2489, pruned_loss=0.04048, over 23782.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2435, pruned_loss=0.04336, over 4694189.61 frames. ], batch size: 85, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:18:31,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 00:18:32,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1070760.0, ans=0.2 2023-10-03 00:18:35,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:18:37,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:18:37,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 00:18:38,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:18:38,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:18:39,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1070760.0, ans=0.125 2023-10-03 00:18:40,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:18:42,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:18:43,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:18:43,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:18:46,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:18:49,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 00:18:49,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:18:49,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:18:49,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:18:51,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:18:52,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:18:52,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:18:57,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 00:18:57,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:19:01,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:19:03,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 00:19:03,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:19:03,295 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 00:19:03,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:04,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:08,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:19:10,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:19:10,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:19:12,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 00:19:12,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 00:19:12,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 00:19:14,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1070893.3333333333, ans=0.2 2023-10-03 00:19:15,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:19:17,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 00:19:17,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:19,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 00:19:19,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:19:22,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 00:19:22,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:19:22,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:19:24,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:19:24,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:19:27,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 00:19:28,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:19:28,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:19:30,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:19:33,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:19:35,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:19:37,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 00:19:39,223 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:19:41,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:19:42,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:19:44,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:19:46,267 INFO [train.py:1046] (3/4) Epoch 31, batch 1300, loss[loss=0.152, simple_loss=0.2404, pruned_loss=0.03183, over 24643.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2436, pruned_loss=0.04346, over 4698023.17 frames. ], batch size: 68, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:19:46,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:19:46,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1071093.3333333333, ans=0.0 2023-10-03 00:19:47,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:19:47,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 00:19:51,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:19:53,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:19:55,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 00:19:58,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:20:03,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:20:05,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:20:05,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:20:06,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:20:08,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:20:08,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 00:20:08,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 00:20:12,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1071160.0, ans=0.125 2023-10-03 00:20:12,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1071160.0, ans=0.125 2023-10-03 00:20:14,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:20:14,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1071226.6666666667, ans=0.125 2023-10-03 00:20:15,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:20:17,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 00:20:17,860 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.41 vs. limit=15.0 2023-10-03 00:20:18,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:20:20,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:20:21,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:20:22,181 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.41 vs. limit=15.0 2023-10-03 00:20:22,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 00:20:22,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:20:22,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 00:20:25,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:20:29,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:20:29,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:20:33,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 00:20:33,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1071293.3333333333, ans=0.125 2023-10-03 00:20:34,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 00:20:34,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 00:20:39,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:20:39,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1071293.3333333333, ans=0.125 2023-10-03 00:20:41,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 00:20:43,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:20:45,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1071360.0, ans=0.125 2023-10-03 00:20:46,132 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.864e+02 2.122e+02 2.414e+02 4.284e+02, threshold=4.243e+02, percent-clipped=0.0 2023-10-03 00:20:48,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1071360.0, ans=0.0 2023-10-03 00:20:51,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 00:20:54,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:20:56,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:00,452 INFO [train.py:1046] (3/4) Epoch 31, batch 1350, loss[loss=0.1734, simple_loss=0.2503, pruned_loss=0.04822, over 23986.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2421, pruned_loss=0.04329, over 4709069.15 frames. ], batch size: 80, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:21:01,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:21:01,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:21:02,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:21:03,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:21:06,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:21:06,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 00:21:09,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:21:09,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:21:11,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 00:21:11,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:21:15,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:21:15,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 00:21:16,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 00:21:18,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 00:21:19,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:19,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 00:21:32,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:34,373 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.70 vs. limit=10.0 2023-10-03 00:21:40,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:21:40,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:21:40,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 00:21:45,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:21:46,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 00:21:46,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:21:46,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:21:49,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:21:52,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 00:21:52,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1071626.6666666667, ans=0.0 2023-10-03 00:21:53,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:21:58,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 00:21:58,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1071693.3333333333, ans=0.2 2023-10-03 00:22:01,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 00:22:06,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 00:22:07,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:22:10,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:22:11,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:22:14,814 INFO [train.py:1046] (3/4) Epoch 31, batch 1400, loss[loss=0.1818, simple_loss=0.2726, pruned_loss=0.04548, over 24707.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2411, pruned_loss=0.04265, over 4709634.03 frames. ], batch size: 73, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:22:16,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 00:22:17,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 00:22:25,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:22:28,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:22:28,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1071826.6666666667, ans=0.2 2023-10-03 00:22:32,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:22:33,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:22:37,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:22:38,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 00:22:43,704 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.67 vs. limit=15.0 2023-10-03 00:22:47,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:22:48,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:22:51,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 00:22:52,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:22:53,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:22:54,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:22:54,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:22:56,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:22:56,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:22:57,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:22:58,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 00:22:58,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:23:02,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:05,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:23:11,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 00:23:12,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 00:23:13,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:23:16,199 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.848e+02 2.033e+02 2.325e+02 3.935e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-03 00:23:16,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 00:23:16,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:23:19,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:23:21,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:23:22,427 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.38 vs. limit=15.0 2023-10-03 00:23:23,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:23:23,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:23,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 00:23:29,046 INFO [train.py:1046] (3/4) Epoch 31, batch 1450, loss[loss=0.1742, simple_loss=0.2455, pruned_loss=0.05143, over 23791.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2413, pruned_loss=0.04253, over 4723464.64 frames. ], batch size: 195, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:23:29,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:23:29,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:23:30,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:23:30,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 00:23:32,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:23:33,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 00:23:34,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:36,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:36,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 00:23:37,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:23:38,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:23:38,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 00:23:38,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:41,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:23:42,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:43,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1072160.0, ans=0.0 2023-10-03 00:23:45,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:48,816 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.61 vs. limit=15.0 2023-10-03 00:23:49,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:23:49,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:23:50,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:23:50,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:53,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:23:53,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:23:53,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:23:54,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:23:55,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1072160.0, ans=0.125 2023-10-03 00:23:59,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 00:24:02,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:24:05,186 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 00:24:08,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:24:10,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:24:10,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:24:11,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 00:24:14,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:24:16,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 00:24:16,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1072293.3333333333, ans=0.125 2023-10-03 00:24:17,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 00:24:17,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1072293.3333333333, ans=0.07 2023-10-03 00:24:19,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:24:22,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:24:22,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:24:22,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1072293.3333333333, ans=0.0 2023-10-03 00:24:23,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 00:24:25,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 00:24:26,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 00:24:27,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:24:27,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:24:41,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 00:24:41,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:24:41,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:24:43,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:24:43,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:24:44,685 INFO [train.py:1046] (3/4) Epoch 31, batch 1500, loss[loss=0.1502, simple_loss=0.2314, pruned_loss=0.03452, over 24558.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2423, pruned_loss=0.04301, over 4712546.79 frames. ], batch size: 60, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:24:44,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:24:44,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 00:24:47,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:24:47,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:24:47,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:24:48,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:24:51,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:24:51,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:24:56,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:24:56,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 00:24:56,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:24:56,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:24:57,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:25:02,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 00:25:04,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1072493.3333333333, ans=0.0 2023-10-03 00:25:06,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 00:25:08,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:25:09,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 00:25:11,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 00:25:14,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:25:14,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:25:16,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:25:17,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 00:25:17,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:25:17,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:25:17,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 00:25:18,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:25:24,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:25:24,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 00:25:27,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1072626.6666666667, ans=0.0 2023-10-03 00:25:29,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:25:30,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:25:35,353 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 00:25:35,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:35,403 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 00:25:38,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:25:39,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:25:39,686 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 00:25:40,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.07 vs. limit=15.0 2023-10-03 00:25:41,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:25:44,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 00:25:46,099 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.898e+02 2.100e+02 2.437e+02 3.214e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-03 00:25:46,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:49,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:25:49,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:49,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:25:50,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:25:50,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:25:53,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 00:25:55,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 00:25:55,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:25:55,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 00:25:56,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 00:25:59,113 INFO [train.py:1046] (3/4) Epoch 31, batch 1550, loss[loss=0.1401, simple_loss=0.2215, pruned_loss=0.02935, over 24353.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2428, pruned_loss=0.04277, over 4725257.13 frames. ], batch size: 56, lr: 3.31e-03, grad_scale: 8.0 2023-10-03 00:26:00,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:26:00,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:26:01,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:26:01,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:26:03,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:26:03,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:26:08,031 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 00:26:08,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:08,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:26:09,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:26:10,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:26:12,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 00:26:14,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:26:14,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 00:26:15,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 00:26:15,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 00:26:16,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:18,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:26:22,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:26:24,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 00:26:24,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 00:26:25,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1072826.6666666667, ans=0.0 2023-10-03 00:26:33,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:26:37,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:26:37,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:26:37,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:26:37,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 00:26:40,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1072893.3333333333, ans=0.125 2023-10-03 00:26:43,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:26:44,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:46,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:26:51,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:26:52,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:26:52,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 00:26:52,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:26:53,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:26:54,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:26:54,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 00:26:54,108 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 00:26:57,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:27:01,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 00:27:03,350 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.33 vs. limit=15.0 2023-10-03 00:27:05,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:27:07,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:27:07,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 00:27:09,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:27:11,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:27:11,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:27:11,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:27:11,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:27:13,103 INFO [train.py:1046] (3/4) Epoch 31, batch 1600, loss[loss=0.1521, simple_loss=0.2307, pruned_loss=0.03678, over 21519.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2436, pruned_loss=0.04354, over 4701708.23 frames. ], batch size: 47, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:27:14,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:27:16,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 00:27:16,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 00:27:19,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 00:27:22,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:27:23,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 00:27:23,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:27:26,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:27:30,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:27:32,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1073160.0, ans=0.125 2023-10-03 00:27:35,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 00:27:35,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1073160.0, ans=0.0 2023-10-03 00:27:37,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:27:37,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 00:27:37,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:27:38,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 00:27:40,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1073226.6666666667, ans=0.0 2023-10-03 00:27:45,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 00:27:51,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:27:53,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 00:27:53,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:27:53,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:27:53,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:27:53,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1073226.6666666667, ans=0.0 2023-10-03 00:27:56,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 00:28:00,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 00:28:03,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:28:04,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:04,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:04,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:28:06,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:28:07,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:28:09,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:28:13,239 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.827e+02 1.985e+02 2.199e+02 3.882e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-03 00:28:14,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:14,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1073360.0, ans=0.09899494936611666 2023-10-03 00:28:16,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:28:18,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 00:28:18,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:28:18,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 00:28:25,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:28:27,290 INFO [train.py:1046] (3/4) Epoch 31, batch 1650, loss[loss=0.1454, simple_loss=0.2218, pruned_loss=0.03454, over 23435.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2432, pruned_loss=0.04366, over 4711578.14 frames. ], batch size: 119, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:28:27,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:28:27,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:28:27,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 00:28:27,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 00:28:27,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 00:28:28,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 00:28:29,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1073426.6666666667, ans=0.125 2023-10-03 00:28:30,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1073426.6666666667, ans=0.1 2023-10-03 00:28:33,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:28:33,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1073426.6666666667, ans=0.0 2023-10-03 00:28:34,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:28:34,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:28:35,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1073426.6666666667, ans=0.5 2023-10-03 00:28:36,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:28:38,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:28:40,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 00:28:41,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:28:43,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:28:43,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:28:43,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:28:44,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 00:28:44,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 00:28:44,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1073493.3333333333, ans=0.2 2023-10-03 00:28:49,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:28:51,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:28:54,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1073493.3333333333, ans=0.125 2023-10-03 00:29:00,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 00:29:00,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:02,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 00:29:04,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:29:06,811 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.89 vs. limit=10.0 2023-10-03 00:29:07,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:29:07,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:29:07,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:29:10,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:29:10,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:12,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:29:14,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:14,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:29:14,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:29:16,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:29:16,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:29:21,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:29:21,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 00:29:23,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:29:23,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 00:29:23,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 00:29:25,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 00:29:25,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:29:26,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:29:26,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:29:26,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:29:26,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 00:29:31,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:29:32,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:29:34,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:29:35,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 00:29:40,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:29:40,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:29:41,375 INFO [train.py:1046] (3/4) Epoch 31, batch 1700, loss[loss=0.1668, simple_loss=0.2564, pruned_loss=0.03861, over 24306.00 frames. ], tot_loss[loss=0.1647, simple_loss=0.2424, pruned_loss=0.04346, over 4708381.13 frames. ], batch size: 74, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:29:41,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 00:29:41,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:29:41,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:29:41,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:29:42,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:29:43,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:29:43,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1073760.0, ans=0.04949747468305833 2023-10-03 00:29:44,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 00:29:46,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:29:50,031 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.44 vs. limit=15.0 2023-10-03 00:29:54,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:29:58,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:30:02,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:30:02,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:30:03,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:30:05,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:30:09,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 00:30:10,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:30:10,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:12,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:30:14,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:30:15,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 00:30:16,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 00:30:18,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:18,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 00:30:20,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:30:29,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:30:29,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:30:31,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:30:32,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 00:30:33,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 00:30:33,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:30:35,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:35,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 00:30:36,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:30:36,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:30:36,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:30:36,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:30:39,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:30:39,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:30:39,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:30:41,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:30:41,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:30:42,286 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.835e+02 2.052e+02 2.305e+02 3.998e+02, threshold=4.103e+02, percent-clipped=1.0 2023-10-03 00:30:45,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:30:47,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 00:30:50,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:30:51,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:30:54,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 00:30:55,843 INFO [train.py:1046] (3/4) Epoch 31, batch 1750, loss[loss=0.149, simple_loss=0.2351, pruned_loss=0.03139, over 24474.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.241, pruned_loss=0.04271, over 4719160.51 frames. ], batch size: 66, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:30:57,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:01,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:31:01,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:31:02,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 00:31:02,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:31:04,520 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.50 vs. limit=15.0 2023-10-03 00:31:05,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:31:05,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:09,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 00:31:11,577 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.04 vs. limit=10.0 2023-10-03 00:31:12,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:31:13,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 00:31:13,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:31:14,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1074160.0, ans=0.0 2023-10-03 00:31:15,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:31:19,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 00:31:19,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1074160.0, ans=0.0 2023-10-03 00:31:20,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 00:31:21,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:31:23,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 00:31:25,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1074226.6666666667, ans=0.125 2023-10-03 00:31:31,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:31:31,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1074226.6666666667, ans=0.125 2023-10-03 00:31:34,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1074226.6666666667, ans=0.125 2023-10-03 00:31:35,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:31:35,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:31:38,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:38,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:31:41,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:31:42,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:31:45,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:31:45,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:31:46,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 00:31:48,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:31:50,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 00:31:50,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:31:53,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:31:53,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:31:55,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1074360.0, ans=0.05 2023-10-03 00:31:57,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:31:59,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 00:32:00,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:32:00,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:32:06,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:32:07,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:32:09,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:32:10,694 INFO [train.py:1046] (3/4) Epoch 31, batch 1800, loss[loss=0.168, simple_loss=0.2526, pruned_loss=0.04167, over 23689.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2405, pruned_loss=0.04305, over 4712405.61 frames. ], batch size: 85, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:32:10,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 00:32:10,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:32:12,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:32:12,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:12,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:32:12,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:32:12,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:32:16,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:32:17,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:32:18,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 00:32:23,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:32:25,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1074493.3333333333, ans=0.1 2023-10-03 00:32:26,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:32:26,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:32:29,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:32:31,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:32,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:34,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:32:35,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1074493.3333333333, ans=0.125 2023-10-03 00:32:37,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:32:37,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 00:32:37,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:32:39,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:32:40,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1074560.0, ans=0.125 2023-10-03 00:32:41,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1074560.0, ans=0.05 2023-10-03 00:32:41,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1074560.0, ans=0.125 2023-10-03 00:32:43,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 00:32:45,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 00:32:45,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 00:32:46,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:32:46,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:32:46,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:32:49,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:32:50,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.21 vs. limit=15.0 2023-10-03 00:32:52,980 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 00:32:54,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:32:56,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:32:59,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 00:32:59,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 00:33:00,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:33:00,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:33:01,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:33:04,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1074626.6666666667, ans=0.05 2023-10-03 00:33:06,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 00:33:10,966 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.922e+02 2.138e+02 2.507e+02 4.896e+02, threshold=4.277e+02, percent-clipped=2.0 2023-10-03 00:33:12,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:33:13,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 00:33:13,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:33:13,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:33:13,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:33:13,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 00:33:18,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:33:18,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:33:19,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 00:33:19,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:33:22,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:33:22,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:33:24,466 INFO [train.py:1046] (3/4) Epoch 31, batch 1850, loss[loss=0.1654, simple_loss=0.2394, pruned_loss=0.04568, over 23903.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2406, pruned_loss=0.04308, over 4714689.23 frames. ], batch size: 179, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:33:24,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:33:24,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:33:25,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:33:27,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:33:27,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:33:30,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:33:30,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:33:32,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1074760.0, ans=0.0 2023-10-03 00:33:34,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1074760.0, ans=0.125 2023-10-03 00:33:35,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1074760.0, ans=0.0 2023-10-03 00:33:36,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:33:36,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 00:33:39,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 00:33:42,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 00:33:46,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:33:46,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 00:33:46,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 00:33:58,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:33:59,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 00:34:02,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:34:04,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:34:07,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 00:34:07,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:07,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:34:09,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:34:11,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:34:14,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:34:14,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1074960.0, ans=0.125 2023-10-03 00:34:15,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:34:17,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:17,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 00:34:17,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:34:18,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:34:21,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:34:22,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 00:34:22,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:34:28,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:34:29,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:34:29,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 00:34:29,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 00:34:31,459 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 00:34:32,584 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 00:34:33,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:34:33,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:34:33,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:34:33,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:35,391 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 00:34:35,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:34:35,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:37,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:34:37,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:34:38,579 INFO [train.py:1046] (3/4) Epoch 31, batch 1900, loss[loss=0.1795, simple_loss=0.2651, pruned_loss=0.04693, over 24385.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.242, pruned_loss=0.04329, over 4730805.35 frames. ], batch size: 77, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:34:38,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:34:38,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 00:34:41,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:34:41,457 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 00:34:41,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:34:42,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:34:48,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:34:49,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:34:49,729 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 00:34:51,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 00:34:53,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:34:53,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:34:54,570 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 00:34:54,603 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 00:34:57,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 00:34:59,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:35:02,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1075160.0, ans=0.0 2023-10-03 00:35:03,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 00:35:05,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 00:35:14,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 00:35:17,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 00:35:17,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:35:18,294 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 00:35:18,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 00:35:18,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 00:35:19,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 00:35:19,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:35:21,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1075293.3333333333, ans=0.0 2023-10-03 00:35:21,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1075293.3333333333, ans=0.125 2023-10-03 00:35:24,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 00:35:27,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:35:30,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:35:30,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 00:35:32,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:35:32,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1075293.3333333333, ans=0.2 2023-10-03 00:35:33,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1075293.3333333333, ans=0.015 2023-10-03 00:35:34,628 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.22 vs. limit=15.0 2023-10-03 00:35:37,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 00:35:39,194 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 2.002e+02 2.260e+02 2.885e+02 4.012e+02, threshold=4.521e+02, percent-clipped=0.0 2023-10-03 00:35:39,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:35:42,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:35:42,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:35:42,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:35:44,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:35:45,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:35:45,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:35:45,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:35:48,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:35:48,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:35:51,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:35:51,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:35:52,438 INFO [train.py:1046] (3/4) Epoch 31, batch 1950, loss[loss=0.1581, simple_loss=0.2465, pruned_loss=0.0349, over 24677.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2423, pruned_loss=0.04326, over 4731041.37 frames. ], batch size: 73, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:35:52,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:35:52,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:35:55,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:35:58,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:35:58,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:35:58,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:36:01,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 00:36:01,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:36:01,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:02,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1075426.6666666667, ans=0.125 2023-10-03 00:36:03,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:05,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:36:06,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:06,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:08,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:36:10,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:36:10,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:36:10,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:36:10,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:12,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1075493.3333333333, ans=0.125 2023-10-03 00:36:13,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:17,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:36:17,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:17,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 00:36:17,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 00:36:19,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:36:19,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:36:20,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:25,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:36:28,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:36:31,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:36:35,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:36:35,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:36:37,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 00:36:37,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:36:42,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:36:43,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:36:43,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:36:44,719 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.04 vs. limit=15.0 2023-10-03 00:36:52,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:52,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:54,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:36:56,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:56,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1075693.3333333333, ans=0.015 2023-10-03 00:36:59,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:36:59,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:36:59,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 00:36:59,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:36:59,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:37:01,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 00:37:02,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:37:06,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:37:07,733 INFO [train.py:1046] (3/4) Epoch 31, batch 2000, loss[loss=0.1667, simple_loss=0.2336, pruned_loss=0.04989, over 23836.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2432, pruned_loss=0.04371, over 4724973.02 frames. ], batch size: 164, lr: 3.30e-03, grad_scale: 32.0 2023-10-03 00:37:07,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:37:07,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:37:10,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:37:12,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:15,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 00:37:16,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:37:16,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1075760.0, ans=0.1 2023-10-03 00:37:20,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:37:21,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 00:37:23,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:37:23,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:37:24,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:37:26,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 00:37:28,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:29,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:29,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:29,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 00:37:31,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:37:31,958 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.20 vs. limit=15.0 2023-10-03 00:37:32,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 00:37:32,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:37:35,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:37:37,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 00:37:37,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:37,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:37:39,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:37:40,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 00:37:43,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 00:37:43,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:37:43,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:37:48,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:49,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:37:49,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:37:50,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:37:51,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:37:53,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:53,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:37:54,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:37:54,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:37:57,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:37:59,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 00:37:59,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1075960.0, ans=0.125 2023-10-03 00:38:02,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1075960.0, ans=0.125 2023-10-03 00:38:03,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:38:04,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:08,528 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.852e+02 2.094e+02 2.367e+02 3.575e+02, threshold=4.187e+02, percent-clipped=0.0 2023-10-03 00:38:08,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:08,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:38:11,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:13,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:38:13,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:14,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:38:14,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:38:17,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:17,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:21,446 INFO [train.py:1046] (3/4) Epoch 31, batch 2050, loss[loss=0.1499, simple_loss=0.2283, pruned_loss=0.03581, over 24652.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2415, pruned_loss=0.04298, over 4714268.47 frames. ], batch size: 65, lr: 3.30e-03, grad_scale: 32.0 2023-10-03 00:38:21,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:38:22,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:29,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:38:30,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:38:31,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:38:33,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:38:34,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 00:38:34,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:38:36,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:38:37,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:38:37,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1076160.0, ans=0.0 2023-10-03 00:38:46,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:38:46,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:48,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 00:38:50,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:38:51,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 00:38:51,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:38:52,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1076226.6666666667, ans=0.125 2023-10-03 00:38:54,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:38:57,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:38:59,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:38:59,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:39:00,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:39:00,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1076226.6666666667, ans=0.125 2023-10-03 00:39:02,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:39:02,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:39:03,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:39:05,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:39:06,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:39:09,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:39:13,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:39:15,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:39:16,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 00:39:22,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:39:22,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:39:25,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:39:27,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 00:39:29,802 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 00:39:29,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:39:31,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:39:31,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:39:32,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:39:33,764 INFO [train.py:1046] (3/4) Epoch 31, batch 2100, loss[loss=0.1727, simple_loss=0.2486, pruned_loss=0.04836, over 23259.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2407, pruned_loss=0.04275, over 4714749.83 frames. ], batch size: 105, lr: 3.30e-03, grad_scale: 32.0 2023-10-03 00:39:33,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 00:39:33,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 00:39:35,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:39:39,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:39:39,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1076426.6666666667, ans=0.1 2023-10-03 00:39:40,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:39:42,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:39:43,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:39:43,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 00:39:43,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:39:43,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 00:39:43,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 00:39:44,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1076426.6666666667, ans=0.125 2023-10-03 00:39:46,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:39:46,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:39:46,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 00:39:46,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 00:39:51,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 00:39:51,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:39:54,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:39:56,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:39:59,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:39:59,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 00:39:59,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1076493.3333333333, ans=0.125 2023-10-03 00:40:00,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:00,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 00:40:02,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 00:40:02,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:02,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 00:40:03,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 00:40:05,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 00:40:07,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:40:09,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:40:12,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:40:13,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 00:40:14,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:15,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:15,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 00:40:16,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:16,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:16,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:16,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 00:40:18,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 00:40:19,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 00:40:25,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:40:28,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:40:28,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 00:40:35,497 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.012e+02 2.254e+02 2.840e+02 4.737e+02, threshold=4.507e+02, percent-clipped=3.0 2023-10-03 00:40:35,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:37,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:40:37,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:40:37,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:40:38,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 00:40:39,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:40:41,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:40:41,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:40:41,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:40:41,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:45,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 00:40:46,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 00:40:46,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:40:47,879 INFO [train.py:1046] (3/4) Epoch 31, batch 2150, loss[loss=0.1742, simple_loss=0.2228, pruned_loss=0.06277, over 19287.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2392, pruned_loss=0.04225, over 4704223.02 frames. ], batch size: 388, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:40:49,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:40:49,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:40:49,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:40:50,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:40:54,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 00:40:56,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:40:57,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:40:58,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:40:58,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:40:58,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:41:02,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:41:03,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:41:03,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:41:05,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1076826.6666666667, ans=0.125 2023-10-03 00:41:06,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:06,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 00:41:11,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:41:11,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:41:12,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:12,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:41:12,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:13,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:41:13,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1076826.6666666667, ans=0.125 2023-10-03 00:41:14,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:41:14,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:41:15,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1076826.6666666667, ans=0.125 2023-10-03 00:41:16,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:41:19,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 00:41:20,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:41:21,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:41:21,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:41:23,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:41:23,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:41:23,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1076893.3333333333, ans=0.1 2023-10-03 00:41:26,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:41:26,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:41:26,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1076893.3333333333, ans=0.125 2023-10-03 00:41:28,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:41:28,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 00:41:28,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 00:41:32,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:41:32,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:33,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:41:34,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:41:35,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1076960.0, ans=0.125 2023-10-03 00:41:36,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:37,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:37,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 00:41:39,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1076960.0, ans=0.125 2023-10-03 00:41:40,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 00:41:40,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:41:40,426 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 00:41:40,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:40,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:41:42,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 00:41:42,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:41:42,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 00:41:42,469 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 00:41:42,469 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 00:41:42,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 00:41:44,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:45,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:41:45,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:41:45,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:47,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 00:41:49,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:41:49,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:41:54,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:41:55,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 00:41:58,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.46 vs. limit=22.5 2023-10-03 00:42:00,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:42:01,961 INFO [train.py:1046] (3/4) Epoch 31, batch 2200, loss[loss=0.1653, simple_loss=0.2382, pruned_loss=0.04625, over 23669.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2401, pruned_loss=0.04201, over 4721736.52 frames. ], batch size: 232, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:42:03,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1077093.3333333333, ans=0.0 2023-10-03 00:42:06,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:42:06,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:42:06,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:07,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 00:42:09,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1077093.3333333333, ans=0.05 2023-10-03 00:42:10,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:42:10,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:42:10,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 00:42:10,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1077093.3333333333, ans=0.1 2023-10-03 00:42:16,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 00:42:17,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1077160.0, ans=0.125 2023-10-03 00:42:18,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:42:18,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1077160.0, ans=0.1 2023-10-03 00:42:24,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 00:42:27,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:42:27,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:42:28,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:42:31,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:42:32,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 00:42:35,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:42:37,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:42:37,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 00:42:40,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:42:41,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:42:42,013 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:42:45,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:42:47,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:49,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 00:42:49,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:42:50,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 00:42:52,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:52,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 00:42:52,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:42:55,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:42:55,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:42:55,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:42:55,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:42:56,500 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.86 vs. limit=15.0 2023-10-03 00:42:58,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:42:58,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:42:59,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:43:03,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 00:43:04,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:43:05,899 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.847e+02 1.989e+02 2.183e+02 3.187e+02, threshold=3.977e+02, percent-clipped=0.0 2023-10-03 00:43:08,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:43:08,614 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 00:43:11,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:43:11,907 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 00:43:13,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:43:14,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.80 vs. limit=15.0 2023-10-03 00:43:14,632 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 00:43:15,960 INFO [train.py:1046] (3/4) Epoch 31, batch 2250, loss[loss=0.1581, simple_loss=0.2415, pruned_loss=0.03733, over 24303.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2406, pruned_loss=0.04192, over 4720483.57 frames. ], batch size: 61, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:43:16,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:43:17,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:43:19,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:43:20,694 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 00:43:22,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:43:23,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1077426.6666666667, ans=0.125 2023-10-03 00:43:24,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:43:26,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1077426.6666666667, ans=0.5 2023-10-03 00:43:28,986 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.19 vs. limit=6.0 2023-10-03 00:43:30,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:43:31,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:43:35,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:43:35,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:43:36,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:43:39,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 00:43:39,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:43:40,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:43:42,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 00:43:42,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:43:42,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:43:43,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 00:43:48,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:43:48,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 00:43:50,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:43:52,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 00:43:52,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1077560.0, ans=0.1 2023-10-03 00:43:53,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:43:54,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:43:56,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1077560.0, ans=0.125 2023-10-03 00:43:56,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1077560.0, ans=0.125 2023-10-03 00:43:59,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:44:01,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:44:02,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:44:02,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:44:03,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:44:05,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:44:09,073 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.38 vs. limit=15.0 2023-10-03 00:44:09,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:44:13,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 00:44:14,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1077693.3333333333, ans=0.125 2023-10-03 00:44:17,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:44:17,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:44:17,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:44:20,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1077693.3333333333, ans=0.1 2023-10-03 00:44:23,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 00:44:24,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1077693.3333333333, ans=0.125 2023-10-03 00:44:26,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:44:26,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 00:44:26,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:44:26,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:44:29,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 00:44:30,743 INFO [train.py:1046] (3/4) Epoch 31, batch 2300, loss[loss=0.1611, simple_loss=0.2548, pruned_loss=0.03366, over 24332.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2409, pruned_loss=0.04209, over 4710867.09 frames. ], batch size: 74, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:44:32,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:44:32,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:44:39,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:44:41,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:44:42,482 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 00:44:43,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:44:45,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1077826.6666666667, ans=0.07 2023-10-03 00:44:51,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:44:51,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:44:51,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:44:51,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:44:51,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 00:44:52,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:44:55,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:44:57,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:45:00,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:45:03,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:45:04,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1077893.3333333333, ans=0.0 2023-10-03 00:45:06,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:45:09,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:45:10,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:45:12,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:45:14,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:45:19,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:45:20,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:45:21,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:45:21,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 00:45:25,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 00:45:26,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:45:26,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:45:26,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:45:26,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:45:28,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 00:45:28,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 00:45:30,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 00:45:30,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:45:30,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:45:30,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 00:45:34,149 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.897e+02 2.094e+02 2.362e+02 4.130e+02, threshold=4.187e+02, percent-clipped=1.0 2023-10-03 00:45:35,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:45:38,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:45:41,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:45:43,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:45:43,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:45:44,638 INFO [train.py:1046] (3/4) Epoch 31, batch 2350, loss[loss=0.2254, simple_loss=0.2897, pruned_loss=0.08054, over 19284.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2416, pruned_loss=0.04269, over 4701787.27 frames. ], batch size: 388, lr: 3.30e-03, grad_scale: 8.0 2023-10-03 00:45:44,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:45:44,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:45:44,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:45:46,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 00:45:47,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1078093.3333333333, ans=0.125 2023-10-03 00:45:53,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:45:53,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 00:45:56,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1078093.3333333333, ans=0.1 2023-10-03 00:45:58,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 00:46:02,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:46:02,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1078160.0, ans=0.125 2023-10-03 00:46:05,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:46:05,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:46:05,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:46:05,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:46:06,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 00:46:09,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:46:13,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 00:46:15,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:46:20,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:46:20,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:46:21,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:46:24,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 00:46:25,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:46:26,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:46:26,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:46:26,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:46:31,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:46:32,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 00:46:32,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:46:32,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1078293.3333333333, ans=0.0 2023-10-03 00:46:35,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:46:35,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:46:37,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 00:46:38,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:46:39,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 00:46:39,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:46:39,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1078293.3333333333, ans=0.0 2023-10-03 00:46:44,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 00:46:47,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 00:46:47,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:46:47,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 00:46:47,440 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 00:46:48,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 00:46:48,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1078360.0, ans=0.125 2023-10-03 00:46:52,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 00:46:56,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:46:59,197 INFO [train.py:1046] (3/4) Epoch 31, batch 2400, loss[loss=0.1447, simple_loss=0.2253, pruned_loss=0.03201, over 24328.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2412, pruned_loss=0.04271, over 4706164.07 frames. ], batch size: 61, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:47:01,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:47:04,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:47:05,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:47:06,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 00:47:06,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 00:47:09,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1078426.6666666667, ans=0.0 2023-10-03 00:47:13,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:47:14,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:47:16,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 00:47:16,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:47:18,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:47:18,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1078493.3333333333, ans=0.125 2023-10-03 00:47:19,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 00:47:22,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:47:27,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 00:47:30,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:47:35,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 00:47:37,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:47:39,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:47:42,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:47:43,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 00:47:43,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 00:47:46,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1078626.6666666667, ans=0.0 2023-10-03 00:47:50,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:47:53,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:47:56,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:47:56,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:47:56,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 00:47:57,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:47:57,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:47:57,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:47:57,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 00:48:02,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:48:03,809 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.898e+02 2.077e+02 2.404e+02 3.282e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-03 00:48:03,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 00:48:03,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 00:48:04,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 00:48:04,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1078693.3333333333, ans=0.125 2023-10-03 00:48:06,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:48:06,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:48:06,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1078693.3333333333, ans=0.1 2023-10-03 00:48:08,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 00:48:08,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 00:48:09,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 00:48:09,417 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 00:48:10,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 00:48:10,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:48:12,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:48:12,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:48:13,464 INFO [train.py:1046] (3/4) Epoch 31, batch 2450, loss[loss=0.171, simple_loss=0.245, pruned_loss=0.04853, over 23717.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2407, pruned_loss=0.04239, over 4702837.03 frames. ], batch size: 164, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:48:13,593 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 00:48:14,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:48:15,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:48:18,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:48:18,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:48:21,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1078760.0, ans=0.125 2023-10-03 00:48:22,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:22,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:48:24,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 00:48:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:48:27,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:31,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:48:31,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:48:31,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:48:31,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 00:48:34,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:37,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:48:37,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:48:40,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 00:48:40,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:48:42,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:48:42,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:48:45,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 00:48:46,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1078893.3333333333, ans=0.0 2023-10-03 00:48:48,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:48:50,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1078893.3333333333, ans=0.0 2023-10-03 00:48:50,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1078893.3333333333, ans=0.1 2023-10-03 00:48:54,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1078893.3333333333, ans=0.125 2023-10-03 00:48:56,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:48:58,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:48:58,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:48:58,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:48:59,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:48:59,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:49:01,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 00:49:02,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 00:49:04,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:49:04,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1078960.0, ans=0.125 2023-10-03 00:49:08,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:49:08,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:49:13,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:49:13,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 00:49:15,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:49:15,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:49:15,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 00:49:15,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:49:17,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:49:21,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:49:23,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:49:24,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:49:27,730 INFO [train.py:1046] (3/4) Epoch 31, batch 2500, loss[loss=0.1581, simple_loss=0.2422, pruned_loss=0.03702, over 24655.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2397, pruned_loss=0.04178, over 4710464.07 frames. ], batch size: 65, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:49:28,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 00:49:29,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 00:49:32,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:49:41,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:49:41,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:49:43,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:49:43,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 00:49:49,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:49:50,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:49:50,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:49:50,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 00:49:50,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1079160.0, ans=0.125 2023-10-03 00:49:51,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 00:49:54,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:49:55,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:49:56,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 00:49:56,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:49:57,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 00:49:57,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:01,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:50:02,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:50:05,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:50:05,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 00:50:07,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:50:07,975 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.66 vs. limit=15.0 2023-10-03 00:50:08,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:50:12,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:12,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1079293.3333333333, ans=0.0 2023-10-03 00:50:15,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:20,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:50:24,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 00:50:27,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 00:50:29,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:50:29,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:50:31,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 00:50:31,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 00:50:31,141 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 00:50:31,142 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 00:50:31,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 00:50:32,337 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.760e+02 1.910e+02 2.096e+02 3.347e+02, threshold=3.821e+02, percent-clipped=0.0 2023-10-03 00:50:35,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:50:36,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 00:50:36,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 00:50:36,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:50:38,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 00:50:42,255 INFO [train.py:1046] (3/4) Epoch 31, batch 2550, loss[loss=0.1725, simple_loss=0.2518, pruned_loss=0.04659, over 23830.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2404, pruned_loss=0.04177, over 4714947.70 frames. ], batch size: 195, lr: 3.30e-03, grad_scale: 16.0 2023-10-03 00:50:42,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 00:50:42,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1079426.6666666667, ans=0.125 2023-10-03 00:50:45,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:50:45,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1079426.6666666667, ans=0.0 2023-10-03 00:50:46,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:50:46,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:50:48,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:50:48,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 00:50:49,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:50:51,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1079426.6666666667, ans=0.125 2023-10-03 00:50:52,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 00:50:54,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:50:55,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:50:58,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:50:58,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 00:50:58,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:50:58,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:50:58,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1079493.3333333333, ans=0.1 2023-10-03 00:51:00,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:51:03,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:51:03,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 00:51:03,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1079493.3333333333, ans=0.1 2023-10-03 00:51:04,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 00:51:04,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:04,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 00:51:04,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1079493.3333333333, ans=0.1 2023-10-03 00:51:14,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 00:51:17,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1079560.0, ans=0.125 2023-10-03 00:51:19,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:51:19,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:20,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:51:21,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 00:51:25,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:51:29,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 00:51:30,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:51:30,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:51:31,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 00:51:31,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:51:34,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:51:35,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:39,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:51:39,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 00:51:39,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:51:39,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:51:40,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 00:51:41,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:51:42,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:51:47,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.35 vs. limit=22.5 2023-10-03 00:51:50,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:51:51,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:51:54,494 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 00:51:57,153 INFO [train.py:1046] (3/4) Epoch 31, batch 2600, loss[loss=0.1684, simple_loss=0.2375, pruned_loss=0.04964, over 22839.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2418, pruned_loss=0.04234, over 4712465.72 frames. ], batch size: 322, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:51:57,273 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 00:51:57,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:51:58,730 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 00:52:00,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 00:52:00,797 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 00:52:02,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:52:02,279 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 00:52:05,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 00:52:06,433 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 00:52:07,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:52:09,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 00:52:09,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 00:52:10,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 00:52:12,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 00:52:13,861 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 00:52:13,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 00:52:17,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1079826.6666666667, ans=0.125 2023-10-03 00:52:21,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:52:21,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:52:21,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:52:21,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 00:52:23,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 00:52:28,745 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 00:52:34,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:52:34,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:52:36,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 00:52:37,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:52:37,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:52:37,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 00:52:41,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:52:41,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:52:43,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:52:43,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1079960.0, ans=0.07 2023-10-03 00:52:46,564 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 00:52:46,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:52:46,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 00:52:48,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=1079960.0, ans=0.025 2023-10-03 00:52:52,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:52:53,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 00:52:53,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1079960.0, ans=0.125 2023-10-03 00:52:54,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 00:52:54,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:52:56,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:52:56,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:53:02,170 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.914e+02 2.104e+02 2.405e+02 3.485e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-03 00:53:02,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 00:53:03,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:05,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:53:08,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 00:53:08,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:08,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 00:53:09,629 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 00:53:09,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:53:12,213 INFO [train.py:1046] (3/4) Epoch 31, batch 2650, loss[loss=0.2067, simple_loss=0.2773, pruned_loss=0.06803, over 19366.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.243, pruned_loss=0.04291, over 4709511.93 frames. ], batch size: 388, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:53:12,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:15,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:53:17,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:53:19,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:53:20,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 00:53:20,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:53:20,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:53:20,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1080093.3333333333, ans=0.05 2023-10-03 00:53:25,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 00:53:25,285 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 00:53:27,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:53:28,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1080160.0, ans=0.125 2023-10-03 00:53:30,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 00:53:31,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:53:31,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 00:53:33,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1080160.0, ans=0.0 2023-10-03 00:53:34,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:53:34,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 00:53:34,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:53:34,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:53:39,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 00:53:39,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 00:53:41,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:53:44,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 00:53:44,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:53:44,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1080226.6666666667, ans=0.0 2023-10-03 00:53:46,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:53:46,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:53:46,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:53:46,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:53:46,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=1080226.6666666667, ans=22.5 2023-10-03 00:53:47,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:53:48,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.37 vs. limit=15.0 2023-10-03 00:53:51,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:53:52,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:53:53,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:53:54,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:53:55,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1080226.6666666667, ans=0.0 2023-10-03 00:53:56,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:53:57,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:53:59,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:54:01,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:54:01,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 00:54:04,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:05,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:54:05,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:54:06,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 00:54:12,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:54:14,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:16,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:16,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:16,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 00:54:17,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:19,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:54:19,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 00:54:22,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:54:24,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 00:54:25,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:54:25,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:27,002 INFO [train.py:1046] (3/4) Epoch 31, batch 2700, loss[loss=0.1372, simple_loss=0.2158, pruned_loss=0.02929, over 24344.00 frames. ], tot_loss[loss=0.1649, simple_loss=0.2432, pruned_loss=0.04325, over 4715765.91 frames. ], batch size: 56, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:54:27,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:54:28,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:54:28,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:54:28,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 00:54:29,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 00:54:29,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 00:54:29,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:54:32,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:54:34,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:54:34,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:54:37,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 00:54:38,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 00:54:39,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:54:41,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1080493.3333333333, ans=0.0 2023-10-03 00:54:41,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1080493.3333333333, ans=0.0 2023-10-03 00:54:41,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1080493.3333333333, ans=0.125 2023-10-03 00:54:44,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 00:54:45,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:54:49,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:54:49,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:54:49,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:54:49,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:54:53,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:54:54,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:54:56,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:54:56,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:55:00,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:55:00,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 00:55:08,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:55:08,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:55:10,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 00:55:10,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:13,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:55:13,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:55:15,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:55:15,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1080626.6666666667, ans=0.125 2023-10-03 00:55:16,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:19,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:55:21,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:55:24,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 00:55:25,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:55:25,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:55:28,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 00:55:30,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:31,497 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.858e+02 1.974e+02 2.152e+02 3.142e+02, threshold=3.947e+02, percent-clipped=0.0 2023-10-03 00:55:31,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:55:31,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 00:55:32,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 00:55:33,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:37,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:55:39,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:55:40,421 INFO [train.py:1046] (3/4) Epoch 31, batch 2750, loss[loss=0.1686, simple_loss=0.2555, pruned_loss=0.04083, over 23588.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2421, pruned_loss=0.04262, over 4711643.93 frames. ], batch size: 93, lr: 3.29e-03, grad_scale: 8.0 2023-10-03 00:55:41,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:41,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 00:55:41,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:44,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:55:46,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 00:55:46,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:55:46,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:46,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 00:55:46,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:55:46,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:55:53,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 00:55:55,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:55:55,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:55:56,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:55:56,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 00:55:57,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:55:59,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:55:59,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:56:01,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:56:01,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1080826.6666666667, ans=0.125 2023-10-03 00:56:05,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 00:56:05,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 00:56:06,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 00:56:08,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:56:08,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:56:15,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:56:18,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 00:56:18,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:56:23,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:56:23,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:56:24,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 00:56:30,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 00:56:30,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 00:56:30,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 00:56:36,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:56:38,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 00:56:41,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 00:56:44,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:56:44,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 00:56:44,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:56:47,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 00:56:48,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 00:56:48,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:56:51,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 00:56:51,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:56:51,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:56:53,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 00:56:53,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:56:54,345 INFO [train.py:1046] (3/4) Epoch 31, batch 2800, loss[loss=0.1679, simple_loss=0.2376, pruned_loss=0.0491, over 23664.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2412, pruned_loss=0.04233, over 4716336.09 frames. ], batch size: 232, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:56:54,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:56:57,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:56:57,071 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 00:56:57,072 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 00:57:00,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:57:01,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:57:03,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:57:06,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:57:09,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 00:57:10,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 00:57:10,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 00:57:10,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1081160.0, ans=0.125 2023-10-03 00:57:12,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:57:12,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:57:12,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:57:16,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:57:16,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:57:16,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:57:18,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:57:26,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 00:57:27,177 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 00:57:27,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1081226.6666666667, ans=0.0 2023-10-03 00:57:28,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:57:31,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:57:31,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:57:31,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:57:36,656 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.54 vs. limit=15.0 2023-10-03 00:57:37,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:57:37,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 00:57:37,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:57:39,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:57:39,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 00:57:45,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:57:45,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:57:49,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:57:50,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 00:57:50,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:57:50,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 00:57:50,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 00:57:52,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 00:57:53,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:57:53,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 00:57:54,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:57:55,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:57:56,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:57:58,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 00:57:59,536 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 1.999e+02 2.250e+02 2.864e+02 4.547e+02, threshold=4.500e+02, percent-clipped=1.0 2023-10-03 00:57:59,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:57:59,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 00:57:59,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 00:58:01,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 00:58:07,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 00:58:07,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 00:58:08,901 INFO [train.py:1046] (3/4) Epoch 31, batch 2850, loss[loss=0.1591, simple_loss=0.2441, pruned_loss=0.03701, over 24320.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2404, pruned_loss=0.04221, over 4704482.02 frames. ], batch size: 61, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:58:08,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:58:10,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:58:14,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:58:14,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:58:14,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 00:58:17,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:58:19,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:58:20,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:58:20,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 00:58:24,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 00:58:24,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:58:28,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 00:58:29,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:58:32,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 00:58:32,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1081493.3333333333, ans=0.2 2023-10-03 00:58:34,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 00:58:34,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:58:46,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:58:47,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:58:47,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 00:58:49,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 00:58:49,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 00:58:49,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 00:58:51,135 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.30 vs. limit=15.0 2023-10-03 00:58:52,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 00:58:52,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 00:58:54,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 00:58:54,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:58:54,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:58:56,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:58:59,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:58:59,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:59:00,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:02,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 00:59:05,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:59:05,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:59:06,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:09,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 00:59:14,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 00:59:14,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 00:59:14,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 00:59:15,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 00:59:17,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:59:17,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 00:59:19,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 00:59:20,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:59:20,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:59:20,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 00:59:20,611 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 00:59:20,661 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 00:59:20,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:59:21,098 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.81 vs. limit=6.0 2023-10-03 00:59:21,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:23,275 INFO [train.py:1046] (3/4) Epoch 31, batch 2900, loss[loss=0.1555, simple_loss=0.2417, pruned_loss=0.03463, over 24484.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2414, pruned_loss=0.04241, over 4705203.22 frames. ], batch size: 66, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 00:59:26,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:59:26,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 00:59:26,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 00:59:26,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1081760.0, ans=0.1 2023-10-03 00:59:27,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 00:59:32,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:59:32,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 00:59:33,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 00:59:34,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 00:59:34,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 00:59:36,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 00:59:37,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 00:59:41,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 00:59:42,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 00:59:45,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 00:59:45,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 00:59:45,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 00:59:48,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 00:59:51,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 00:59:52,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 00:59:54,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 00:59:54,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 00:59:54,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 00:59:57,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 00:59:57,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 00:59:57,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1081893.3333333333, ans=0.07 2023-10-03 01:00:00,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:00:02,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:00:04,432 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.50 vs. limit=15.0 2023-10-03 01:00:04,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:00:07,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:00:07,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 01:00:09,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 01:00:09,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:00:10,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1081960.0, ans=0.1 2023-10-03 01:00:12,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:00:15,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 01:00:18,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:00:22,254 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.34 vs. limit=15.0 2023-10-03 01:00:22,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:00:28,270 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.814e+02 1.971e+02 2.189e+02 2.896e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-03 01:00:29,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:00:29,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:00:33,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 01:00:35,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:00:35,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 01:00:35,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:00:36,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1082093.3333333333, ans=0.2 2023-10-03 01:00:37,305 INFO [train.py:1046] (3/4) Epoch 31, batch 2950, loss[loss=0.1761, simple_loss=0.26, pruned_loss=0.04604, over 24363.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2418, pruned_loss=0.04207, over 4718656.17 frames. ], batch size: 77, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:00:37,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:00:41,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:00:42,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1082093.3333333333, ans=0.125 2023-10-03 01:00:43,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 01:00:44,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:00:44,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:00:46,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:00:48,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:00:48,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 01:00:49,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 01:00:49,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:00:49,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:00:53,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1082160.0, ans=0.5 2023-10-03 01:00:55,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:00:57,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:00:59,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:00:59,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:01:03,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:01:03,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:01:04,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:01:05,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:01:05,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:01:08,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 01:01:10,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1082226.6666666667, ans=0.1 2023-10-03 01:01:13,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 01:01:13,394 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 01:01:14,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:01:16,266 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 01:01:17,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 01:01:17,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:01:19,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:01:19,488 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 01:01:19,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:01:22,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 01:01:23,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:01:24,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:01:28,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:01:28,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:01:29,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:01:29,940 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 01:01:29,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:01:29,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 01:01:37,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:01:37,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:01:38,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 01:01:38,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:01:40,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 01:01:41,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:01:43,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:01:45,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:01:46,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:01:47,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 01:01:48,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:01:48,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:01:48,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:01:50,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:01:50,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:01:51,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1082426.6666666667, ans=0.2 2023-10-03 01:01:52,123 INFO [train.py:1046] (3/4) Epoch 31, batch 3000, loss[loss=0.1562, simple_loss=0.2376, pruned_loss=0.03742, over 24463.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2424, pruned_loss=0.04222, over 4718301.83 frames. ], batch size: 66, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:01:52,123 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 01:02:05,000 INFO [train.py:1078] (3/4) Epoch 31, validation: loss=0.3333, simple_loss=0.2731, pruned_loss=0.1967, over 1125622.00 frames. 2023-10-03 01:02:05,000 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 01:02:05,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:02:06,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:02:06,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 01:02:06,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:02:08,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.54 vs. limit=6.0 2023-10-03 01:02:09,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:02:09,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:02:13,971 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 01:02:14,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 01:02:16,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:02:16,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:02:16,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 01:02:18,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:02:18,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1082493.3333333333, ans=0.125 2023-10-03 01:02:25,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:02:33,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:02:39,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 01:02:39,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:02:42,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:02:42,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:02:42,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:02:43,618 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:02:44,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:02:44,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 01:02:46,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 01:02:48,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:02:49,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1082626.6666666667, ans=0.0 2023-10-03 01:02:50,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:02:50,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:02:50,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:02:51,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:02:51,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:02:55,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:02:56,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:02:56,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:02:57,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:02:59,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 01:03:00,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:03:02,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:02,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:03:03,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1082693.3333333333, ans=0.1 2023-10-03 01:03:03,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1082693.3333333333, ans=0.1 2023-10-03 01:03:07,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:03:08,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:03:08,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.77 vs. limit=15.0 2023-10-03 01:03:09,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 01:03:09,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 01:03:09,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:03:10,128 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.03 vs. limit=15.0 2023-10-03 01:03:10,615 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.904e+02 2.048e+02 2.313e+02 4.264e+02, threshold=4.096e+02, percent-clipped=1.0 2023-10-03 01:03:10,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 01:03:10,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:03:12,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 01:03:12,920 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.51 vs. limit=15.0 2023-10-03 01:03:13,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:03:16,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:03:16,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 01:03:18,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 01:03:18,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 01:03:19,491 INFO [train.py:1046] (3/4) Epoch 31, batch 3050, loss[loss=0.1612, simple_loss=0.2307, pruned_loss=0.04587, over 23388.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2435, pruned_loss=0.04333, over 4713310.54 frames. ], batch size: 285, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:03:19,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:03:19,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:03:19,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:03:19,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:20,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:03:25,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 01:03:27,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:03:29,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:03:30,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:03:31,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:34,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 01:03:42,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 01:03:42,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 01:03:42,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:03:46,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:03:50,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:50,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:03:50,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1082893.3333333333, ans=0.0 2023-10-03 01:03:51,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:03:53,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:03:54,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:03:55,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:03:55,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:03:55,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:03:56,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:03:59,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:02,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:04:02,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 01:04:02,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:04:02,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:04:06,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:04:07,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:04:07,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:04:07,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:11,079 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:04:13,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:04:13,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:17,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1083026.6666666667, ans=0.0 2023-10-03 01:04:18,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:18,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:04:18,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:04:21,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:04:21,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:04:21,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:04:23,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 01:04:24,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:04:25,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:26,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 01:04:27,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:30,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1083026.6666666667, ans=0.1 2023-10-03 01:04:32,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:04:33,363 INFO [train.py:1046] (3/4) Epoch 31, batch 3100, loss[loss=0.1764, simple_loss=0.2595, pruned_loss=0.04661, over 24330.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2423, pruned_loss=0.04285, over 4712559.25 frames. ], batch size: 77, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:04:33,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:04:36,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:04:37,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 01:04:40,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 01:04:40,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 01:04:40,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:04:45,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:04:45,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:45,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1083093.3333333333, ans=0.0 2023-10-03 01:04:48,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 01:04:52,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:04:52,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1083160.0, ans=0.0 2023-10-03 01:04:57,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 01:05:01,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:05:01,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:01,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:05:01,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:05:03,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 01:05:05,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:05:05,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 01:05:05,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:05:07,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:05:08,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 01:05:10,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:05:14,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:05:14,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 01:05:16,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 01:05:16,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:18,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:05:19,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:05:20,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:20,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:05:21,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:05:21,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:05:22,631 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:05:24,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:05:25,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:05:25,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:25,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:05:30,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:05:31,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 01:05:34,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:05:34,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 01:05:34,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1083360.0, ans=0.125 2023-10-03 01:05:35,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:05:35,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:35,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 01:05:38,198 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.903e+02 2.126e+02 2.520e+02 4.741e+02, threshold=4.252e+02, percent-clipped=3.0 2023-10-03 01:05:38,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1083360.0, ans=0.125 2023-10-03 01:05:38,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1083360.0, ans=0.125 2023-10-03 01:05:45,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 01:05:46,970 INFO [train.py:1046] (3/4) Epoch 31, batch 3150, loss[loss=0.1643, simple_loss=0.2241, pruned_loss=0.05224, over 23418.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2411, pruned_loss=0.04247, over 4711957.58 frames. ], batch size: 285, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:05:48,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:05:49,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:05:51,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:05:51,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:05:52,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 01:05:52,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1083426.6666666667, ans=0.125 2023-10-03 01:05:53,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:05:53,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:05:54,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 01:05:55,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:05:59,230 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 01:06:00,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 01:06:00,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:06:01,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1083493.3333333333, ans=0.0 2023-10-03 01:06:02,025 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 01:06:02,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 01:06:02,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 01:06:03,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 01:06:03,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 01:06:03,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:06:03,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:06:04,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:06:04,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1083493.3333333333, ans=0.1 2023-10-03 01:06:06,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 01:06:07,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:06:07,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:06:08,470 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.74 vs. limit=15.0 2023-10-03 01:06:09,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:06:10,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:06:15,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 01:06:15,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:06:19,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:06:20,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:06:22,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 01:06:24,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 01:06:25,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:06:25,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:06:25,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:06:25,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:06:27,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:06:28,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:06:28,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:06:28,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 01:06:29,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1083560.0, ans=0.125 2023-10-03 01:06:30,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:06:30,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:33,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:06:33,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:06:33,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 01:06:34,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:06:35,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 01:06:35,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:38,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 01:06:38,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 01:06:40,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:06:40,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:06:41,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 01:06:42,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 01:06:44,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:06:47,055 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.97 vs. limit=15.0 2023-10-03 01:06:47,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:06:48,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:50,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:06:55,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:06:55,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:06:56,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 01:07:00,745 INFO [train.py:1046] (3/4) Epoch 31, batch 3200, loss[loss=0.1683, simple_loss=0.2434, pruned_loss=0.04656, over 23752.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2407, pruned_loss=0.04209, over 4729020.84 frames. ], batch size: 164, lr: 3.29e-03, grad_scale: 32.0 2023-10-03 01:07:02,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:07:02,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 01:07:05,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:07:06,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:07:06,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 01:07:09,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:07:09,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1083760.0, ans=0.5 2023-10-03 01:07:11,456 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.21 vs. limit=15.0 2023-10-03 01:07:14,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:07:15,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1083826.6666666667, ans=0.0 2023-10-03 01:07:18,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:07:24,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1083826.6666666667, ans=0.125 2023-10-03 01:07:26,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:07:32,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1083893.3333333333, ans=0.0 2023-10-03 01:07:36,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 01:07:38,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:07:39,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 01:07:40,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:07:44,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:07:44,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:07:45,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:07:48,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 01:07:50,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 01:07:52,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 01:07:55,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 01:07:56,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1083960.0, ans=0.0 2023-10-03 01:07:57,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:07:57,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1083960.0, ans=0.125 2023-10-03 01:08:02,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:08:04,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:08:04,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:08:04,191 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 01:08:04,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:08:06,914 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.876e+02 2.218e+02 2.823e+02 4.927e+02, threshold=4.435e+02, percent-clipped=2.0 2023-10-03 01:08:07,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1084026.6666666667, ans=0.1 2023-10-03 01:08:08,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:08:09,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 01:08:09,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 01:08:11,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 01:08:11,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 01:08:13,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:08:14,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1084093.3333333333, ans=0.1 2023-10-03 01:08:15,755 INFO [train.py:1046] (3/4) Epoch 31, batch 3250, loss[loss=0.1717, simple_loss=0.2585, pruned_loss=0.04248, over 24504.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2408, pruned_loss=0.04162, over 4735080.80 frames. ], batch size: 66, lr: 3.29e-03, grad_scale: 32.0 2023-10-03 01:08:17,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 01:08:17,682 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 01:08:17,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:08:17,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:17,800 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 01:08:22,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:08:26,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:08:32,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:08:32,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 01:08:33,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:08:35,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:08:35,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:08:36,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:08:36,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:08:38,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:39,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:08:39,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:08:39,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:39,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:39,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:08:40,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.61 vs. limit=22.5 2023-10-03 01:08:44,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:08:44,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:08:44,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1084226.6666666667, ans=0.125 2023-10-03 01:08:47,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:08:47,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:08:49,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:08:50,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:08:50,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:08:54,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 01:08:54,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:08:54,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:08:57,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:08:58,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:08:59,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1084293.3333333333, ans=0.0 2023-10-03 01:09:03,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:09:12,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:09:12,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:12,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 01:09:12,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:09:12,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 01:09:12,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:15,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 01:09:15,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 01:09:17,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:09:17,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:09:19,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:09:19,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 01:09:19,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:09:20,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1084360.0, ans=0.125 2023-10-03 01:09:24,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:09:24,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:09:25,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 01:09:25,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:09:28,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:09:28,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 01:09:30,111 INFO [train.py:1046] (3/4) Epoch 31, batch 3300, loss[loss=0.1435, simple_loss=0.2165, pruned_loss=0.03523, over 24421.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2413, pruned_loss=0.04176, over 4737146.80 frames. ], batch size: 58, lr: 3.29e-03, grad_scale: 32.0 2023-10-03 01:09:31,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:09:31,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 01:09:34,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 01:09:36,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 01:09:36,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:09:40,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:09:42,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:09:42,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:43,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 01:09:43,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:09:45,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:09:46,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:09:50,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 01:09:50,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1084493.3333333333, ans=0.125 2023-10-03 01:09:51,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:09:53,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:09:54,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:09:54,445 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 01:09:54,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1084493.3333333333, ans=0.2 2023-10-03 01:09:55,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:09:57,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:09:57,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:09:57,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:09:57,251 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 01:10:00,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:10:00,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:10:02,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:02,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 01:10:02,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 01:10:04,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:06,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:10:07,515 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 01:10:09,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 01:10:10,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:10:13,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 01:10:14,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:10:16,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:10:16,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:10:16,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1084626.6666666667, ans=0.0 2023-10-03 01:10:20,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:10:21,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:10:21,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:10:21,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:10:23,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:10:23,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:24,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:10:26,155 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 01:10:27,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 01:10:29,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1084693.3333333333, ans=0.09899494936611666 2023-10-03 01:10:30,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:10:30,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:10:30,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:10:31,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:10:31,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:10:34,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:10:34,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:10:34,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 01:10:35,806 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.832e+02 2.063e+02 2.275e+02 2.937e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-03 01:10:35,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:10:37,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:10:41,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 01:10:41,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:10:42,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:10:43,756 INFO [train.py:1046] (3/4) Epoch 31, batch 3350, loss[loss=0.167, simple_loss=0.2387, pruned_loss=0.04767, over 23595.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2421, pruned_loss=0.04193, over 4736352.50 frames. ], batch size: 256, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:10:45,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:10:45,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:10:46,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:10:47,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:10:47,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:10:51,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:10:53,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:10:53,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:10:56,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:10:57,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:10:59,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1084826.6666666667, ans=0.1 2023-10-03 01:11:00,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:11:00,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:11:01,098 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.39 vs. limit=15.0 2023-10-03 01:11:01,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 01:11:01,736 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 01:11:01,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:11:06,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 01:11:06,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 01:11:07,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:11:07,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:11:09,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:10,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 01:11:10,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:10,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:11:11,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1084826.6666666667, ans=0.5 2023-10-03 01:11:13,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:15,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:15,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1084893.3333333333, ans=0.0 2023-10-03 01:11:16,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:16,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:11:17,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1084893.3333333333, ans=0.0 2023-10-03 01:11:17,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1084893.3333333333, ans=0.05 2023-10-03 01:11:19,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1084893.3333333333, ans=0.0 2023-10-03 01:11:20,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:21,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:22,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:22,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1084893.3333333333, ans=0.125 2023-10-03 01:11:25,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:11:25,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:11:27,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:27,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:30,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:32,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 01:11:32,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 01:11:34,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 01:11:34,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:11:35,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 01:11:36,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:38,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:11:38,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1084960.0, ans=0.2 2023-10-03 01:11:45,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:46,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 01:11:46,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:11:47,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:11:49,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:11:49,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1085026.6666666667, ans=0.2 2023-10-03 01:11:52,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:11:54,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.40 vs. limit=15.0 2023-10-03 01:11:55,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 01:11:56,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:11:56,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:11:56,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1085093.3333333333, ans=0.125 2023-10-03 01:11:58,008 INFO [train.py:1046] (3/4) Epoch 31, batch 3400, loss[loss=0.2443, simple_loss=0.3023, pruned_loss=0.09309, over 19433.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2433, pruned_loss=0.04246, over 4731683.22 frames. ], batch size: 390, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:11:58,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:11:58,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 01:11:59,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:11:59,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1085093.3333333333, ans=0.035 2023-10-03 01:12:00,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 01:12:01,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:12:02,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:12:03,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:12:05,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:12:05,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 01:12:09,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 01:12:09,291 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 01:12:09,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:12:11,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1085160.0, ans=0.0 2023-10-03 01:12:13,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:12:13,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:12:14,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:12:15,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:12:21,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:12:22,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 01:12:28,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:12:29,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:12:30,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:12:32,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 01:12:37,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:12:40,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 01:12:40,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1085293.3333333333, ans=0.0 2023-10-03 01:12:45,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:12:45,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1085293.3333333333, ans=0.125 2023-10-03 01:12:46,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:12:46,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 01:12:48,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:12:48,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:12:48,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:12:49,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:12:51,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:12:54,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:12:54,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:13:01,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:13:03,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 01:13:04,324 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.916e+02 2.114e+02 2.433e+02 5.346e+02, threshold=4.228e+02, percent-clipped=1.0 2023-10-03 01:13:07,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:13:10,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1085426.6666666667, ans=0.125 2023-10-03 01:13:11,338 INFO [train.py:1046] (3/4) Epoch 31, batch 3450, loss[loss=0.1518, simple_loss=0.2381, pruned_loss=0.03281, over 24670.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2429, pruned_loss=0.04222, over 4738798.60 frames. ], batch size: 65, lr: 3.29e-03, grad_scale: 16.0 2023-10-03 01:13:11,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 01:13:14,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 01:13:16,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:13:18,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:13:18,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 01:13:18,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1085426.6666666667, ans=0.125 2023-10-03 01:13:19,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:13:19,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1085426.6666666667, ans=0.0 2023-10-03 01:13:21,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1085426.6666666667, ans=0.125 2023-10-03 01:13:22,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:13:26,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:13:27,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:13:27,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:13:27,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:13:30,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:13:38,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 01:13:42,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 01:13:42,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:13:42,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:13:43,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1085560.0, ans=0.1 2023-10-03 01:13:44,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:13:48,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 01:13:49,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:13:55,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:13:55,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:13:56,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:13:56,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:13:58,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 01:13:58,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:13:59,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:14:01,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:14:04,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 01:14:08,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:14:09,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1085626.6666666667, ans=0.5 2023-10-03 01:14:11,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:14:12,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:13,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1085693.3333333333, ans=0.0 2023-10-03 01:14:16,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:14:20,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:20,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1085693.3333333333, ans=0.0 2023-10-03 01:14:21,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:14:21,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:14:21,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:14:25,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:14:26,726 INFO [train.py:1046] (3/4) Epoch 31, batch 3500, loss[loss=0.1607, simple_loss=0.2417, pruned_loss=0.0398, over 24470.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2409, pruned_loss=0.04232, over 4703925.83 frames. ], batch size: 66, lr: 3.29e-03, grad_scale: 8.0 2023-10-03 01:14:28,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:14:29,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 01:14:31,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:14:31,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1085760.0, ans=0.125 2023-10-03 01:14:34,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1085760.0, ans=0.125 2023-10-03 01:14:35,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:14:35,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:14:35,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 01:14:36,050 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.77 vs. limit=15.0 2023-10-03 01:14:37,794 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.11 vs. limit=15.0 2023-10-03 01:14:42,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:14:43,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:14:43,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:14:43,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:14:45,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:14:45,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:45,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:14:47,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 01:14:48,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:48,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 01:14:50,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:14:53,708 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.19 vs. limit=10.0 2023-10-03 01:14:55,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:14:56,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 01:14:56,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:14:59,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:15:00,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:15:02,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:04,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:15:04,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:15:06,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 01:15:06,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1085893.3333333333, ans=0.0 2023-10-03 01:15:07,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 01:15:07,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 01:15:07,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:15:08,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:10,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:15:10,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:15:13,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 01:15:14,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:15:18,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:15:20,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 01:15:20,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 01:15:20,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:15:24,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:15:26,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:15:27,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:30,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 01:15:30,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:15:31,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:15:33,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 01:15:34,360 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.859e+02 2.081e+02 2.339e+02 4.872e+02, threshold=4.163e+02, percent-clipped=1.0 2023-10-03 01:15:34,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 01:15:35,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:15:37,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:15:37,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:15:38,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:15:39,950 INFO [train.py:1046] (3/4) Epoch 31, batch 3550, loss[loss=0.1542, simple_loss=0.2328, pruned_loss=0.03781, over 23349.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2401, pruned_loss=0.04217, over 4707795.33 frames. ], batch size: 93, lr: 3.29e-03, grad_scale: 8.0 2023-10-03 01:15:42,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:15:45,377 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.13 vs. limit=10.0 2023-10-03 01:15:50,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:15:52,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 01:15:55,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:15:57,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:15:59,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:00,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:16:00,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:16:03,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:16:05,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:16:05,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:16:05,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 01:16:05,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:16:08,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=1086226.6666666667, ans=0.02 2023-10-03 01:16:10,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:16:10,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:16:12,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:16:12,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:16:13,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:16:13,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 01:16:13,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:15,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:17,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 01:16:17,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1086226.6666666667, ans=0.2 2023-10-03 01:16:18,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1086226.6666666667, ans=0.125 2023-10-03 01:16:21,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:16:21,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:16:23,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:16:24,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 01:16:26,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:16:28,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 01:16:28,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:16:29,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:16:29,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:16:32,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 01:16:33,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:16:39,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:16:39,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 01:16:40,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:16:45,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:16:45,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 01:16:52,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 01:16:52,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:16:52,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:16:54,686 INFO [train.py:1046] (3/4) Epoch 31, batch 3600, loss[loss=0.1487, simple_loss=0.2275, pruned_loss=0.03497, over 23275.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2402, pruned_loss=0.04211, over 4701686.96 frames. ], batch size: 51, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:16:56,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:16:56,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:16:56,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:17:00,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:17:01,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:01,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:17:03,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:17:03,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:03,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 01:17:06,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:17:07,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:10,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:17:13,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:17:13,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:17:15,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:17:15,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 01:17:16,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:17:18,391 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=15.0 2023-10-03 01:17:19,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:17:20,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:17:22,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:17:22,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1086560.0, ans=0.125 2023-10-03 01:17:24,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:17:25,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:17:27,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 01:17:32,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:17:35,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:17:36,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 01:17:41,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:17:42,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1086626.6666666667, ans=0.125 2023-10-03 01:17:46,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1086626.6666666667, ans=0.1 2023-10-03 01:17:46,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:17:49,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:17:52,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1086693.3333333333, ans=0.2 2023-10-03 01:17:55,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:17:55,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:17:57,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 01:17:59,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 01:18:00,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 01:18:02,191 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.943e+02 2.241e+02 2.571e+02 3.664e+02, threshold=4.481e+02, percent-clipped=0.0 2023-10-03 01:18:03,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:18:03,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:18:05,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 01:18:05,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:18:05,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:18:05,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:18:05,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 01:18:05,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1086693.3333333333, ans=0.0 2023-10-03 01:18:06,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 01:18:07,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1086760.0, ans=0.0 2023-10-03 01:18:08,145 INFO [train.py:1046] (3/4) Epoch 31, batch 3650, loss[loss=0.1777, simple_loss=0.245, pruned_loss=0.0552, over 22692.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2413, pruned_loss=0.0421, over 4720522.09 frames. ], batch size: 322, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:18:08,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1086760.0, ans=0.125 2023-10-03 01:18:09,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:18:10,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 01:18:11,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1086760.0, ans=0.0 2023-10-03 01:18:15,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 01:18:16,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:18:21,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 01:18:22,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 01:18:24,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1086826.6666666667, ans=0.1 2023-10-03 01:18:26,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:18:26,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:18:26,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:18:32,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 01:18:32,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:18:32,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 01:18:33,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:18:33,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:18:34,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 01:18:35,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 01:18:36,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:18:36,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:18:37,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:18:40,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 01:18:41,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 01:18:43,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:18:44,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 01:18:45,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:18:46,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:18:51,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:18:51,535 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-10-03 01:18:52,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:18:52,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:18:53,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:18:54,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:18:55,546 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.11 vs. limit=15.0 2023-10-03 01:18:56,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1086960.0, ans=0.125 2023-10-03 01:18:56,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1086960.0, ans=0.025 2023-10-03 01:18:57,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:19:01,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:19:03,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:03,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:19:05,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:19:06,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:19:07,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:08,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1087026.6666666667, ans=0.125 2023-10-03 01:19:11,587 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 01:19:13,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:19:14,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:14,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:19:15,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:19:17,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:19:18,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:20,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 01:19:20,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:19:21,968 INFO [train.py:1046] (3/4) Epoch 31, batch 3700, loss[loss=0.1627, simple_loss=0.253, pruned_loss=0.03618, over 24661.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2425, pruned_loss=0.04235, over 4719160.87 frames. ], batch size: 73, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:19:23,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:19:26,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:19:26,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:19:28,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:28,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 01:19:28,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:19:32,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:19:32,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:19:36,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:19:37,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1087160.0, ans=0.1 2023-10-03 01:19:38,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:19:39,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:19:39,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:19:39,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:19:41,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 01:19:43,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:19:45,201 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 01:19:46,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1087160.0, ans=0.0 2023-10-03 01:19:49,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:19:50,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:19:52,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:19:52,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 01:19:52,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:19:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:55,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 01:19:57,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:19:58,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:20:02,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:20:03,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:20:06,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:20:09,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:20:09,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 01:20:09,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:20:10,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 01:20:15,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:20:15,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:20:17,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:20:19,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 01:20:20,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:20:20,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:20:20,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:20:21,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:20:25,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:20:26,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 01:20:26,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 01:20:28,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:20:28,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:20:29,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:20:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:20:31,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1087360.0, ans=0.0 2023-10-03 01:20:32,845 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.869e+02 2.113e+02 2.369e+02 2.929e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-03 01:20:34,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:20:34,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:20:35,943 INFO [train.py:1046] (3/4) Epoch 31, batch 3750, loss[loss=0.1452, simple_loss=0.2254, pruned_loss=0.03255, over 24334.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2428, pruned_loss=0.04248, over 4728197.64 frames. ], batch size: 61, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:20:36,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:20:38,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 01:20:40,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 01:20:41,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 01:20:42,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 01:20:43,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:20:44,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:20:45,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:20:47,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:20:50,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:20:50,864 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.48 vs. limit=15.0 2023-10-03 01:20:52,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:20:54,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:20:56,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:20:58,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:20:58,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1087493.3333333333, ans=0.125 2023-10-03 01:21:00,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 01:21:01,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:21:02,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1087493.3333333333, ans=0.0 2023-10-03 01:21:03,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:21:03,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:21:07,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 01:21:10,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 01:21:11,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:21:11,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1087560.0, ans=0.125 2023-10-03 01:21:12,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:21:13,555 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.37 vs. limit=12.0 2023-10-03 01:21:15,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:21:17,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1087560.0, ans=0.125 2023-10-03 01:21:17,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1087560.0, ans=0.125 2023-10-03 01:21:20,466 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.83 vs. limit=22.5 2023-10-03 01:21:21,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:21:21,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 01:21:24,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 01:21:27,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:21:30,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:21:32,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:21:34,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:21:39,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 01:21:40,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:21:43,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:21:43,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:21:44,528 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.32 vs. limit=15.0 2023-10-03 01:21:45,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1087693.3333333333, ans=0.125 2023-10-03 01:21:47,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:21:49,176 INFO [train.py:1046] (3/4) Epoch 31, batch 3800, loss[loss=0.1807, simple_loss=0.2425, pruned_loss=0.05947, over 19544.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2422, pruned_loss=0.04262, over 4720032.25 frames. ], batch size: 388, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:21:53,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:21:58,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:21:58,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1087760.0, ans=0.125 2023-10-03 01:21:59,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 01:21:59,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 01:21:59,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1087760.0, ans=0.125 2023-10-03 01:22:02,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:22:04,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:22:04,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1087826.6666666667, ans=0.125 2023-10-03 01:22:05,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 01:22:07,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 01:22:07,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:07,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:22:09,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:22:09,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:22:11,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:22:11,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 01:22:15,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 01:22:16,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:22:18,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:22:19,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:22:21,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:22:21,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1087893.3333333333, ans=0.0 2023-10-03 01:22:22,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:22:22,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:22:25,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:25,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:22:30,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 01:22:30,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 01:22:31,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:22:40,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:22:44,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:22:47,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 01:22:50,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 01:22:50,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:22:50,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1088026.6666666667, ans=0.05 2023-10-03 01:22:52,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:22:52,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:54,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 01:22:57,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 01:22:57,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 01:22:58,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:22:59,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:23:00,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1088026.6666666667, ans=0.2 2023-10-03 01:23:01,710 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.890e+02 2.060e+02 2.285e+02 4.176e+02, threshold=4.120e+02, percent-clipped=0.0 2023-10-03 01:23:03,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:23:03,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:23:04,717 INFO [train.py:1046] (3/4) Epoch 31, batch 3850, loss[loss=0.1406, simple_loss=0.1937, pruned_loss=0.04378, over 19289.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.241, pruned_loss=0.04268, over 4714594.65 frames. ], batch size: 388, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:23:08,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1088093.3333333333, ans=0.0 2023-10-03 01:23:09,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:23:09,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 01:23:13,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:23:13,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:23:13,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1088093.3333333333, ans=0.2 2023-10-03 01:23:13,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1088093.3333333333, ans=0.2 2023-10-03 01:23:17,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:23:18,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:23:20,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:23:22,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 01:23:23,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1088160.0, ans=0.1 2023-10-03 01:23:27,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:27,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:23:29,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:23:30,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:23:33,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:34,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:23:35,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:23:35,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:23:36,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:23:36,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1088226.6666666667, ans=0.0 2023-10-03 01:23:37,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:23:39,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:40,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:23:40,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 01:23:41,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 01:23:43,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:23:43,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:43,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1088226.6666666667, ans=0.5 2023-10-03 01:23:45,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:23:47,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:23:47,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 01:23:50,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 01:23:51,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:23:53,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 01:23:54,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 01:23:59,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:24:01,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:24:03,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:24:03,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 01:24:06,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 01:24:09,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:09,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:12,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:24:12,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:24:14,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:14,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:14,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:24:14,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 01:24:15,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:24:16,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 01:24:16,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:17,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:17,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1088360.0, ans=0.125 2023-10-03 01:24:18,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:24:19,602 INFO [train.py:1046] (3/4) Epoch 31, batch 3900, loss[loss=0.1608, simple_loss=0.2488, pruned_loss=0.03642, over 24618.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.24, pruned_loss=0.04228, over 4710465.04 frames. ], batch size: 68, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:24:19,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:19,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1088426.6666666667, ans=0.125 2023-10-03 01:24:22,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:24:22,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:24:22,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:24:24,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:24:24,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 01:24:24,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:28,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:24:28,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:24:28,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:24:29,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:24:31,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:24:33,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:33,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:24:34,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 01:24:34,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:24:36,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1088493.3333333333, ans=0.0 2023-10-03 01:24:37,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 01:24:37,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:24:38,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 01:24:39,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 01:24:40,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1088493.3333333333, ans=0.1 2023-10-03 01:24:46,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:24:48,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:24:48,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:24:49,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:24:53,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:24:55,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:24:57,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:24:58,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:24:58,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:25:03,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:25:03,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:25:08,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1088626.6666666667, ans=0.015 2023-10-03 01:25:09,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:25:11,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:25:14,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1088626.6666666667, ans=0.0 2023-10-03 01:25:22,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:25:23,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:25:23,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 01:25:24,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 01:25:24,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:25:26,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 01:25:28,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:25:28,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 01:25:30,790 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.803e+02 2.022e+02 2.316e+02 4.391e+02, threshold=4.043e+02, percent-clipped=1.0 2023-10-03 01:25:32,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1088760.0, ans=0.125 2023-10-03 01:25:33,536 INFO [train.py:1046] (3/4) Epoch 31, batch 3950, loss[loss=0.1694, simple_loss=0.2438, pruned_loss=0.04752, over 23850.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2402, pruned_loss=0.04209, over 4718727.13 frames. ], batch size: 195, lr: 3.28e-03, grad_scale: 8.0 2023-10-03 01:25:35,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:25:36,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 01:25:38,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:25:39,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:25:42,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:25:49,184 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 01:25:50,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:25:50,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 01:25:51,757 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 01:25:51,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:25:52,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1088826.6666666667, ans=0.0 2023-10-03 01:25:53,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1088826.6666666667, ans=15.0 2023-10-03 01:25:54,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:25:54,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:25:54,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:25:56,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 01:25:59,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:25:59,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:26:00,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:26:00,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:26:02,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:26:08,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1088893.3333333333, ans=0.025 2023-10-03 01:26:12,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:26:12,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:26:17,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 01:26:23,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 01:26:23,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 01:26:24,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:26:26,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:26:28,704 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:26:29,192 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.85 vs. limit=22.5 2023-10-03 01:26:31,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:26:31,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:26:32,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:26:32,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:26:32,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 01:26:35,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:26:38,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:26:41,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 01:26:49,254 INFO [train.py:1046] (3/4) Epoch 31, batch 4000, loss[loss=0.2122, simple_loss=0.2736, pruned_loss=0.07535, over 19627.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2406, pruned_loss=0.04221, over 4708603.05 frames. ], batch size: 388, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:26:50,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:26:55,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1089093.3333333333, ans=0.0 2023-10-03 01:26:56,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:26:56,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1089093.3333333333, ans=0.125 2023-10-03 01:27:00,487 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.20 vs. limit=15.0 2023-10-03 01:27:01,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:02,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:27:03,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:27:03,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 01:27:04,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:27:05,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 01:27:05,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:27:05,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 01:27:06,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:09,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:27:09,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:27:09,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:27:09,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:27:09,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:27:11,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:27:13,148 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 01:27:14,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:27:14,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:27:15,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1089160.0, ans=0.125 2023-10-03 01:27:18,883 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 01:27:20,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:27:20,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:27:27,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 01:27:27,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:27:28,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:27:29,910 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 01:27:31,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:27:31,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 01:27:31,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:27:33,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:27:33,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:27:34,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:27:36,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:27:36,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:27:37,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 01:27:37,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:27:40,041 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 01:27:40,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1089293.3333333333, ans=0.0 2023-10-03 01:27:40,360 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:27:45,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:27:46,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1089360.0, ans=0.125 2023-10-03 01:27:47,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 01:27:49,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:27:51,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:51,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:27:52,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:27:58,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:27:59,514 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.852e+02 2.002e+02 2.226e+02 3.079e+02, threshold=4.003e+02, percent-clipped=0.0 2023-10-03 01:27:59,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:28:01,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 01:28:02,501 INFO [train.py:1046] (3/4) Epoch 31, batch 4050, loss[loss=0.1639, simple_loss=0.2522, pruned_loss=0.03778, over 24602.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2412, pruned_loss=0.04204, over 4717427.77 frames. ], batch size: 73, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:28:04,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:28:04,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:28:05,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:28:06,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:28:07,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:28:11,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:28:14,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:28:14,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 01:28:18,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:28:18,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:28:19,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1089493.3333333333, ans=0.125 2023-10-03 01:28:22,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:28:24,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:28:26,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 01:28:28,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 01:28:29,683 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 01:28:30,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1089493.3333333333, ans=0.0 2023-10-03 01:28:31,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:28:31,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1089560.0, ans=0.1 2023-10-03 01:28:38,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 01:28:40,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:28:42,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:28:45,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:28:45,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:28:45,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:28:49,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:28:50,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 01:28:50,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 01:28:52,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:28:53,178 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:28:55,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 01:28:59,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:29:06,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 01:29:06,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:29:06,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:29:09,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 01:29:09,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 01:29:09,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:29:12,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:29:13,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:14,476 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.09 vs. limit=12.0 2023-10-03 01:29:15,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:29:16,976 INFO [train.py:1046] (3/4) Epoch 31, batch 4100, loss[loss=0.1756, simple_loss=0.2498, pruned_loss=0.05065, over 23445.00 frames. ], tot_loss[loss=0.163, simple_loss=0.242, pruned_loss=0.04198, over 4718818.06 frames. ], batch size: 93, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:29:21,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 01:29:23,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 01:29:25,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 01:29:26,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 01:29:26,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:29:26,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:26,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:26,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:29:26,644 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 01:29:29,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:29:30,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1089826.6666666667, ans=0.5 2023-10-03 01:29:32,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:29:32,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:29:32,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:29:36,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:29:37,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:29:37,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:29:37,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 01:29:40,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:29:40,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:29:40,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:29:40,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:29:41,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 01:29:42,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1089826.6666666667, ans=0.125 2023-10-03 01:29:43,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:29:44,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 01:29:46,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:29:48,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1089893.3333333333, ans=0.125 2023-10-03 01:29:48,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1089893.3333333333, ans=15.0 2023-10-03 01:29:49,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:29:49,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 01:29:51,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:29:52,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:29:52,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:29:54,583 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.07 vs. limit=15.0 2023-10-03 01:29:55,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 01:29:57,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:29:58,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:30:01,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 01:30:01,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:30:01,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:30:04,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:30:08,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:30:13,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:30:13,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:30:13,674 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.65 vs. limit=15.0 2023-10-03 01:30:16,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1090026.6666666667, ans=0.05 2023-10-03 01:30:17,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:30:19,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:30:22,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:30:23,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1090026.6666666667, ans=0.0 2023-10-03 01:30:25,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:30:28,710 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.780e+02 1.987e+02 2.300e+02 3.252e+02, threshold=3.974e+02, percent-clipped=0.0 2023-10-03 01:30:28,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:30:30,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:30:31,470 INFO [train.py:1046] (3/4) Epoch 31, batch 4150, loss[loss=0.1806, simple_loss=0.2618, pruned_loss=0.04972, over 24040.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2422, pruned_loss=0.04249, over 4710826.21 frames. ], batch size: 86, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:30:31,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:30:31,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:30:31,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1090093.3333333333, ans=0.125 2023-10-03 01:30:34,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 01:30:34,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:30:35,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 01:30:37,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 01:30:37,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 01:30:38,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:30:42,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:30:42,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:30:44,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1090160.0, ans=0.0 2023-10-03 01:30:45,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:30:45,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:30:47,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:30:50,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:30:51,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:30:51,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:30:53,840 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:30:57,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:31:02,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:31:03,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 01:31:05,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 01:31:05,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:31:06,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 01:31:06,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:31:06,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:31:09,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:11,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:31:13,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 01:31:14,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1090293.3333333333, ans=0.125 2023-10-03 01:31:16,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:31:19,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:31:19,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 01:31:19,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:31:20,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 01:31:22,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:31:24,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:31:25,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:26,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 01:31:26,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:31:26,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 01:31:29,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:31:33,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 01:31:33,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:33,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:31:33,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:31:34,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 01:31:34,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:31:35,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 01:31:35,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:31:37,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:31:37,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 01:31:38,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 01:31:43,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:31:43,610 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:31:44,720 INFO [train.py:1046] (3/4) Epoch 31, batch 4200, loss[loss=0.1667, simple_loss=0.254, pruned_loss=0.0397, over 24238.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2418, pruned_loss=0.04221, over 4707229.70 frames. ], batch size: 74, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:31:44,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 01:31:45,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1090426.6666666667, ans=0.0 2023-10-03 01:31:47,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:31:49,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:31:50,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:31:50,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:31:52,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:31:54,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 01:31:57,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 01:31:58,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:31:59,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:32:01,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1090493.3333333333, ans=0.0 2023-10-03 01:32:03,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:32:07,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:32:08,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:32:08,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:32:09,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 01:32:09,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:32:10,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:32:10,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:32:11,014 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:32:11,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:32:13,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:32:14,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 01:32:14,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:32:19,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1090560.0, ans=10.0 2023-10-03 01:32:20,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:32:20,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:32:23,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:32:24,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:32:25,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1090560.0, ans=0.0 2023-10-03 01:32:27,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:32:27,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 01:32:29,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:32:29,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:32:29,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1090626.6666666667, ans=0.125 2023-10-03 01:32:33,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:32:34,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:32:41,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:32:43,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 01:32:44,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1090693.3333333333, ans=0.1 2023-10-03 01:32:45,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:32:47,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1090693.3333333333, ans=0.125 2023-10-03 01:32:49,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 01:32:51,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:32:51,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1090693.3333333333, ans=0.1 2023-10-03 01:32:53,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 01:32:56,585 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.884e+02 2.053e+02 2.259e+02 3.350e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-03 01:32:59,320 INFO [train.py:1046] (3/4) Epoch 31, batch 4250, loss[loss=0.1714, simple_loss=0.2578, pruned_loss=0.04245, over 24477.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2406, pruned_loss=0.04189, over 4711784.22 frames. ], batch size: 63, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:32:59,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:33:03,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:33:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 01:33:05,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1090760.0, ans=0.2 2023-10-03 01:33:06,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:11,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:33:11,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 01:33:11,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:33:15,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:17,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1090826.6666666667, ans=0.125 2023-10-03 01:33:18,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:33:21,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:21,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:23,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:33:23,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:33:24,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:26,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:26,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:28,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:33:30,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:33:31,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 01:33:34,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 01:33:34,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:35,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:33:35,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:33:37,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:33:37,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:37,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:33:39,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 01:33:41,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:33:46,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:33:49,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:33:49,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 01:33:49,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:33:50,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 01:33:52,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:33:53,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:33:55,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:33:55,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:33:57,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 01:33:59,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:33:59,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:34:01,654 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.71 vs. limit=15.0 2023-10-03 01:34:03,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:34:06,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:34:07,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:34:08,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:34:11,563 INFO [train.py:1046] (3/4) Epoch 31, batch 4300, loss[loss=0.175, simple_loss=0.2486, pruned_loss=0.05077, over 23826.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2408, pruned_loss=0.04187, over 4712358.44 frames. ], batch size: 164, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:34:11,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:34:11,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:34:13,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:34:13,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 01:34:15,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:34:18,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1091093.3333333333, ans=0.1 2023-10-03 01:34:21,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:34:21,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:34:24,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:34:31,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:34:31,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 01:34:33,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:34:34,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:34:34,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:34:34,691 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 01:34:37,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:34:40,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:34:42,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 01:34:42,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:34:44,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 01:34:47,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:34:47,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:34:50,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:34:50,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:34:52,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:34:53,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:34:55,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:34:55,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 01:34:56,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 01:34:58,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:35:00,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:00,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:35:00,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:00,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:35:00,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 01:35:00,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 01:35:02,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 01:35:03,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:35:03,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 01:35:03,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 01:35:07,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:35:09,006 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 01:35:10,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:35:12,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:12,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:35:15,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 01:35:15,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:35:15,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:15,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:35:16,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:35:16,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:35:19,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:35:22,330 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.858e+02 1.987e+02 2.158e+02 3.474e+02, threshold=3.975e+02, percent-clipped=0.0 2023-10-03 01:35:22,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:24,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:35:24,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:35:25,701 INFO [train.py:1046] (3/4) Epoch 31, batch 4350, loss[loss=0.1561, simple_loss=0.233, pruned_loss=0.03964, over 23509.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2416, pruned_loss=0.04245, over 4707818.85 frames. ], batch size: 134, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:35:29,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 01:35:29,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 01:35:33,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:35:34,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:36,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:35:36,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:35:40,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:35:43,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:35:46,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:35:46,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:35:49,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:35:51,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:35:54,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:35:58,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=1091560.0, ans=0.02 2023-10-03 01:36:00,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 01:36:01,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:36:01,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:07,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:08,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 01:36:08,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1091626.6666666667, ans=0.125 2023-10-03 01:36:11,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:36:13,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:36:17,599 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 01:36:18,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:36:19,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:36:19,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 01:36:19,874 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.95 vs. limit=10.0 2023-10-03 01:36:20,441 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 01:36:20,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:36:20,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:36:21,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:36:23,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:36:23,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:36:25,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:36:28,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 01:36:28,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:28,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:36:28,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:29,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 01:36:29,809 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 01:36:29,816 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 01:36:31,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 01:36:33,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:36:33,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:36:34,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:36:35,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:36:36,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 01:36:36,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1091693.3333333333, ans=0.2 2023-10-03 01:36:38,120 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 01:36:38,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:39,826 INFO [train.py:1046] (3/4) Epoch 31, batch 4400, loss[loss=0.2067, simple_loss=0.2705, pruned_loss=0.07142, over 19343.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2425, pruned_loss=0.04306, over 4689908.52 frames. ], batch size: 388, lr: 3.28e-03, grad_scale: 32.0 2023-10-03 01:36:42,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:36:42,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:45,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:36:46,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 01:36:48,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 01:36:48,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 01:36:48,299 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 01:36:49,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:36:49,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:36:51,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 01:36:52,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:36:54,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:36:54,343 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 01:36:58,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:36:58,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 01:36:58,255 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 01:37:02,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 01:37:03,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 01:37:04,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 01:37:04,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:37:05,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:37:06,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:37:06,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:37:07,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 01:37:07,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 01:37:08,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1091893.3333333333, ans=0.125 2023-10-03 01:37:08,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1091893.3333333333, ans=0.125 2023-10-03 01:37:09,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:37:11,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1091893.3333333333, ans=0.0 2023-10-03 01:37:12,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:37:12,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:37:13,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:37:13,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:37:13,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 01:37:15,193 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 01:37:19,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:37:25,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:37:27,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 01:37:27,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.26 vs. limit=15.0 2023-10-03 01:37:30,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:37:33,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:37:36,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:37:36,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 01:37:36,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:37:37,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:37:37,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:37:39,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:37:44,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 01:37:45,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 01:37:46,239 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.06 vs. limit=22.5 2023-10-03 01:37:46,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 01:37:46,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:37:46,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 01:37:48,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:37:50,997 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.843e+02 2.055e+02 2.239e+02 3.074e+02, threshold=4.110e+02, percent-clipped=0.0 2023-10-03 01:37:51,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:37:52,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 01:37:53,809 INFO [train.py:1046] (3/4) Epoch 31, batch 4450, loss[loss=0.1711, simple_loss=0.2606, pruned_loss=0.04081, over 24684.00 frames. ], tot_loss[loss=0.1656, simple_loss=0.2437, pruned_loss=0.04372, over 4690787.75 frames. ], batch size: 73, lr: 3.28e-03, grad_scale: 32.0 2023-10-03 01:37:57,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:38:00,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:00,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:38:07,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.74 vs. limit=15.0 2023-10-03 01:38:07,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:38:07,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:38:08,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.57 vs. limit=15.0 2023-10-03 01:38:10,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:12,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1092160.0, ans=0.1 2023-10-03 01:38:13,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:38:15,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:38:15,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:38:18,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 01:38:18,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:38:18,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:19,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:38:19,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:38:19,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1092160.0, ans=0.0 2023-10-03 01:38:22,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:38:26,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:26,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:28,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:38:28,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:38:29,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:38:32,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 01:38:34,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 01:38:34,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 01:38:34,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:38:37,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:38:38,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 01:38:42,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:38:45,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:46,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 01:38:47,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:38:47,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:38:47,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:38:47,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:38:48,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:38:51,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 01:38:52,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 01:38:53,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1092360.0, ans=0.125 2023-10-03 01:38:54,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:38:56,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:38:57,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:39:00,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:39:00,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:39:01,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:39:04,091 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.39 vs. limit=15.0 2023-10-03 01:39:04,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 01:39:06,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:39:07,822 INFO [train.py:1046] (3/4) Epoch 31, batch 4500, loss[loss=0.1534, simple_loss=0.236, pruned_loss=0.03541, over 24419.00 frames. ], tot_loss[loss=0.1653, simple_loss=0.2436, pruned_loss=0.04349, over 4699722.62 frames. ], batch size: 63, lr: 3.28e-03, grad_scale: 16.0 2023-10-03 01:39:10,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1092426.6666666667, ans=0.035 2023-10-03 01:39:12,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:39:13,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 01:39:13,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 01:39:14,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:39:19,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:39:20,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:39:20,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:39:22,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:39:22,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:39:22,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:39:33,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:39:36,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:39:36,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:39:38,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:39:40,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:39:44,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:39:47,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:39:48,524 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.32 vs. limit=6.0 2023-10-03 01:39:52,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:39:53,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1092626.6666666667, ans=0.125 2023-10-03 01:39:54,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:39:56,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 01:39:56,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:39:57,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:39:58,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1092626.6666666667, ans=0.2 2023-10-03 01:39:59,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:39:59,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1092626.6666666667, ans=0.0 2023-10-03 01:40:00,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:40:00,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:40:00,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 01:40:00,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:40:02,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:40:08,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:40:08,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:40:11,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:40:14,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:40:14,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:40:15,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 01:40:17,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 01:40:17,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 01:40:20,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 01:40:21,553 INFO [train.py:1046] (3/4) Epoch 31, batch 4550, loss[loss=0.1836, simple_loss=0.2638, pruned_loss=0.05171, over 24091.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2423, pruned_loss=0.04322, over 4690636.32 frames. ], batch size: 86, lr: 3.28e-03, grad_scale: 4.0 2023-10-03 01:40:22,877 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.951e+02 2.109e+02 2.572e+02 4.097e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-03 01:40:23,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 01:40:24,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:40:27,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:40:27,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:40:30,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:40:34,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:40:37,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:40:37,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:40:37,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:40:37,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:40:40,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:40:41,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:40:45,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:40:46,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 01:40:46,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1092826.6666666667, ans=0.1 2023-10-03 01:40:48,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 01:40:48,789 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.41 vs. limit=15.0 2023-10-03 01:40:49,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:40:49,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 01:40:50,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 01:40:51,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1092893.3333333333, ans=0.0 2023-10-03 01:40:52,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:40:56,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 01:40:58,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:41:00,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:00,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:00,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:41:03,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 01:41:05,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:41:07,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:08,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:41:09,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:41:12,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 01:41:12,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 01:41:12,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:41:14,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 01:41:16,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 01:41:16,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:41:17,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:17,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:41:18,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:18,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:41:20,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:41:21,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 01:41:22,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:41:22,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 01:41:23,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 01:41:24,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:41:24,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 01:41:24,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1093026.6666666667, ans=0.125 2023-10-03 01:41:27,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:41:27,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:41:30,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:41:30,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:41:30,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 01:41:33,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:41:34,910 INFO [train.py:1046] (3/4) Epoch 31, batch 4600, loss[loss=0.1587, simple_loss=0.2284, pruned_loss=0.04453, over 23737.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2407, pruned_loss=0.04283, over 4683847.92 frames. ], batch size: 232, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:41:34,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:41:36,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:37,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:41:38,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1093093.3333333333, ans=0.125 2023-10-03 01:41:40,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:41:40,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:41:42,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:41:44,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 01:41:45,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:41:49,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:41:51,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:41:52,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:41:56,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.83 vs. limit=15.0 2023-10-03 01:41:58,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 01:41:58,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:02,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:04,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:42:04,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:42:12,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 01:42:12,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:42:12,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:42:17,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1093226.6666666667, ans=0.125 2023-10-03 01:42:18,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:18,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:42:21,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:42:23,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 01:42:24,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:42:25,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1093293.3333333333, ans=0.1 2023-10-03 01:42:26,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:30,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:42:32,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:32,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 01:42:33,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:34,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 01:42:34,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:34,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:42:38,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:42:38,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:42:40,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:42:41,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 01:42:41,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 01:42:43,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 01:42:43,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:42:44,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:42:45,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:42:47,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:42:51,990 INFO [train.py:1046] (3/4) Epoch 31, batch 4650, loss[loss=0.1637, simple_loss=0.2538, pruned_loss=0.03684, over 24448.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2402, pruned_loss=0.04217, over 4690182.97 frames. ], batch size: 69, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:42:53,307 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.809e+02 1.983e+02 2.209e+02 6.032e+02, threshold=3.967e+02, percent-clipped=1.0 2023-10-03 01:42:56,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:42:57,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:42:57,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:42:59,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:42:59,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:42:59,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:42:59,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:43:03,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 01:43:06,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:43:09,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 01:43:09,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:43:09,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1093493.3333333333, ans=0.1 2023-10-03 01:43:11,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 01:43:11,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:43:12,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 01:43:12,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 01:43:12,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:12,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:43:16,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:43:16,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:43:18,242 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 01:43:21,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:43:22,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 01:43:24,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:24,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:43:25,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 01:43:26,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:43:28,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1093560.0, ans=0.125 2023-10-03 01:43:29,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:43:30,426 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.16 vs. limit=15.0 2023-10-03 01:43:33,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:43:34,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1093560.0, ans=0.0 2023-10-03 01:43:39,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:41,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:43:41,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:43:41,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:43:43,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1093626.6666666667, ans=0.125 2023-10-03 01:43:44,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 01:43:46,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 01:43:46,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 01:43:46,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 01:43:46,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1093626.6666666667, ans=0.0 2023-10-03 01:43:48,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:43:53,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:43:55,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:43:55,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 01:43:55,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:43:56,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:43:56,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:43:56,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:43:59,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:44:00,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:44:00,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:44:04,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:44:04,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:44:04,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:44:06,270 INFO [train.py:1046] (3/4) Epoch 31, batch 4700, loss[loss=0.1785, simple_loss=0.249, pruned_loss=0.05405, over 23781.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2409, pruned_loss=0.04212, over 4685817.00 frames. ], batch size: 179, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:44:06,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 01:44:06,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 01:44:08,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 01:44:17,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:44:18,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:44:19,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:44:22,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:44:23,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 01:44:27,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 01:44:27,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 01:44:28,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:44:30,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:44:31,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:44:34,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:44:37,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1093893.3333333333, ans=0.125 2023-10-03 01:44:38,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:44:39,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 01:44:41,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:44:42,321 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.07 vs. limit=12.0 2023-10-03 01:44:49,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 01:44:50,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:44:52,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:44:52,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1093960.0, ans=0.1 2023-10-03 01:44:55,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 01:44:56,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:44:59,912 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.67 vs. limit=15.0 2023-10-03 01:45:00,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:45:01,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 01:45:02,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:02,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:45:04,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:45:04,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:45:06,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 01:45:06,242 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 01:45:07,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:45:09,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1094026.6666666667, ans=0.125 2023-10-03 01:45:10,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:10,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:10,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 01:45:12,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:45:15,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 01:45:16,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1094026.6666666667, ans=0.0 2023-10-03 01:45:18,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:45:18,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:45:20,098 INFO [train.py:1046] (3/4) Epoch 31, batch 4750, loss[loss=0.1771, simple_loss=0.2664, pruned_loss=0.04387, over 24563.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2414, pruned_loss=0.04232, over 4690065.75 frames. ], batch size: 71, lr: 3.27e-03, grad_scale: 8.0 2023-10-03 01:45:20,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.21 vs. limit=22.5 2023-10-03 01:45:21,362 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.883e+02 2.081e+02 2.313e+02 2.638e+02, threshold=4.163e+02, percent-clipped=0.0 2023-10-03 01:45:22,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:45:22,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:45:24,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 01:45:25,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:45:27,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1094093.3333333333, ans=0.0 2023-10-03 01:45:28,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 01:45:28,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1094093.3333333333, ans=0.07 2023-10-03 01:45:31,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:45:31,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:45:31,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:45:34,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1094160.0, ans=0.125 2023-10-03 01:45:36,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 01:45:41,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:45:42,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 01:45:42,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:45:48,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:45:48,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:45:48,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:45:48,130 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 01:45:48,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 01:45:48,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1094226.6666666667, ans=0.0 2023-10-03 01:45:54,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 01:45:56,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:45:59,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:01,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:46:01,120 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 01:46:01,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:05,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:46:06,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:46:09,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 01:46:09,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 01:46:10,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:46:10,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:46:11,698 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.98 vs. limit=22.5 2023-10-03 01:46:12,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:46:12,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:46:12,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 01:46:12,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1094293.3333333333, ans=0.0 2023-10-03 01:46:14,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 01:46:18,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:46:21,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:46:21,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 01:46:23,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:46:25,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:26,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 01:46:26,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:27,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 01:46:29,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:46:30,168 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.09 vs. limit=15.0 2023-10-03 01:46:30,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 01:46:30,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1094360.0, ans=0.2 2023-10-03 01:46:32,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 01:46:33,413 INFO [train.py:1046] (3/4) Epoch 31, batch 4800, loss[loss=0.1561, simple_loss=0.2292, pruned_loss=0.04154, over 20219.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2418, pruned_loss=0.04232, over 4690774.00 frames. ], batch size: 44, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:46:33,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 01:46:33,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:46:34,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:46:34,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 01:46:39,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:39,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:46:42,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1094426.6666666667, ans=0.95 2023-10-03 01:46:45,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 01:46:46,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:46:46,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:46:46,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 01:46:50,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:46:50,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:46:52,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:46:54,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:46:57,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:57,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:46:58,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:46:58,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 01:46:58,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:46:59,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:47:00,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:47:01,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.33 vs. limit=22.5 2023-10-03 01:47:03,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:47:04,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:47:06,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:47:07,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 01:47:09,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1094560.0, ans=0.125 2023-10-03 01:47:10,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:10,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 01:47:10,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 01:47:11,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:12,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:47:13,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:47:13,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:47:13,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:47:16,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:47:16,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:47:21,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:47:24,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:24,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:47:29,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 01:47:29,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:47:30,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:30,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:47:30,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:34,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:47:37,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:47:37,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:37,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:47:37,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:47:39,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:47:43,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:47:43,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:43,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:47:44,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 01:47:45,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 01:47:45,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:47:45,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:47:47,766 INFO [train.py:1046] (3/4) Epoch 31, batch 4850, loss[loss=0.1907, simple_loss=0.2496, pruned_loss=0.06584, over 19450.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2422, pruned_loss=0.04302, over 4678043.61 frames. ], batch size: 388, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:47:47,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:47:47,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:47:49,140 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.923e+02 2.074e+02 2.370e+02 4.081e+02, threshold=4.148e+02, percent-clipped=0.0 2023-10-03 01:47:51,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:47:58,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 01:47:59,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:48:04,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:48:06,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 01:48:06,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:48:09,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:48:10,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:48:11,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:48:11,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 01:48:14,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:48:17,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:48:17,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 01:48:18,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 01:48:18,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 01:48:22,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:48:22,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:26,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:26,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 01:48:27,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 01:48:27,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:48:27,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1094893.3333333333, ans=0.0 2023-10-03 01:48:33,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:48:34,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 01:48:34,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:48:36,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:48:37,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:48:38,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 01:48:38,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:40,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 01:48:40,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:48:43,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:48:43,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 01:48:48,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=1095026.6666666667, ans=0.025 2023-10-03 01:48:52,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:48:57,053 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.22 vs. limit=15.0 2023-10-03 01:49:01,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:49:01,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:49:03,076 INFO [train.py:1046] (3/4) Epoch 31, batch 4900, loss[loss=0.1464, simple_loss=0.2201, pruned_loss=0.0363, over 24273.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2414, pruned_loss=0.04302, over 4683522.12 frames. ], batch size: 56, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:49:06,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 01:49:06,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:49:08,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1095093.3333333333, ans=0.0 2023-10-03 01:49:09,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1095093.3333333333, ans=0.125 2023-10-03 01:49:10,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1095093.3333333333, ans=0.125 2023-10-03 01:49:11,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:49:12,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:49:13,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:49:14,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 01:49:20,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 01:49:22,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 01:49:24,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 01:49:24,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:49:24,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:49:24,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:49:24,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:49:24,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 01:49:26,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 01:49:29,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 01:49:30,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:49:30,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:49:31,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:49:33,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:49:34,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:49:35,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:49:35,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 01:49:37,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:49:38,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:49:38,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 01:49:38,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 01:49:43,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 01:49:44,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:49:45,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:49:45,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:49:46,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1095293.3333333333, ans=0.125 2023-10-03 01:49:47,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:49:47,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 01:49:47,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:49:48,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 01:49:49,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1095293.3333333333, ans=0.0 2023-10-03 01:49:50,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:49:51,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 01:49:55,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:49:56,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 01:49:58,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:49:58,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 01:49:58,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 01:50:04,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:50:04,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1095360.0, ans=0.125 2023-10-03 01:50:05,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1095360.0, ans=0.5 2023-10-03 01:50:06,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:50:08,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 01:50:08,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:50:08,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:50:09,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:50:13,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:50:13,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:50:13,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:50:15,118 INFO [train.py:1046] (3/4) Epoch 31, batch 4950, loss[loss=0.1513, simple_loss=0.2193, pruned_loss=0.04169, over 23589.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.24, pruned_loss=0.04255, over 4691513.26 frames. ], batch size: 256, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:50:15,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 01:50:15,849 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.90 vs. limit=15.0 2023-10-03 01:50:16,401 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.656e+02 1.910e+02 2.098e+02 2.377e+02 3.455e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 01:50:16,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:50:19,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:50:19,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 01:50:21,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 01:50:22,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 01:50:22,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:50:22,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 01:50:22,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:22,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:50:24,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 01:50:24,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:50:27,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:50:27,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:50:28,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:50:30,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:50:32,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:32,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:50:32,888 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.28 vs. limit=10.0 2023-10-03 01:50:35,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 01:50:39,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:41,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:50:41,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:50:43,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:50:44,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:50:44,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 01:50:46,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 01:50:47,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:50:51,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:50:51,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:50:52,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:50:52,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:50:54,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 01:50:55,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:50:58,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:51:00,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:51:00,753 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.07 vs. limit=15.0 2023-10-03 01:51:02,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:51:02,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:51:03,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 01:51:03,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:51:05,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1095626.6666666667, ans=0.125 2023-10-03 01:51:06,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 01:51:10,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:51:11,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:51:11,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:51:11,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:51:11,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:51:13,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:51:15,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:51:15,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:51:15,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:51:17,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 01:51:21,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:51:26,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 01:51:26,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 01:51:29,247 INFO [train.py:1046] (3/4) Epoch 31, batch 5000, loss[loss=0.1522, simple_loss=0.2301, pruned_loss=0.03714, over 23654.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2399, pruned_loss=0.04225, over 4699094.69 frames. ], batch size: 149, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:51:29,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1095760.0, ans=0.0 2023-10-03 01:51:32,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:51:32,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:51:34,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 01:51:35,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 01:51:36,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:51:39,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 01:51:39,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 01:51:39,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 01:51:39,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1095760.0, ans=0.0 2023-10-03 01:51:40,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 01:51:40,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:51:41,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:51:42,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 01:51:42,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:51:42,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:51:43,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 01:51:43,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 01:51:45,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:51:46,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 01:51:46,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:51:46,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:51:46,844 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1095826.6666666667, ans=0.0 2023-10-03 01:51:48,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:51:48,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 01:51:48,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 01:51:48,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1095826.6666666667, ans=0.125 2023-10-03 01:51:50,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 01:51:50,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:51:51,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:51:52,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 01:51:53,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:51:54,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:51:56,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:51:57,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 01:52:00,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 01:52:00,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:52:00,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1095893.3333333333, ans=0.0 2023-10-03 01:52:01,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:52:04,409 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 01:52:06,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 01:52:06,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:52:06,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:10,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 01:52:10,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:52:10,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:52:11,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:52:13,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 01:52:13,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:52:16,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 01:52:18,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:52:24,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 01:52:28,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:36,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.81 vs. limit=6.0 2023-10-03 01:52:38,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:52:38,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:38,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:52:38,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:52:40,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:52:40,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 01:52:40,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:41,946 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 01:52:42,983 INFO [train.py:1046] (3/4) Epoch 31, batch 5050, loss[loss=0.1592, simple_loss=0.2466, pruned_loss=0.0359, over 24382.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2407, pruned_loss=0.0425, over 4708652.10 frames. ], batch size: 77, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:52:43,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:52:43,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 01:52:44,301 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.838e+02 2.026e+02 2.267e+02 4.820e+02, threshold=4.051e+02, percent-clipped=1.0 2023-10-03 01:52:44,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:52:45,141 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.99 vs. limit=15.0 2023-10-03 01:52:45,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:52:47,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:52:49,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 01:52:49,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:52:50,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:52:53,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 01:52:55,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 01:52:55,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:53:03,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 01:53:03,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 01:53:05,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:53:05,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 01:53:05,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:53:06,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:53:06,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:53:06,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:53:06,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 01:53:08,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 01:53:09,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:53:11,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:53:13,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:53:15,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 01:53:15,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:53:19,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 01:53:20,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:53:20,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:53:21,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:53:21,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:53:24,021 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.45 vs. limit=22.5 2023-10-03 01:53:24,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:53:26,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:53:27,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:27,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:53:27,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:53:29,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 01:53:29,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 01:53:30,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 01:53:35,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:53:35,309 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 01:53:35,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:53:35,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1096293.3333333333, ans=0.025 2023-10-03 01:53:36,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:53:38,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:38,076 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 01:53:40,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:53:40,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 01:53:40,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:43,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:53:44,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:53:44,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 01:53:46,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 01:53:48,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:53:50,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:53:50,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 01:53:54,207 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 01:53:54,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1096360.0, ans=0.1 2023-10-03 01:53:55,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:53:56,879 INFO [train.py:1046] (3/4) Epoch 31, batch 5100, loss[loss=0.17, simple_loss=0.2526, pruned_loss=0.04368, over 24632.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2413, pruned_loss=0.04226, over 4719903.76 frames. ], batch size: 65, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:53:58,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 01:53:59,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 01:54:00,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:54:01,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:54:01,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1096426.6666666667, ans=0.125 2023-10-03 01:54:04,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:54:04,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 01:54:04,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 01:54:07,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1096426.6666666667, ans=0.125 2023-10-03 01:54:10,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:54:11,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:54:14,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:54:16,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 01:54:17,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:54:18,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:54:20,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 01:54:22,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:23,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:23,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 01:54:23,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1096493.3333333333, ans=0.1 2023-10-03 01:54:27,515 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 01:54:27,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:27,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 01:54:27,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 01:54:32,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:54:39,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:54:43,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 01:54:43,717 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 01:54:43,725 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 01:54:46,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 01:54:46,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:54:46,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1096626.6666666667, ans=0.1 2023-10-03 01:54:49,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 01:54:52,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 01:54:54,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 01:54:57,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 01:54:58,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 01:55:00,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 01:55:01,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 01:55:02,385 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.94 vs. limit=15.0 2023-10-03 01:55:06,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:55:06,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:55:06,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:55:06,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:55:07,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 01:55:08,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:55:09,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 01:55:09,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 01:55:11,130 INFO [train.py:1046] (3/4) Epoch 31, batch 5150, loss[loss=0.1496, simple_loss=0.2326, pruned_loss=0.03327, over 24511.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2424, pruned_loss=0.04243, over 4717024.79 frames. ], batch size: 63, lr: 3.27e-03, grad_scale: 16.0 2023-10-03 01:55:11,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 01:55:11,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 01:55:11,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 01:55:11,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:55:12,448 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.869e+02 2.053e+02 2.257e+02 3.083e+02, threshold=4.107e+02, percent-clipped=0.0 2023-10-03 01:55:12,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 01:55:12,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1096760.0, ans=0.0 2023-10-03 01:55:13,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:55:16,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:55:16,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1096760.0, ans=0.125 2023-10-03 01:55:22,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 01:55:23,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 01:55:24,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:55:24,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 01:55:28,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 01:55:28,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:55:28,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:55:28,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:55:30,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:55:30,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 01:55:31,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.03 vs. limit=15.0 2023-10-03 01:55:31,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:55:31,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:55:34,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 01:55:36,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1096826.6666666667, ans=0.0 2023-10-03 01:55:37,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.29 vs. limit=22.5 2023-10-03 01:55:37,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 01:55:39,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 01:55:42,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 01:55:44,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 01:55:48,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:55:52,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:55:53,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:55:56,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:55:56,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:56:00,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 01:56:05,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:56:05,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 01:56:05,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 01:56:06,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1096960.0, ans=0.1 2023-10-03 01:56:06,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1096960.0, ans=0.125 2023-10-03 01:56:07,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:56:09,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:56:10,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 01:56:12,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1097026.6666666667, ans=0.0 2023-10-03 01:56:13,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:56:15,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 01:56:15,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1097026.6666666667, ans=0.0 2023-10-03 01:56:16,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:56:16,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1097026.6666666667, ans=0.1 2023-10-03 01:56:18,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:56:18,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 01:56:18,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 01:56:18,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 01:56:19,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:56:22,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:56:23,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:56:25,001 INFO [train.py:1046] (3/4) Epoch 31, batch 5200, loss[loss=0.1553, simple_loss=0.2216, pruned_loss=0.04447, over 23455.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2434, pruned_loss=0.04336, over 4691871.95 frames. ], batch size: 285, lr: 3.27e-03, grad_scale: 32.0 2023-10-03 01:56:26,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:56:28,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1097093.3333333333, ans=0.2 2023-10-03 01:56:32,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 01:56:33,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:56:33,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:56:36,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:56:37,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 01:56:39,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:56:39,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 01:56:41,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 01:56:43,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:56:45,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 01:56:48,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 01:56:49,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 01:56:49,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 01:56:49,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 01:56:52,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 01:56:52,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:56:52,488 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 01:56:52,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:56:53,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:56:53,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:56:55,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 01:56:56,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:57:01,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:57:03,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 01:57:03,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 01:57:03,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 01:57:05,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1097226.6666666667, ans=0.125 2023-10-03 01:57:07,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 01:57:07,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 01:57:07,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1097226.6666666667, ans=0.125 2023-10-03 01:57:13,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1097293.3333333333, ans=0.125 2023-10-03 01:57:14,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:57:14,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:57:16,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 01:57:17,095 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.79 vs. limit=15.0 2023-10-03 01:57:17,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:57:17,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 01:57:17,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:57:18,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:57:21,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:57:23,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 01:57:24,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:57:26,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:57:26,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:57:29,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1097360.0, ans=0.0 2023-10-03 01:57:33,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:57:35,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 01:57:35,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 01:57:35,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:57:36,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:57:38,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 01:57:39,487 INFO [train.py:1046] (3/4) Epoch 31, batch 5250, loss[loss=0.1718, simple_loss=0.2549, pruned_loss=0.04432, over 24469.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2429, pruned_loss=0.04311, over 4680796.33 frames. ], batch size: 69, lr: 3.27e-03, grad_scale: 32.0 2023-10-03 01:57:39,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 01:57:40,851 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.931e+02 2.126e+02 2.506e+02 4.070e+02, threshold=4.253e+02, percent-clipped=0.0 2023-10-03 01:57:42,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 01:57:43,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:57:45,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:57:45,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:57:52,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:57:53,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 01:57:56,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:57:57,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 01:57:59,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 01:57:59,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:58:00,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1097493.3333333333, ans=10.0 2023-10-03 01:58:02,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:58:03,181 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.26 vs. limit=8.0 2023-10-03 01:58:04,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1097493.3333333333, ans=0.125 2023-10-03 01:58:08,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1097560.0, ans=0.0 2023-10-03 01:58:16,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1097560.0, ans=0.125 2023-10-03 01:58:35,732 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.32 vs. limit=6.0 2023-10-03 01:58:47,947 INFO [train.py:1046] (3/4) Epoch 31, batch 5300, loss[loss=0.1293, simple_loss=0.193, pruned_loss=0.0328, over 22700.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2409, pruned_loss=0.04311, over 4679454.55 frames. ], batch size: 322, lr: 3.27e-03, grad_scale: 32.0 2023-10-03 01:59:02,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 01:59:02,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 01:59:02,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 01:59:02,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:59:02,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:02,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:02,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:02,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:59:02,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:02,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:59:02,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 01:59:03,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 01:59:03,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 01:59:03,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 01:59:03,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 01:59:03,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 01:59:03,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 01:59:03,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 01:59:03,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:03,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:59:03,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:59:04,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:59:04,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 01:59:04,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:59:04,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 01:59:04,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:04,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 01:59:04,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 01:59:04,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 01:59:04,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:04,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 01:59:05,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 01:59:05,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 01:59:05,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 01:59:05,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 01:59:05,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 01:59:05,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 01:59:05,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:05,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 01:59:05,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 01:59:06,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:59:06,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 01:59:07,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 01:59:07,120 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 01:59:07,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 01:59:07,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 01:59:07,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 01:59:07,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 01:59:07,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 01:59:07,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 01:59:07,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 01:59:14,803 INFO [train.py:1046] (3/4) Epoch 32, batch 0, loss[loss=0.1467, simple_loss=0.2352, pruned_loss=0.0291, over 24462.00 frames. ], tot_loss[loss=0.1467, simple_loss=0.2352, pruned_loss=0.0291, over 24462.00 frames. ], batch size: 63, lr: 3.21e-03, grad_scale: 32.0 2023-10-03 01:59:14,803 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 01:59:26,546 INFO [train.py:1078] (3/4) Epoch 32, validation: loss=0.3377, simple_loss=0.28, pruned_loss=0.1977, over 1125622.00 frames. 2023-10-03 01:59:26,547 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 01:59:26,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 01:59:26,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 01:59:29,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 01:59:33,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:33,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 01:59:34,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:35,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 01:59:35,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1097840.0, ans=0.0 2023-10-03 01:59:36,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 01:59:38,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:39,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:41,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1097906.6666666667, ans=0.125 2023-10-03 01:59:42,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 01:59:42,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:43,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 01:59:43,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:59:46,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 01:59:48,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 01:59:56,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 01:59:56,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 01:59:58,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 02:00:02,235 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=15.0 2023-10-03 02:00:02,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:00:02,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:00:04,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1097973.3333333333, ans=0.125 2023-10-03 02:00:06,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:00:06,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1097973.3333333333, ans=0.2 2023-10-03 02:00:09,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:00:09,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1098040.0, ans=0.125 2023-10-03 02:00:13,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:00:13,844 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1098040.0, ans=0.95 2023-10-03 02:00:19,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 02:00:21,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 02:00:21,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:00:21,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:00:23,171 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.938e+02 2.224e+02 2.523e+02 4.024e+02, threshold=4.448e+02, percent-clipped=0.0 2023-10-03 02:00:23,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:00:23,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:00:24,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 02:00:28,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:00:31,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:00:31,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1098106.6666666667, ans=0.1 2023-10-03 02:00:33,234 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.78 vs. limit=6.0 2023-10-03 02:00:34,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1098106.6666666667, ans=0.5 2023-10-03 02:00:35,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:00:38,077 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 02:00:39,944 INFO [train.py:1046] (3/4) Epoch 32, batch 50, loss[loss=0.1716, simple_loss=0.2578, pruned_loss=0.0427, over 24376.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2409, pruned_loss=0.04252, over 1069095.79 frames. ], batch size: 77, lr: 3.21e-03, grad_scale: 32.0 2023-10-03 02:00:40,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:00:44,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:00:45,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:00:45,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 02:00:46,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:00:46,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:00:48,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:00:50,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:00:52,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:00:56,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 02:00:56,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:01:00,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1098240.0, ans=0.2 2023-10-03 02:01:01,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:01:02,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 02:01:05,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 02:01:07,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:01:08,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:01:08,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:01:10,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:01:12,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:01:13,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 02:01:13,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:01:17,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1098306.6666666667, ans=0.125 2023-10-03 02:01:20,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:01:20,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:01:21,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:01:21,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 02:01:24,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:01:24,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:01:24,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 02:01:26,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:01:27,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 02:01:33,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:01:33,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:01:35,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:01:38,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:01:38,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:01:39,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 02:01:39,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 02:01:41,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:01:42,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:01:45,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:01:45,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:01:46,515 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.13 vs. limit=15.0 2023-10-03 02:01:47,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 02:01:47,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 02:01:48,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 02:01:49,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:01:49,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:01:51,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 02:01:51,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 02:01:52,589 INFO [train.py:1046] (3/4) Epoch 32, batch 100, loss[loss=0.1678, simple_loss=0.2516, pruned_loss=0.04202, over 24422.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2434, pruned_loss=0.04257, over 1889176.52 frames. ], batch size: 69, lr: 3.21e-03, grad_scale: 32.0 2023-10-03 02:01:52,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:01:52,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:01:55,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:01:55,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:01:58,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:01:59,014 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.68 vs. limit=15.0 2023-10-03 02:02:00,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:02:05,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:02:06,101 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:02:07,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 02:02:07,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:02:12,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:02:12,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:02:12,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:02:12,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:02:13,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:02:14,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 02:02:17,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 02:02:17,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:02:18,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:02:18,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:02:20,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1098640.0, ans=0.125 2023-10-03 02:02:22,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 02:02:22,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:02:23,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:02:23,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:02:25,430 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.59 vs. limit=15.0 2023-10-03 02:02:26,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:02:26,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1098640.0, ans=0.1 2023-10-03 02:02:29,542 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 02:02:29,557 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 02:02:30,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:02:30,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:02:35,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:02:38,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:02:39,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:02:43,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:02:45,386 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 02:02:47,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 02:02:47,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1098706.6666666667, ans=0.125 2023-10-03 02:02:49,850 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.824e+02 1.996e+02 2.210e+02 3.286e+02, threshold=3.993e+02, percent-clipped=0.0 2023-10-03 02:02:51,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:02:52,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:02:54,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:02:57,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:02:58,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:03:00,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:03:03,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:03:04,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:04,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:03:04,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:03:04,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:03:05,702 INFO [train.py:1046] (3/4) Epoch 32, batch 150, loss[loss=0.1619, simple_loss=0.253, pruned_loss=0.03538, over 24330.00 frames. ], tot_loss[loss=0.1654, simple_loss=0.2448, pruned_loss=0.04295, over 2514458.99 frames. ], batch size: 74, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:03:05,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 02:03:05,837 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 02:03:07,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:03:07,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:03:08,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:08,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:03:08,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 02:03:08,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 02:03:09,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 02:03:10,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:10,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:11,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:03:11,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:03:13,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:03:15,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:03:18,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:03:18,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:03:18,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:21,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:03:21,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:24,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:03:24,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:28,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1098906.6666666667, ans=0.0 2023-10-03 02:03:29,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 02:03:29,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 02:03:29,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 02:03:32,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:03:32,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:03:33,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:03:35,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:03:35,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:35,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:37,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:03:39,780 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 02:03:41,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:03:45,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:03:46,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1098973.3333333333, ans=0.125 2023-10-03 02:03:50,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:03:51,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 02:03:55,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:03:55,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:03:55,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:03:57,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:04:00,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:04:00,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:04:03,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:03,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 02:04:07,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:09,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:09,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:04:09,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:04:11,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:11,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1099106.6666666667, ans=0.125 2023-10-03 02:04:12,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 02:04:15,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:04:15,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:04:16,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:04:18,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:04:20,039 INFO [train.py:1046] (3/4) Epoch 32, batch 200, loss[loss=0.1758, simple_loss=0.257, pruned_loss=0.04732, over 23891.00 frames. ], tot_loss[loss=0.1663, simple_loss=0.245, pruned_loss=0.04378, over 2992502.15 frames. ], batch size: 86, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:04:20,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 02:04:20,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:04:20,143 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 02:04:24,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:04:26,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:04:28,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:04:30,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 02:04:32,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:04:32,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:34,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 02:04:36,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:04:37,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:38,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:04:42,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:04:42,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:04:42,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1099240.0, ans=0.1 2023-10-03 02:04:42,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1099240.0, ans=0.0 2023-10-03 02:04:43,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:04:48,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1099306.6666666667, ans=0.2 2023-10-03 02:05:00,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:05:00,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:05:01,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:05:03,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:05:03,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:05:03,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:05:04,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:05,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:05:07,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:05:07,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:05:07,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 02:05:09,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 02:05:09,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:05:12,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:05:13,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1099373.3333333333, ans=0.125 2023-10-03 02:05:15,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1099373.3333333333, ans=0.125 2023-10-03 02:05:15,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1099373.3333333333, ans=0.1 2023-10-03 02:05:15,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1099373.3333333333, ans=0.0 2023-10-03 02:05:18,855 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.373e+02 1.798e+02 1.948e+02 2.252e+02 2.874e+02, threshold=3.895e+02, percent-clipped=0.0 2023-10-03 02:05:18,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:05:19,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1099440.0, ans=0.0 2023-10-03 02:05:25,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.69 vs. limit=15.0 2023-10-03 02:05:25,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:27,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:05:33,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:34,322 INFO [train.py:1046] (3/4) Epoch 32, batch 250, loss[loss=0.171, simple_loss=0.2535, pruned_loss=0.04425, over 24046.00 frames. ], tot_loss[loss=0.166, simple_loss=0.2446, pruned_loss=0.04371, over 3362123.63 frames. ], batch size: 80, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:05:34,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 02:05:35,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:05:35,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:05:35,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:05:35,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1099506.6666666667, ans=0.0 2023-10-03 02:05:37,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:05:37,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 02:05:39,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:05:39,152 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 02:05:40,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:43,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:05:44,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:44,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:05:47,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:05:47,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:05:48,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:05:51,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:06:01,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:06:02,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:06:04,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:06:08,291 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=15.0 2023-10-03 02:06:10,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:06:11,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:06:13,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:06:14,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:06:15,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:06:15,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:06:15,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:06:16,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:06:18,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 02:06:20,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:06:21,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:06:21,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:06:21,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:06:21,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:06:23,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:06:23,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:06:25,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:06:25,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1099706.6666666667, ans=10.0 2023-10-03 02:06:26,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:06:28,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:06:30,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:06:35,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:06:36,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:06:42,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:06:42,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1099773.3333333333, ans=0.125 2023-10-03 02:06:43,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:06:45,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 02:06:46,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:06:46,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:06:48,128 INFO [train.py:1046] (3/4) Epoch 32, batch 300, loss[loss=0.1461, simple_loss=0.2217, pruned_loss=0.03528, over 23644.00 frames. ], tot_loss[loss=0.1646, simple_loss=0.2426, pruned_loss=0.04326, over 3669834.83 frames. ], batch size: 149, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:06:48,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 02:06:49,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:06:51,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:06:51,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 02:06:54,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:06:56,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:06:59,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:06:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 02:07:00,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:07:00,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:07:00,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 02:07:00,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:07:05,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:07:10,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:07:10,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 02:07:13,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 02:07:13,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:14,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:07:17,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:17,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 02:07:17,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:07:19,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:07:21,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:07:21,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:07:27,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 02:07:27,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 02:07:27,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:07:30,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:31,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 02:07:31,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:07:35,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:07:37,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:07:37,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 02:07:40,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:40,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:07:43,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:45,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:07:45,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 02:07:45,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:07:46,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:07:47,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 02:07:50,054 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.863e+02 2.088e+02 2.387e+02 3.568e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 02:07:50,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:07:50,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:07:51,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:07:53,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:07:53,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1100106.6666666667, ans=0.125 2023-10-03 02:07:54,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:00,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:08:00,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 02:08:03,153 INFO [train.py:1046] (3/4) Epoch 32, batch 350, loss[loss=0.1677, simple_loss=0.2587, pruned_loss=0.03839, over 24630.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2408, pruned_loss=0.04269, over 3882508.85 frames. ], batch size: 73, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:08:03,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:10,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:08:12,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:13,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:14,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 02:08:16,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:08:16,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 02:08:20,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:20,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 02:08:21,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:08:21,647 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=12.25 vs. limit=12.0 2023-10-03 02:08:24,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 02:08:25,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:08:28,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:08:29,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:08:29,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:08:29,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:08:30,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:08:30,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:30,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:08:31,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:08:31,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:38,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:08:39,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:08:39,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:08:40,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.75 vs. limit=10.0 2023-10-03 02:08:40,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:43,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1100306.6666666667, ans=0.2 2023-10-03 02:08:44,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 02:08:44,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:08:49,277 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.40 vs. limit=22.5 2023-10-03 02:08:51,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:08:51,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:08:51,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:08:52,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 02:08:54,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:08:55,876 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 02:08:57,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 02:08:57,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:01,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:09:01,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 02:09:04,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:06,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:09:09,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:10,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:10,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:09:12,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:09:13,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1100440.0, ans=0.125 2023-10-03 02:09:14,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:09:17,452 INFO [train.py:1046] (3/4) Epoch 32, batch 400, loss[loss=0.1573, simple_loss=0.2332, pruned_loss=0.04071, over 23381.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2405, pruned_loss=0.04255, over 4071902.62 frames. ], batch size: 93, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:09:17,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:09:18,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 02:09:18,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:21,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:09:23,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:09:23,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:26,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:27,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:28,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 02:09:29,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 02:09:29,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:09:30,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 02:09:31,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:34,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:09:34,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:09:34,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 02:09:35,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:09:35,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:09:35,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:09:35,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:09:39,478 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.21 vs. limit=10.0 2023-10-03 02:09:40,195 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 02:09:40,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 02:09:44,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1100573.3333333333, ans=0.125 2023-10-03 02:09:45,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:09:48,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:09:49,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 02:09:49,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 02:09:54,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:09:54,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:01,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 02:10:05,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:10:05,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 02:10:05,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1100706.6666666667, ans=0.0 2023-10-03 02:10:06,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:10:09,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:10:09,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 02:10:12,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1100706.6666666667, ans=0.1 2023-10-03 02:10:13,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:10:15,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:10:16,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:10:19,454 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.948e+02 2.204e+02 2.746e+02 3.868e+02, threshold=4.409e+02, percent-clipped=0.0 2023-10-03 02:10:19,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:19,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 02:10:20,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 02:10:22,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 02:10:22,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1100773.3333333333, ans=0.1 2023-10-03 02:10:24,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:10:24,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:10:28,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 02:10:30,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:10:31,426 INFO [train.py:1046] (3/4) Epoch 32, batch 450, loss[loss=0.1738, simple_loss=0.2572, pruned_loss=0.04521, over 24700.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2413, pruned_loss=0.04236, over 4220536.34 frames. ], batch size: 73, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:10:31,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:10:31,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:10:32,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 02:10:32,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:10:34,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:10:34,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:10:34,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 02:10:36,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:10:36,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:10:37,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:10:47,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:47,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:10:49,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 02:10:50,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 02:10:53,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:10:56,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:10:58,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:11:02,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:11:04,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:11:05,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 02:11:07,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 02:11:08,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 02:11:09,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:11:10,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:11:11,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:11:13,359 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 02:11:13,368 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 02:11:13,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:11:14,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:11:14,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 02:11:18,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 02:11:18,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:11:19,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 02:11:19,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 02:11:21,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:11:23,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:11:25,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:11:26,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 02:11:28,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1101040.0, ans=0.125 2023-10-03 02:11:29,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:11:30,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 02:11:32,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 02:11:33,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:11:40,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:11:41,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:11:42,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:11:42,972 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 02:11:45,610 INFO [train.py:1046] (3/4) Epoch 32, batch 500, loss[loss=0.153, simple_loss=0.2411, pruned_loss=0.03243, over 24464.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2419, pruned_loss=0.04235, over 4330255.72 frames. ], batch size: 63, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:11:46,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:11:47,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:11:47,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:11:47,600 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 02:11:47,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1101173.3333333333, ans=0.125 2023-10-03 02:11:49,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 02:11:49,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:11:53,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 02:11:57,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 02:11:57,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:12:00,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:12:00,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:12:01,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:12,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:12:12,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:12:12,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 02:12:12,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:12:13,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 02:12:13,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:12:18,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:12:18,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:12:19,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:12:19,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:12:21,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 02:12:24,350 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 02:12:25,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:12:27,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:27,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:29,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:29,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:12:30,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 02:12:32,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:12:33,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:12:36,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1101373.3333333333, ans=0.125 2023-10-03 02:12:37,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:12:41,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:12:42,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1101373.3333333333, ans=0.125 2023-10-03 02:12:46,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:12:47,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1101440.0, ans=0.0 2023-10-03 02:12:48,035 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.888e+02 2.126e+02 2.426e+02 3.415e+02, threshold=4.253e+02, percent-clipped=0.0 2023-10-03 02:12:50,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1101440.0, ans=0.125 2023-10-03 02:12:51,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 02:12:51,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:12:51,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:12:55,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 02:12:56,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:12:58,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:13:00,075 INFO [train.py:1046] (3/4) Epoch 32, batch 550, loss[loss=0.1667, simple_loss=0.2629, pruned_loss=0.0353, over 24328.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2421, pruned_loss=0.04217, over 4418404.77 frames. ], batch size: 74, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:13:02,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 02:13:05,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 02:13:05,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:13:05,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 02:13:05,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:13:05,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:13:07,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:08,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:08,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:13:10,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:13:10,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1101506.6666666667, ans=0.125 2023-10-03 02:13:11,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:13:13,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 02:13:13,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:13:19,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:19,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:22,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:13:22,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:25,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 02:13:27,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 02:13:27,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1101573.3333333333, ans=0.1 2023-10-03 02:13:29,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:13:34,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:13:34,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:13:37,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:13:39,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:39,990 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 02:13:41,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:13:42,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=1101640.0, ans=0.02 2023-10-03 02:13:43,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:13:44,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:13:46,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:13:46,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:13:46,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:48,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 02:13:49,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 02:13:49,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:13:49,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1101706.6666666667, ans=0.2 2023-10-03 02:13:50,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:13:50,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:13:50,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:13:55,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:13:55,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:13:56,247 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.33 vs. limit=6.0 2023-10-03 02:13:58,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:13:59,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:13:59,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 02:13:59,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:14:00,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:14:02,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:14:02,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:14:04,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 02:14:04,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 02:14:11,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 02:14:11,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1101773.3333333333, ans=0.125 2023-10-03 02:14:13,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 02:14:14,388 INFO [train.py:1046] (3/4) Epoch 32, batch 600, loss[loss=0.1486, simple_loss=0.2363, pruned_loss=0.03038, over 24652.00 frames. ], tot_loss[loss=0.1642, simple_loss=0.2431, pruned_loss=0.04261, over 4474638.72 frames. ], batch size: 68, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:14:16,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:14:16,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:14:16,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:14:17,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1101840.0, ans=0.2 2023-10-03 02:14:21,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1101840.0, ans=0.2 2023-10-03 02:14:22,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:14:25,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:14:26,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 02:14:27,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:14:29,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:14:32,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:14:35,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 02:14:35,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:14:36,129 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.80 vs. limit=15.0 2023-10-03 02:14:38,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1101906.6666666667, ans=0.1 2023-10-03 02:14:41,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 02:14:45,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:14:45,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:14:45,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:14:49,946 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=15.0 2023-10-03 02:14:50,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:14:50,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:14:50,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:14:59,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:15:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:15:03,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:15:03,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:15:08,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1102040.0, ans=0.125 2023-10-03 02:15:10,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 02:15:16,461 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.923e+02 2.160e+02 2.483e+02 3.446e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-03 02:15:16,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 02:15:16,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:15:18,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 02:15:19,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:15:22,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 02:15:22,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:15:22,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:15:26,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1102106.6666666667, ans=0.125 2023-10-03 02:15:28,706 INFO [train.py:1046] (3/4) Epoch 32, batch 650, loss[loss=0.1618, simple_loss=0.2516, pruned_loss=0.03604, over 24558.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2422, pruned_loss=0.04218, over 4537984.63 frames. ], batch size: 71, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:15:28,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 02:15:30,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:15:32,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:15:32,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1102173.3333333333, ans=0.125 2023-10-03 02:15:33,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:15:36,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:15:38,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 02:15:39,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:15:43,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:15:43,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:15:47,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:15:51,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 02:15:54,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:15:54,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:15:58,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:15:58,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 02:15:59,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:16:00,176 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:16:01,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:01,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:16:03,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:04,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:16:05,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:16:05,911 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 02:16:05,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:16:05,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:16:08,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:08,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1102306.6666666667, ans=0.0 2023-10-03 02:16:11,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:16:11,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:16:11,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:16:11,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 02:16:13,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:16:13,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:16:14,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:16:14,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:16:15,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:16:19,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 02:16:20,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 02:16:20,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:20,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:16:21,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:16:21,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:16:22,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:16:28,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:28,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:16:29,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:16:34,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:16:34,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:16:34,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:16:34,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1102440.0, ans=0.0 2023-10-03 02:16:35,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1102440.0, ans=0.0 2023-10-03 02:16:39,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1102440.0, ans=0.1 2023-10-03 02:16:41,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:16:41,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:16:42,284 INFO [train.py:1046] (3/4) Epoch 32, batch 700, loss[loss=0.1648, simple_loss=0.2602, pruned_loss=0.0347, over 24593.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2415, pruned_loss=0.04188, over 4583389.05 frames. ], batch size: 71, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:16:42,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:16:42,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:16:48,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 02:16:48,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 02:16:51,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 02:16:51,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:16:53,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:16:54,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 02:17:00,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:17:02,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:17:02,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:17:05,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:17:05,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:17:08,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:17:10,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 02:17:11,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:17:12,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 02:17:14,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 02:17:14,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1102640.0, ans=0.0 2023-10-03 02:17:17,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:17:19,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:17:20,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:17:23,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:17:23,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 02:17:25,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1102640.0, ans=0.125 2023-10-03 02:17:28,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:17:28,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:17:28,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 02:17:29,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1102706.6666666667, ans=0.125 2023-10-03 02:17:32,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:17:34,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:17:36,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:17:37,592 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.12 vs. limit=15.0 2023-10-03 02:17:42,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:17:43,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 02:17:45,708 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.878e+02 2.011e+02 2.266e+02 3.115e+02, threshold=4.021e+02, percent-clipped=0.0 2023-10-03 02:17:45,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 02:17:45,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 02:17:49,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:17:50,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:17:51,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:17:53,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:17:53,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 02:17:53,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1102773.3333333333, ans=0.125 2023-10-03 02:17:58,515 INFO [train.py:1046] (3/4) Epoch 32, batch 750, loss[loss=0.1599, simple_loss=0.2324, pruned_loss=0.04366, over 23356.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2407, pruned_loss=0.04142, over 4607884.09 frames. ], batch size: 285, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:17:58,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 02:17:58,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1102840.0, ans=0.1 2023-10-03 02:17:59,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 02:17:59,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 02:18:00,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 02:18:01,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 02:18:01,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:18:02,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 02:18:03,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:18:05,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:18:07,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:18:09,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:18:09,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:18:09,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:18:12,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:18:13,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:18:15,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:18:18,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:18:18,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:18:20,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 02:18:20,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:18:21,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:18:23,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:18:24,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:18:26,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 02:18:26,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:18:28,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 02:18:28,113 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 02:18:28,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 02:18:28,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:18:28,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:18:31,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:18:38,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:18:38,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:18:39,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:18:42,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:18:43,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:18:43,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 02:18:43,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:18:45,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 02:18:45,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:18:48,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:18:49,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 02:18:49,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:18:54,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:18:55,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:18:55,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:18:59,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:19:01,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 02:19:01,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:19:02,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:19:05,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1103106.6666666667, ans=0.1 2023-10-03 02:19:06,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:19:06,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:19:09,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:19:09,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:19:11,917 INFO [train.py:1046] (3/4) Epoch 32, batch 800, loss[loss=0.1644, simple_loss=0.2372, pruned_loss=0.04576, over 24455.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.242, pruned_loss=0.04158, over 4641349.82 frames. ], batch size: 58, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:19:13,754 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:19:16,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:19:16,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:17,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1103173.3333333333, ans=0.125 2023-10-03 02:19:18,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:19:18,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:19:19,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:19,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:21,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:21,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1103173.3333333333, ans=0.125 2023-10-03 02:19:24,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:19:26,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:19:26,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1103240.0, ans=10.0 2023-10-03 02:19:28,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 02:19:28,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:30,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:19:30,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:19:31,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:19:31,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 02:19:31,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:19:32,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 02:19:33,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:36,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:19:38,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1103240.0, ans=15.0 2023-10-03 02:19:39,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:19:39,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:19:39,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1103240.0, ans=0.125 2023-10-03 02:19:42,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:42,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:19:48,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:19:48,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:19:48,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 02:19:51,047 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 02:19:52,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 02:19:52,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:19:52,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:19:54,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:19:54,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:20:00,538 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 02:20:00,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 02:20:03,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:20:06,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:20:06,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1103373.3333333333, ans=0.2 2023-10-03 02:20:09,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:20:09,947 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.95 vs. limit=22.5 2023-10-03 02:20:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:20:13,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 02:20:13,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:20:14,953 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.903e+02 2.081e+02 2.304e+02 3.546e+02, threshold=4.162e+02, percent-clipped=0.0 2023-10-03 02:20:16,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1103440.0, ans=0.1 2023-10-03 02:20:17,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 02:20:22,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:20:25,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:20:25,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 02:20:26,697 INFO [train.py:1046] (3/4) Epoch 32, batch 850, loss[loss=0.1754, simple_loss=0.2588, pruned_loss=0.04607, over 24380.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2424, pruned_loss=0.04166, over 4666763.62 frames. ], batch size: 77, lr: 3.21e-03, grad_scale: 16.0 2023-10-03 02:20:26,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:20:28,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:20:28,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 02:20:28,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:20:28,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1103506.6666666667, ans=0.035 2023-10-03 02:20:30,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:20:31,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:20:34,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:20:34,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:20:34,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 02:20:36,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 02:20:36,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 02:20:37,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:20:37,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:20:38,529 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.15 vs. limit=22.5 2023-10-03 02:20:40,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:20:40,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:20:40,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:20:44,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:20:46,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:20:46,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 02:20:50,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 02:20:53,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:20:54,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 02:20:59,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 02:20:59,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 02:21:03,904 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 02:21:03,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:21:03,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:21:03,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:21:04,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1103640.0, ans=0.0 2023-10-03 02:21:06,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:21:08,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:21:08,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 02:21:11,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:21:11,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:21:11,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1103706.6666666667, ans=0.0 2023-10-03 02:21:12,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:21:12,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:21:15,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:21:15,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 02:21:17,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 02:21:21,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:21:21,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:21:21,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:21:21,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:21:22,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:21:25,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:21:27,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:21:28,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:21:30,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:21:31,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:21:36,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:21:37,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:21:39,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 02:21:39,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:21:40,459 INFO [train.py:1046] (3/4) Epoch 32, batch 900, loss[loss=0.1661, simple_loss=0.2567, pruned_loss=0.03781, over 24673.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2435, pruned_loss=0.04202, over 4688529.57 frames. ], batch size: 73, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:21:41,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:21:43,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 02:21:47,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:21:50,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:21:51,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 02:21:55,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:21:56,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 02:21:56,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 02:21:58,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:21:58,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:21:58,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:21:59,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:22:07,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:07,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:22:07,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:22:11,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:22:11,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1103973.3333333333, ans=0.125 2023-10-03 02:22:16,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 02:22:16,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:22:21,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1103973.3333333333, ans=0.1 2023-10-03 02:22:22,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:22:22,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:22:23,677 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 02:22:25,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 02:22:31,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:22:31,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1104040.0, ans=0.0 2023-10-03 02:22:32,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:22:32,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:22:37,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1104040.0, ans=0.0 2023-10-03 02:22:38,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:38,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:22:40,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 02:22:40,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:22:43,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 02:22:45,147 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.827e+02 2.004e+02 2.233e+02 3.058e+02, threshold=4.009e+02, percent-clipped=0.0 2023-10-03 02:22:45,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:22:46,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:47,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:22:47,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:22:50,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 02:22:52,067 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 02:22:53,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 02:22:53,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 02:22:54,751 INFO [train.py:1046] (3/4) Epoch 32, batch 950, loss[loss=0.1485, simple_loss=0.2149, pruned_loss=0.04106, over 22692.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2433, pruned_loss=0.04223, over 4694292.71 frames. ], batch size: 322, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:22:56,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:22:59,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 02:23:01,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1104173.3333333333, ans=0.125 2023-10-03 02:23:05,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:23:05,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1104173.3333333333, ans=0.0 2023-10-03 02:23:06,127 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.17 vs. limit=22.5 2023-10-03 02:23:06,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:06,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:07,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 02:23:09,851 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 02:23:14,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:14,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:23:15,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:23:16,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:23:16,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 02:23:17,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 02:23:20,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:20,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1104240.0, ans=0.0 2023-10-03 02:23:21,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 02:23:22,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:23:26,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:26,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:23:26,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:23:27,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 02:23:29,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1104306.6666666667, ans=0.2 2023-10-03 02:23:30,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 02:23:31,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1104306.6666666667, ans=0.0 2023-10-03 02:23:32,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:23:34,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:23:37,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:23:37,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:23:41,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 02:23:44,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 02:23:44,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:23:44,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:23:45,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:45,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:23:49,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 02:23:49,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:23:52,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:23:52,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:23:52,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 02:23:52,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:23:52,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:23:52,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 02:23:54,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1104440.0, ans=0.125 2023-10-03 02:23:57,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:23:58,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:24:04,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:24:06,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 02:24:06,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 02:24:08,737 INFO [train.py:1046] (3/4) Epoch 32, batch 1000, loss[loss=0.1535, simple_loss=0.2464, pruned_loss=0.03029, over 24666.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2417, pruned_loss=0.04204, over 4692396.85 frames. ], batch size: 73, lr: 3.21e-03, grad_scale: 8.0 2023-10-03 02:24:08,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:24:13,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 02:24:14,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:24:18,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:24:19,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 02:24:19,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 02:24:22,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1104573.3333333333, ans=0.125 2023-10-03 02:24:23,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:24:23,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:24:25,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:24:27,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1104573.3333333333, ans=0.0 2023-10-03 02:24:29,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 02:24:32,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 02:24:34,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 02:24:34,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:24:36,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 02:24:37,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 02:24:37,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 02:24:38,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:24:38,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:24:48,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:24:49,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:24:49,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:24:51,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:24:51,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 02:24:51,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:24:51,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff3.min_abs, batch_count=1104640.0, ans=0.2 2023-10-03 02:24:52,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:24:53,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:24:55,213 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 02:24:58,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 02:24:58,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 02:25:00,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 02:25:01,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:25:07,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:07,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:25:08,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:09,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:25:10,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 02:25:11,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:25:13,047 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.852e+02 2.033e+02 2.255e+02 3.341e+02, threshold=4.066e+02, percent-clipped=0.0 2023-10-03 02:25:13,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 02:25:13,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 02:25:13,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:25:13,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:25:18,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:25:19,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:25:22,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:25:23,464 INFO [train.py:1046] (3/4) Epoch 32, batch 1050, loss[loss=0.1629, simple_loss=0.245, pruned_loss=0.04044, over 23225.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2403, pruned_loss=0.04172, over 4695471.11 frames. ], batch size: 119, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:25:24,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:25:26,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:25:28,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:25:29,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:31,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:25:33,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:25:34,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1104840.0, ans=0.0 2023-10-03 02:25:35,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:25:38,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:25:39,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:25:39,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:25:41,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:25:41,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 02:25:42,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:25:43,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 02:25:46,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:25:46,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 02:25:46,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:25:52,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:25:54,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:25:54,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:25:55,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 02:25:55,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 02:25:57,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:26:00,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 02:26:02,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1104973.3333333333, ans=0.09899494936611666 2023-10-03 02:26:02,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1104973.3333333333, ans=0.0 2023-10-03 02:26:03,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 02:26:03,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:26:07,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 02:26:09,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 02:26:09,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:26:10,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:26:13,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:26:16,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 02:26:19,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 02:26:19,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 02:26:19,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:26:19,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:26:21,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 02:26:21,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1105106.6666666667, ans=0.125 2023-10-03 02:26:25,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:26:27,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:26:27,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:26:28,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:26:28,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:26:31,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:26:31,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 02:26:33,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:26:33,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 02:26:34,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 02:26:36,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:26:36,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1105173.3333333333, ans=0.0 2023-10-03 02:26:37,354 INFO [train.py:1046] (3/4) Epoch 32, batch 1100, loss[loss=0.1665, simple_loss=0.243, pruned_loss=0.04497, over 23479.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2398, pruned_loss=0.0417, over 4701008.95 frames. ], batch size: 134, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:26:40,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:26:44,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:26:49,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:26:51,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:26:51,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:26:52,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 02:26:53,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:26:55,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:26:56,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:26:56,860 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:26:57,476 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.83 vs. limit=22.5 2023-10-03 02:27:01,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:27:01,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 02:27:02,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:27:03,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:27:03,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:27:07,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:27:08,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:27:13,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:27:15,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 02:27:17,328 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 02:27:17,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:20,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:20,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:27:22,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:27:23,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 02:27:23,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:27:23,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:27:25,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:27:25,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:25,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 02:27:29,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:27:29,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 02:27:32,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:27:38,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:27:40,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1105440.0, ans=0.125 2023-10-03 02:27:41,293 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.860e+02 2.079e+02 2.474e+02 4.878e+02, threshold=4.158e+02, percent-clipped=1.0 2023-10-03 02:27:41,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 02:27:41,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 02:27:42,121 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.82 vs. limit=6.0 2023-10-03 02:27:42,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:27:42,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:27:44,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:27:45,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 02:27:47,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:27:47,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:27:48,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 02:27:50,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:27:50,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 02:27:51,630 INFO [train.py:1046] (3/4) Epoch 32, batch 1150, loss[loss=0.1577, simple_loss=0.2443, pruned_loss=0.03558, over 24692.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2407, pruned_loss=0.04219, over 4694281.90 frames. ], batch size: 65, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:27:51,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:27:51,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:27:53,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:27:57,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:27:59,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:28:01,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:28:01,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:28:02,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 02:28:03,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:28:04,926 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.10 vs. limit=15.0 2023-10-03 02:28:05,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 02:28:06,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:28:06,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:28:10,465 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.42 vs. limit=15.0 2023-10-03 02:28:11,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 02:28:12,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1105573.3333333333, ans=0.125 2023-10-03 02:28:14,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:28:17,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:28:17,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:28:18,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 02:28:18,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:28:18,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:28:18,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1105573.3333333333, ans=0.125 2023-10-03 02:28:19,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1105573.3333333333, ans=0.125 2023-10-03 02:28:21,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 02:28:21,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1105640.0, ans=0.0 2023-10-03 02:28:23,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:28:24,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:28:28,715 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.78 vs. limit=15.0 2023-10-03 02:28:35,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:28:41,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:28:41,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 02:28:41,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1105706.6666666667, ans=0.0 2023-10-03 02:28:42,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:28:42,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:28:49,790 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 02:28:49,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:28:54,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1105773.3333333333, ans=0.125 2023-10-03 02:28:55,478 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 02:28:58,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:28:58,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:29:00,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:29:00,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:29:03,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:29:06,566 INFO [train.py:1046] (3/4) Epoch 32, batch 1200, loss[loss=0.1534, simple_loss=0.2358, pruned_loss=0.03553, over 24432.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2415, pruned_loss=0.04251, over 4687970.52 frames. ], batch size: 63, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:29:08,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:29:08,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:29:08,238 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:29:10,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:29:10,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:29:10,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:29:13,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:29:15,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:29:16,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:29:16,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:29:19,603 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 02:29:19,923 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 02:29:21,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1105906.6666666667, ans=0.04949747468305833 2023-10-03 02:29:22,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 02:29:27,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:29:30,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:29:32,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:29:33,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:29:33,590 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 02:29:34,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.41 vs. limit=6.0 2023-10-03 02:29:35,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:29:42,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:29:42,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:29:42,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 02:29:42,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:29:45,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 02:29:50,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 02:29:50,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:29:52,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:29:53,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:29:54,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:29:55,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:29:55,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:29:58,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:29:58,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 02:29:58,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:29:59,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:29:59,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:30:02,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:30:02,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:30:03,057 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.50 vs. limit=15.0 2023-10-03 02:30:04,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1106106.6666666667, ans=0.125 2023-10-03 02:30:07,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 02:30:09,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:30:11,107 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.987e+02 2.198e+02 2.518e+02 3.756e+02, threshold=4.395e+02, percent-clipped=0.0 2023-10-03 02:30:12,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 02:30:15,360 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 02:30:16,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:30:19,258 INFO [train.py:1046] (3/4) Epoch 32, batch 1250, loss[loss=0.2012, simple_loss=0.269, pruned_loss=0.06671, over 19674.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2419, pruned_loss=0.0427, over 4682772.78 frames. ], batch size: 388, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:30:19,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:30:20,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:30:22,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:30:24,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 02:30:27,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:30:29,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:30:29,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 02:30:31,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:30:32,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:30:37,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 02:30:37,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:30:38,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:30:38,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:30:42,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:30:45,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 02:30:45,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:30:45,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:30:47,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:30:47,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:30:49,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:30:50,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:30:52,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1106306.6666666667, ans=0.125 2023-10-03 02:30:57,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 02:30:57,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:30:59,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:31:01,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 02:31:01,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:31:01,512 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 02:31:01,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:01,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:07,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:31:10,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:31:11,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:31:11,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 02:31:11,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 02:31:11,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 02:31:14,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1106373.3333333333, ans=0.0 2023-10-03 02:31:15,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:31:16,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 02:31:17,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:20,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 02:31:21,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:31:22,694 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.59 vs. limit=12.0 2023-10-03 02:31:23,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 02:31:24,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 02:31:24,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:31:26,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 02:31:27,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:31:29,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 02:31:32,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:31:32,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:31:33,836 INFO [train.py:1046] (3/4) Epoch 32, batch 1300, loss[loss=0.1505, simple_loss=0.2359, pruned_loss=0.03257, over 24594.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2425, pruned_loss=0.04263, over 4701319.75 frames. ], batch size: 60, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:31:33,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:31:35,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:31:35,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1106506.6666666667, ans=0.125 2023-10-03 02:31:38,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:31:38,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1106506.6666666667, ans=0.07 2023-10-03 02:31:40,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 02:31:44,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:31:46,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 02:31:47,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:31:49,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:31:49,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:31:50,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 02:31:54,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:31:56,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:31:56,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 02:31:59,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:32:00,193 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.38 vs. limit=15.0 2023-10-03 02:32:03,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:05,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:32:05,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:32:07,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:07,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:32:07,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 02:32:08,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1106640.0, ans=15.0 2023-10-03 02:32:09,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 02:32:15,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:32:16,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:32:17,050 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.46 vs. limit=15.0 2023-10-03 02:32:17,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 02:32:17,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1106706.6666666667, ans=0.2 2023-10-03 02:32:18,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 02:32:20,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:32:21,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:32:23,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 02:32:23,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:32:23,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 02:32:24,077 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.10 vs. limit=22.5 2023-10-03 02:32:24,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:32:28,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:32:28,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:32:33,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 02:32:34,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 02:32:34,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 02:32:37,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:32:41,087 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.861e+02 2.113e+02 2.556e+02 3.728e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 02:32:41,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 02:32:42,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:48,760 INFO [train.py:1046] (3/4) Epoch 32, batch 1350, loss[loss=0.1452, simple_loss=0.2068, pruned_loss=0.04178, over 23418.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2416, pruned_loss=0.0426, over 4708455.58 frames. ], batch size: 285, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:32:50,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 02:32:53,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:32:54,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:32:58,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:32:58,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:33:00,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:33:00,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:33:02,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1106906.6666666667, ans=0.125 2023-10-03 02:33:03,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:33:03,866 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.12 vs. limit=15.0 2023-10-03 02:33:04,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 02:33:05,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:33:07,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:33:08,276 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.27 vs. limit=12.0 2023-10-03 02:33:12,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 02:33:12,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:33:14,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:33:14,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 02:33:17,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 02:33:19,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 02:33:21,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:33:21,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 02:33:22,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1106973.3333333333, ans=0.0 2023-10-03 02:33:28,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1106973.3333333333, ans=0.125 2023-10-03 02:33:29,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:33:36,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1107040.0, ans=0.125 2023-10-03 02:33:37,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:33:37,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:33:37,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 02:33:39,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1107040.0, ans=0.125 2023-10-03 02:33:42,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:33:42,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1107040.0, ans=0.125 2023-10-03 02:33:43,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 02:33:43,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:33:45,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:33:48,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:33:50,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 02:33:51,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:33:54,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 02:33:57,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 02:33:59,475 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.59 vs. limit=15.0 2023-10-03 02:34:01,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 02:34:01,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1107173.3333333333, ans=0.0 2023-10-03 02:34:03,009 INFO [train.py:1046] (3/4) Epoch 32, batch 1400, loss[loss=0.1464, simple_loss=0.2165, pruned_loss=0.0382, over 22762.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2395, pruned_loss=0.04249, over 4688396.98 frames. ], batch size: 322, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:34:03,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:34:03,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1107173.3333333333, ans=0.95 2023-10-03 02:34:07,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:34:07,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:34:09,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1107173.3333333333, ans=0.5 2023-10-03 02:34:12,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 02:34:13,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 02:34:13,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1107173.3333333333, ans=10.0 2023-10-03 02:34:25,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:34:27,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:34:28,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:34:28,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:34:31,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:34:32,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1107306.6666666667, ans=0.125 2023-10-03 02:34:33,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 02:34:41,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:34:41,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:34:47,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 02:34:47,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:34:49,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:34:49,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:34:49,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:34:49,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1107373.3333333333, ans=0.1 2023-10-03 02:34:51,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:34:51,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:34:52,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:34:53,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 02:34:53,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:34:58,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:02,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:35:09,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 02:35:09,929 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.05 vs. limit=15.0 2023-10-03 02:35:10,701 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.787e+02 1.940e+02 2.244e+02 3.961e+02, threshold=3.881e+02, percent-clipped=0.0 2023-10-03 02:35:10,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 02:35:10,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:35:14,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 02:35:14,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:35:14,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1107440.0, ans=0.09899494936611666 2023-10-03 02:35:16,733 INFO [train.py:1046] (3/4) Epoch 32, batch 1450, loss[loss=0.1601, simple_loss=0.2509, pruned_loss=0.03464, over 24422.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2395, pruned_loss=0.04216, over 4707909.47 frames. ], batch size: 69, lr: 3.20e-03, grad_scale: 4.0 2023-10-03 02:35:16,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:35:20,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:35:22,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:35:22,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:22,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 02:35:26,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:35:27,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:35:29,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:35:30,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 02:35:30,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:35:32,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 02:35:32,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:33,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:33,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 02:35:33,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:35:35,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:35:35,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 02:35:35,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:36,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:35:37,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:38,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1107573.3333333333, ans=0.0 2023-10-03 02:35:41,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.88 vs. limit=22.5 2023-10-03 02:35:41,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:46,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:35:47,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:35:49,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:35:49,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:50,527 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.64 vs. limit=22.5 2023-10-03 02:35:51,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:35:51,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:35:51,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:35:53,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:35:57,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 02:35:58,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:36:00,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1107706.6666666667, ans=0.125 2023-10-03 02:36:01,531 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 02:36:03,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1107706.6666666667, ans=0.1 2023-10-03 02:36:04,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:36:06,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:36:06,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1107706.6666666667, ans=0.0 2023-10-03 02:36:07,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:36:08,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 02:36:12,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:36:14,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 02:36:15,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 02:36:15,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:36:16,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1107773.3333333333, ans=0.025 2023-10-03 02:36:19,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:36:21,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:36:21,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 02:36:22,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 02:36:23,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 02:36:25,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:36:25,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:36:27,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1107773.3333333333, ans=0.1 2023-10-03 02:36:30,826 INFO [train.py:1046] (3/4) Epoch 32, batch 1500, loss[loss=0.1777, simple_loss=0.264, pruned_loss=0.04571, over 24417.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2398, pruned_loss=0.04184, over 4713366.37 frames. ], batch size: 77, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:36:31,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1107840.0, ans=15.0 2023-10-03 02:36:35,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 02:36:35,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:36:35,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:36:37,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:36:38,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:36:38,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1107840.0, ans=0.125 2023-10-03 02:36:39,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:36:39,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 02:36:40,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1107840.0, ans=0.125 2023-10-03 02:36:41,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:36:42,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:36:42,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:36:42,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:36:45,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:36:45,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:36:52,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:36:52,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 02:36:52,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:36:54,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:36:54,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:36:58,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 02:36:59,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1107973.3333333333, ans=0.125 2023-10-03 02:37:02,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 02:37:04,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:37:04,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 02:37:06,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 02:37:09,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:37:10,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:37:10,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:37:11,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.whiten.whitening_limit, batch_count=1107973.3333333333, ans=12.0 2023-10-03 02:37:11,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 02:37:11,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:37:11,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:37:13,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 02:37:14,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:37:17,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:37:17,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 02:37:24,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1108040.0, ans=0.125 2023-10-03 02:37:25,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:37:25,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:37:27,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1108040.0, ans=0.125 2023-10-03 02:37:29,733 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 02:37:31,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:31,127 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 02:37:32,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:37:33,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:37:33,889 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 02:37:35,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:37:38,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 02:37:40,021 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.890e+02 2.007e+02 2.185e+02 3.461e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-03 02:37:41,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:44,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:37:44,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:45,544 INFO [train.py:1046] (3/4) Epoch 32, batch 1550, loss[loss=0.152, simple_loss=0.2317, pruned_loss=0.03616, over 24332.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2396, pruned_loss=0.0417, over 4715726.59 frames. ], batch size: 61, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:37:45,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:37:45,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:37:46,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:37:48,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 02:37:48,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 02:37:48,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:37:49,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 02:37:49,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 02:37:53,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:37:54,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:37:54,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:37:54,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:37:57,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:37:57,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:38:00,218 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 02:38:00,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:00,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:38:01,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:38:03,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:38:03,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 02:38:04,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:38:04,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 02:38:06,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 02:38:06,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 02:38:06,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:07,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:38:10,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:38:13,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 02:38:13,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 02:38:22,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:38:25,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:38:25,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:38:25,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:38:27,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 02:38:28,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1108373.3333333333, ans=0.125 2023-10-03 02:38:31,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 02:38:32,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:34,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:38:38,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:38:38,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:38:38,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 02:38:38,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:38:40,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:38:41,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:42,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 02:38:42,947 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 02:38:44,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:38:48,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 02:38:54,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:38:57,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:38:58,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 02:38:59,592 INFO [train.py:1046] (3/4) Epoch 32, batch 1600, loss[loss=0.1443, simple_loss=0.2372, pruned_loss=0.02565, over 24669.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2406, pruned_loss=0.04208, over 4714382.03 frames. ], batch size: 73, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:38:59,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:38:59,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1108506.6666666667, ans=0.125 2023-10-03 02:39:01,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:39:01,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:39:01,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:39:02,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:39:04,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:04,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 02:39:05,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 02:39:08,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 02:39:10,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:39:11,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 02:39:11,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:39:13,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:39:17,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:39:19,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 02:39:21,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1108573.3333333333, ans=0.07 2023-10-03 02:39:24,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:39:26,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 02:39:26,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:27,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 02:39:31,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 02:39:41,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:39:43,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 02:39:43,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:39:44,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:39:44,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:39:47,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 02:39:50,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 02:39:51,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:39:52,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:53,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:39:54,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:39:58,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:39:58,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:40:00,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:40:00,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1108773.3333333333, ans=0.125 2023-10-03 02:40:04,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:40:05,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:40:06,945 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.52 vs. limit=22.5 2023-10-03 02:40:07,257 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.902e+02 2.151e+02 2.646e+02 3.941e+02, threshold=4.301e+02, percent-clipped=0.0 2023-10-03 02:40:07,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 02:40:07,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:40:08,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 02:40:09,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1108773.3333333333, ans=0.125 2023-10-03 02:40:10,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1108773.3333333333, ans=0.035 2023-10-03 02:40:12,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:40:13,654 INFO [train.py:1046] (3/4) Epoch 32, batch 1650, loss[loss=0.1717, simple_loss=0.2598, pruned_loss=0.04185, over 23808.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2417, pruned_loss=0.04228, over 4716973.77 frames. ], batch size: 85, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:40:13,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:40:13,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:40:13,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 02:40:15,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 02:40:15,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 02:40:15,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 02:40:16,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1108840.0, ans=0.0 2023-10-03 02:40:17,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:40:19,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:40:19,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:40:20,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:40:23,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:40:25,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 02:40:26,544 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.46 vs. limit=15.0 2023-10-03 02:40:27,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:40:27,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:40:27,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:40:27,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:40:29,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 02:40:30,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 02:40:30,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1108906.6666666667, ans=0.0 2023-10-03 02:40:30,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1108906.6666666667, ans=0.125 2023-10-03 02:40:34,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:40:36,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:40:45,769 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-03 02:40:46,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 02:40:47,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:40:50,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 02:40:53,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:40:55,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:40:55,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:40:55,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:40:56,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:40:56,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:41:00,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:41:01,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:41:01,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:41:01,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:41:02,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:41:04,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:41:06,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:41:08,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 02:41:09,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:41:09,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 02:41:10,553 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.56 vs. limit=15.0 2023-10-03 02:41:11,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 02:41:11,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 02:41:11,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:41:13,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:41:14,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:41:14,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:41:14,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 02:41:17,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:41:20,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:41:20,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:41:22,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 02:41:26,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:41:26,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:41:28,038 INFO [train.py:1046] (3/4) Epoch 32, batch 1700, loss[loss=0.1575, simple_loss=0.2385, pruned_loss=0.03831, over 24463.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2422, pruned_loss=0.04251, over 4716640.79 frames. ], batch size: 66, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:41:28,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 02:41:28,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:41:28,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:41:28,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:41:30,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:41:32,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:41:32,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 02:41:35,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:41:43,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:41:45,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:41:51,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1109240.0, ans=0.125 2023-10-03 02:41:52,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:41:52,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:41:52,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:41:52,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:41:53,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 02:41:54,100 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.67 vs. limit=15.0 2023-10-03 02:41:55,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:41:55,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:41:58,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:42:00,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:42:01,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 02:42:01,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 02:42:03,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:06,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 02:42:06,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:42:06,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1109306.6666666667, ans=0.125 2023-10-03 02:42:13,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:42:14,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:15,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:42:17,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:42:19,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 02:42:19,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:42:21,032 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.20 vs. limit=15.0 2023-10-03 02:42:21,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:21,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 02:42:23,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:42:23,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:42:23,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:23,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:42:23,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1109373.3333333333, ans=0.125 2023-10-03 02:42:26,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:42:26,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:42:27,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:27,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:42:29,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:42:34,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:42:35,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 02:42:36,918 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.849e+02 2.096e+02 2.324e+02 3.909e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-03 02:42:37,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:42:38,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:42:39,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 02:42:41,317 INFO [train.py:1046] (3/4) Epoch 32, batch 1750, loss[loss=0.1548, simple_loss=0.225, pruned_loss=0.04232, over 23831.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2405, pruned_loss=0.04185, over 4707384.67 frames. ], batch size: 195, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:42:44,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:46,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:42:47,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:42:47,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 02:42:47,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:42:50,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:42:50,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:42:54,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 02:42:58,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:42:59,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1109573.3333333333, ans=0.2 2023-10-03 02:43:00,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 02:43:00,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:43:02,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:43:04,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 02:43:05,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 02:43:08,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:43:08,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 02:43:15,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:43:18,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:43:18,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:43:22,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:43:22,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:43:24,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:43:27,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:43:29,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:43:29,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:43:30,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 02:43:32,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:43:35,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 02:43:35,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:43:36,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:43:37,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1109706.6666666667, ans=0.0 2023-10-03 02:43:38,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:43:41,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:43:42,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.84 vs. limit=15.0 2023-10-03 02:43:42,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 02:43:42,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:43:44,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:43:47,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:43:47,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1109773.3333333333, ans=0.0 2023-10-03 02:43:50,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:43:50,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:43:51,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 02:43:51,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:43:51,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1109773.3333333333, ans=0.125 2023-10-03 02:43:52,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:43:52,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:43:52,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:43:52,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:43:53,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1109773.3333333333, ans=0.2 2023-10-03 02:43:54,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:43:56,227 INFO [train.py:1046] (3/4) Epoch 32, batch 1800, loss[loss=0.1559, simple_loss=0.2091, pruned_loss=0.05131, over 19002.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2396, pruned_loss=0.04158, over 4703794.24 frames. ], batch size: 388, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:43:57,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:43:59,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:44:00,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:44:03,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:44:07,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 02:44:08,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:44:12,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:44:15,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:44:15,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:44:16,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:44:18,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:44:18,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 02:44:19,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:44:21,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:44:21,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1109906.6666666667, ans=0.125 2023-10-03 02:44:25,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 02:44:25,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 02:44:25,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 02:44:27,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:44:28,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:44:28,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:44:28,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:44:30,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1109973.3333333333, ans=0.2 2023-10-03 02:44:36,616 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 02:44:37,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:44:39,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:44:40,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 02:44:42,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 02:44:42,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:44:44,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:44:46,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:44:49,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 02:44:55,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:44:55,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 02:44:56,030 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.20 vs. limit=15.0 2023-10-03 02:44:56,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:44:56,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:44:56,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:44:58,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 02:45:01,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:45:01,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:45:04,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 02:45:04,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:45:06,184 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.827e+02 1.959e+02 2.120e+02 2.855e+02, threshold=3.918e+02, percent-clipped=0.0 2023-10-03 02:45:07,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:45:07,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:45:07,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:45:09,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:45:09,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:45:10,449 INFO [train.py:1046] (3/4) Epoch 32, batch 1850, loss[loss=0.1617, simple_loss=0.2397, pruned_loss=0.04183, over 23591.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.24, pruned_loss=0.04144, over 4713958.97 frames. ], batch size: 149, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:45:11,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:45:11,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:45:15,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:45:15,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:45:22,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:45:22,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 02:45:25,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1110240.0, ans=0.0 2023-10-03 02:45:26,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 02:45:30,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 02:45:35,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:45:35,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 02:45:35,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 02:45:37,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1110240.0, ans=0.07 2023-10-03 02:45:44,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:45:47,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 02:45:50,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:45:51,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:45:55,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 02:45:55,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:45:56,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 02:45:56,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:45:58,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:46:00,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:46:05,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:46:05,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:06,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 02:46:06,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:46:08,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:46:10,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:46:13,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 02:46:14,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:46:17,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:46:17,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:46:17,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 02:46:17,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 02:46:20,151 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 02:46:20,227 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 02:46:21,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:46:21,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:46:21,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:46:21,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:23,011 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 02:46:24,262 INFO [train.py:1046] (3/4) Epoch 32, batch 1900, loss[loss=0.1583, simple_loss=0.2383, pruned_loss=0.03921, over 24665.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2405, pruned_loss=0.04145, over 4720320.63 frames. ], batch size: 65, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:46:24,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:46:24,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:26,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:46:27,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:46:28,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:46:28,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 02:46:30,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:46:30,421 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 02:46:30,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:46:31,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:46:36,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:46:40,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:46:40,981 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 02:46:42,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 02:46:44,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:46:44,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:46:44,145 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 02:46:44,181 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 02:46:45,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1110573.3333333333, ans=0.0 2023-10-03 02:46:48,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 02:46:49,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:46:53,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 02:46:57,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 02:47:06,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 02:47:08,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 02:47:08,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:08,833 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 02:47:08,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 02:47:08,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 02:47:10,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 02:47:10,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:47:15,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 02:47:15,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1110706.6666666667, ans=0.1 2023-10-03 02:47:16,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:47:20,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:47:20,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 02:47:23,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:47:26,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 02:47:26,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:47:31,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1110773.3333333333, ans=0.125 2023-10-03 02:47:33,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:47:33,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:47:33,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:47:35,106 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.918e+02 2.080e+02 2.358e+02 3.129e+02, threshold=4.160e+02, percent-clipped=0.0 2023-10-03 02:47:35,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:47:36,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 02:47:36,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 02:47:37,803 INFO [train.py:1046] (3/4) Epoch 32, batch 1950, loss[loss=0.1621, simple_loss=0.2379, pruned_loss=0.04311, over 23699.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2417, pruned_loss=0.04196, over 4718148.24 frames. ], batch size: 149, lr: 3.20e-03, grad_scale: 8.0 2023-10-03 02:47:37,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:47:39,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:47:39,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:47:43,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:47:43,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:47:43,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 02:47:46,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:47:48,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:47:48,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1110840.0, ans=0.125 2023-10-03 02:47:49,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:47:50,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:50,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:47:54,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 02:47:54,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:47:54,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:56,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:47:57,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:47:59,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:47:59,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:00,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:48:03,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:48:03,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:48:03,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 02:48:05,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:07,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:09,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:48:09,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:09,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 02:48:09,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 02:48:10,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 02:48:10,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:48:10,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:48:14,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:17,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:48:21,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:48:23,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:48:23,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:48:25,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 02:48:25,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:48:28,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:48:30,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:48:31,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:48:31,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1111040.0, ans=0.0 2023-10-03 02:48:37,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:37,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1111106.6666666667, ans=0.125 2023-10-03 02:48:38,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:42,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:48:44,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:48:46,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:48:46,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:48:47,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 02:48:47,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 02:48:49,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:48:50,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 02:48:51,919 INFO [train.py:1046] (3/4) Epoch 32, batch 2000, loss[loss=0.1574, simple_loss=0.2278, pruned_loss=0.04347, over 23405.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.2424, pruned_loss=0.04228, over 4717934.22 frames. ], batch size: 285, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:48:52,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:48:52,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1111173.3333333333, ans=0.0 2023-10-03 02:48:54,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:48:54,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:48:55,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1111173.3333333333, ans=0.125 2023-10-03 02:48:56,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:48:58,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:48:59,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:02,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 02:49:03,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 02:49:06,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:49:07,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 02:49:09,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 02:49:09,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:49:13,198 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.87 vs. limit=15.0 2023-10-03 02:49:15,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:49:15,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 02:49:15,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:17,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:17,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:19,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 02:49:20,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 02:49:22,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 02:49:22,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:49:24,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:49:24,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 02:49:24,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:25,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:49:28,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:49:28,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 02:49:31,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 02:49:31,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:49:31,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:49:36,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:38,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:49:38,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:49:39,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:49:39,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1111373.3333333333, ans=0.125 2023-10-03 02:49:40,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:49:40,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:42,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:49:42,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:49:42,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1111373.3333333333, ans=0.1 2023-10-03 02:49:44,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:49:46,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1111373.3333333333, ans=0.125 2023-10-03 02:49:47,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:49:48,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 02:49:52,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:49:54,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:49:57,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:49:57,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:49:57,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1111440.0, ans=0.05 2023-10-03 02:49:59,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:03,071 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 2.023e+02 2.252e+02 2.571e+02 3.525e+02, threshold=4.504e+02, percent-clipped=0.0 2023-10-03 02:50:03,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:50:03,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:03,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:50:04,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:50:05,928 INFO [train.py:1046] (3/4) Epoch 32, batch 2050, loss[loss=0.1466, simple_loss=0.211, pruned_loss=0.04112, over 23583.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2419, pruned_loss=0.04213, over 4729791.78 frames. ], batch size: 256, lr: 3.20e-03, grad_scale: 16.0 2023-10-03 02:50:06,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:50:07,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:08,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:50:10,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:15,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:50:17,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:50:18,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:50:19,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:50:21,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 02:50:21,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:50:24,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:50:25,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:50:32,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:50:33,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:50:35,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 02:50:35,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:50:37,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 02:50:38,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:50:39,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:50:43,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:50:44,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:50:44,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:50:46,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:50:48,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:50:48,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:50:49,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:50:52,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 02:50:54,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:50:56,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:50:56,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1111706.6666666667, ans=0.125 2023-10-03 02:51:00,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:51:05,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:51:06,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 02:51:11,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:51:12,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:51:14,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:51:16,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 02:51:18,123 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-10-03 02:51:19,369 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 02:51:19,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:51:20,588 INFO [train.py:1046] (3/4) Epoch 32, batch 2100, loss[loss=0.1616, simple_loss=0.237, pruned_loss=0.04313, over 23294.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2412, pruned_loss=0.04191, over 4725225.11 frames. ], batch size: 105, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:51:20,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:51:20,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:51:22,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:51:22,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 02:51:22,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 02:51:24,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 02:51:27,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:51:27,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:51:27,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1111840.0, ans=0.125 2023-10-03 02:51:30,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:51:31,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:51:31,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 02:51:31,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 02:51:32,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 02:51:32,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 02:51:34,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:51:36,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:51:36,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 02:51:36,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 02:51:42,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 02:51:42,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:51:44,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:51:45,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:51:47,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:51:48,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 02:51:50,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:51:50,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 02:51:51,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 02:51:53,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:51:53,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 02:51:53,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 02:51:53,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 02:51:56,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:51:57,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:52:00,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:52:00,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 02:52:01,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:04,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:52:04,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 02:52:04,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:52:04,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:52:06,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:06,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 02:52:09,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 02:52:09,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 02:52:14,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 02:52:14,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1112040.0, ans=0.2 2023-10-03 02:52:16,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:52:18,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 02:52:24,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:52:24,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1112106.6666666667, ans=0.125 2023-10-03 02:52:26,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:52:26,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:52:26,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:52:26,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 02:52:26,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:52:28,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:52:29,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:52:29,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:52:29,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:30,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 02:52:31,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1112106.6666666667, ans=0.0 2023-10-03 02:52:32,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 02:52:33,714 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.908e+02 2.112e+02 2.525e+02 3.507e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 02:52:33,741 INFO [train.py:1046] (3/4) Epoch 32, batch 2150, loss[loss=0.1477, simple_loss=0.2342, pruned_loss=0.03055, over 24672.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2409, pruned_loss=0.04161, over 4726145.57 frames. ], batch size: 65, lr: 3.19e-03, grad_scale: 4.0 2023-10-03 02:52:33,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:52:35,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:52:35,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:52:35,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:52:36,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:52:44,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 02:52:46,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:52:46,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1112173.3333333333, ans=0.125 2023-10-03 02:52:48,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:49,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:52:49,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:52:49,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:52:52,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:52:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:52:52,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 02:52:55,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:52:55,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 02:53:00,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:53:02,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:53:03,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:03,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:53:03,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:05,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:53:05,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:53:05,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:53:06,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:53:06,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 02:53:08,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 02:53:09,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:53:11,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:53:11,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:53:12,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:53:17,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:53:17,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:53:18,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:53:18,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 02:53:18,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 02:53:21,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:53:21,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:23,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:53:24,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 02:53:24,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:25,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:25,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 02:53:27,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 02:53:29,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 02:53:29,307 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 02:53:29,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:29,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:53:30,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 02:53:30,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:53:30,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 02:53:30,826 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 02:53:30,827 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 02:53:32,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 02:53:33,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:33,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:53:33,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:53:35,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:36,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 02:53:38,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:53:38,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:46,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:53:48,051 INFO [train.py:1046] (3/4) Epoch 32, batch 2200, loss[loss=0.1801, simple_loss=0.2581, pruned_loss=0.05104, over 23443.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2409, pruned_loss=0.04173, over 4721718.47 frames. ], batch size: 93, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:53:48,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 02:53:51,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:53:52,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1112506.6666666667, ans=0.0 2023-10-03 02:53:56,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:53:56,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:53:57,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:53:59,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 02:54:02,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:54:02,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:54:02,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 02:54:03,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1112573.3333333333, ans=0.0 2023-10-03 02:54:06,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 02:54:08,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 02:54:14,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 02:54:17,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:54:18,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:54:20,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 02:54:23,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 02:54:24,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 02:54:26,488 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.61 vs. limit=15.0 2023-10-03 02:54:27,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 02:54:27,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1112640.0, ans=0.125 2023-10-03 02:54:28,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:54:28,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 02:54:28,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1112640.0, ans=0.125 2023-10-03 02:54:31,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:54:34,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:54:35,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:54:36,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:54:38,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 02:54:40,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:54:40,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 02:54:43,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:54:43,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 02:54:43,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:54:45,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 02:54:46,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:54:46,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:54:47,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:54:48,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 02:54:50,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:54:51,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 02:54:54,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 02:54:55,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:54:58,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:54:58,688 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 02:55:00,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:55:01,288 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.834e+02 1.970e+02 2.169e+02 2.586e+02, threshold=3.939e+02, percent-clipped=0.0 2023-10-03 02:55:01,315 INFO [train.py:1046] (3/4) Epoch 32, batch 2250, loss[loss=0.1649, simple_loss=0.2424, pruned_loss=0.04374, over 23812.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2415, pruned_loss=0.04152, over 4730561.73 frames. ], batch size: 164, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:55:01,391 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 02:55:01,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1112840.0, ans=0.125 2023-10-03 02:55:03,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 02:55:03,206 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 02:55:03,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1112840.0, ans=0.0 2023-10-03 02:55:04,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:55:04,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 02:55:06,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:55:06,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1112840.0, ans=0.0 2023-10-03 02:55:07,540 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 02:55:07,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:55:10,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:55:16,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:55:18,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 02:55:21,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:55:22,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:55:22,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 02:55:25,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 02:55:26,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:55:26,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:55:28,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 02:55:29,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:55:29,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:55:31,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 02:55:35,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:55:35,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 02:55:37,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 02:55:37,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 02:55:38,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:55:41,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:55:46,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:55:47,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:55:49,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:55:49,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:55:52,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:55:53,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:55:54,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1113040.0, ans=0.125 2023-10-03 02:55:58,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:55:59,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 02:56:03,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 02:56:03,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 02:56:03,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 02:56:07,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 02:56:11,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 02:56:11,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 02:56:11,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:56:11,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 02:56:14,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 02:56:15,769 INFO [train.py:1046] (3/4) Epoch 32, batch 2300, loss[loss=0.1477, simple_loss=0.2209, pruned_loss=0.0372, over 23515.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.242, pruned_loss=0.04165, over 4741772.31 frames. ], batch size: 134, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:56:19,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:56:19,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:56:26,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:56:26,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:56:28,012 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 02:56:30,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:56:35,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:56:35,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:56:36,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:56:36,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:56:36,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 02:56:38,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 02:56:40,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:56:40,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:56:41,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1113240.0, ans=0.125 2023-10-03 02:56:43,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1113240.0, ans=0.1 2023-10-03 02:56:44,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 02:56:48,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 02:56:53,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:56:55,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 02:56:56,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:56:58,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1113306.6666666667, ans=0.0 2023-10-03 02:56:59,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 02:57:01,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:57:03,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:57:04,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 02:57:04,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:57:04,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 02:57:08,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 02:57:08,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:57:10,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:57:10,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:57:10,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:57:12,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 02:57:12,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 02:57:12,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1113373.3333333333, ans=0.05 2023-10-03 02:57:13,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 02:57:13,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 02:57:13,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:57:13,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 02:57:21,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:57:23,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1113440.0, ans=0.125 2023-10-03 02:57:26,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:57:28,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:57:29,551 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.76 vs. limit=15.0 2023-10-03 02:57:30,094 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.927e+02 2.149e+02 2.530e+02 4.352e+02, threshold=4.298e+02, percent-clipped=1.0 2023-10-03 02:57:30,120 INFO [train.py:1046] (3/4) Epoch 32, batch 2350, loss[loss=0.1463, simple_loss=0.223, pruned_loss=0.0348, over 24322.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2422, pruned_loss=0.04155, over 4744217.31 frames. ], batch size: 56, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 02:57:30,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:57:30,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 02:57:31,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 02:57:31,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:57:31,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 02:57:31,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 02:57:39,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:57:39,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 02:57:44,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 02:57:46,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 02:57:47,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:57:47,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:57:47,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:57:48,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:57:49,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 02:57:52,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:57:56,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 02:58:00,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 02:58:03,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:58:03,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 02:58:05,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 02:58:06,003 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.91 vs. limit=12.0 2023-10-03 02:58:06,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 02:58:06,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 02:58:08,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1113640.0, ans=0.125 2023-10-03 02:58:09,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:58:09,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:58:11,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:58:14,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1113706.6666666667, ans=0.2 2023-10-03 02:58:15,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 02:58:17,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 02:58:17,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 02:58:18,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:58:20,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:58:22,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 02:58:24,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 02:58:26,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 02:58:26,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 02:58:30,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 02:58:33,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 02:58:34,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1113773.3333333333, ans=0.1 2023-10-03 02:58:35,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:58:35,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 02:58:35,220 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 02:58:35,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 02:58:37,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 02:58:38,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1113773.3333333333, ans=0.0 2023-10-03 02:58:41,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:58:43,819 INFO [train.py:1046] (3/4) Epoch 32, batch 2400, loss[loss=0.1724, simple_loss=0.2388, pruned_loss=0.05299, over 23869.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2422, pruned_loss=0.04176, over 4735239.43 frames. ], batch size: 150, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 02:58:44,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 02:58:48,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 02:58:50,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 02:58:50,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 02:58:51,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 02:58:59,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 02:58:59,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:59:00,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 02:59:02,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 02:59:02,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:59:02,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 02:59:07,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:59:10,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 02:59:14,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 02:59:18,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 02:59:21,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:59:24,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:59:27,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1113973.3333333333, ans=0.0 2023-10-03 02:59:28,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:59:28,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 02:59:28,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 02:59:30,291 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.21 vs. limit=10.0 2023-10-03 02:59:36,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:59:39,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 02:59:40,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 02:59:42,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 02:59:42,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 02:59:42,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 02:59:42,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:59:44,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 02:59:44,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 02:59:48,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 02:59:49,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 02:59:49,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 02:59:51,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 02:59:53,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 02:59:53,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 02:59:54,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 02:59:56,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 02:59:56,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 02:59:56,621 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 02:59:56,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 02:59:57,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 02:59:59,078 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.835e+02 1.995e+02 2.171e+02 3.013e+02, threshold=3.990e+02, percent-clipped=0.0 2023-10-03 02:59:59,105 INFO [train.py:1046] (3/4) Epoch 32, batch 2450, loss[loss=0.1469, simple_loss=0.2143, pruned_loss=0.03972, over 23406.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2404, pruned_loss=0.04148, over 4718570.22 frames. ], batch size: 285, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 02:59:59,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 02:59:59,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1114173.3333333333, ans=0.125 2023-10-03 03:00:00,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:00:01,968 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 03:00:02,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:00:03,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:00:03,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1114173.3333333333, ans=0.125 2023-10-03 03:00:04,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:00:06,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:00:09,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:09,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:00:09,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 03:00:12,403 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.14 vs. limit=15.0 2023-10-03 03:00:13,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:00:13,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:19,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:00:19,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:00:19,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:00:19,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 03:00:25,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:26,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:00:26,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:00:30,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:00:30,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:00:32,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:00:32,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:00:35,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 03:00:35,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:00:38,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1114306.6666666667, ans=0.0 2023-10-03 03:00:41,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1114306.6666666667, ans=0.125 2023-10-03 03:00:43,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:00:44,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:00:44,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:00:46,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:00:46,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:00:46,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:00:48,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 03:00:48,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1114373.3333333333, ans=0.0 2023-10-03 03:00:51,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:00:51,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:00:54,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:00:54,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:00:59,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1114440.0, ans=0.125 2023-10-03 03:01:00,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:01:00,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 03:01:00,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:01:02,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:01:02,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 03:01:02,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:01:03,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:01:06,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:01:07,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:01:09,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:01:12,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 03:01:13,442 INFO [train.py:1046] (3/4) Epoch 32, batch 2500, loss[loss=0.1719, simple_loss=0.2445, pruned_loss=0.04966, over 23244.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2387, pruned_loss=0.04153, over 4693835.48 frames. ], batch size: 119, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:01:13,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:01:18,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:01:26,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:01:26,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:01:26,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:01:28,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 03:01:35,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:01:36,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:01:36,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 03:01:36,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:01:38,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 03:01:39,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:01:40,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:01:40,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 03:01:42,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:01:42,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 03:01:42,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:01:46,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:01:48,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:01:49,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:01:51,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 03:01:51,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:01:52,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:01:56,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1114640.0, ans=0.2 2023-10-03 03:01:57,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:01:59,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1114706.6666666667, ans=0.0 2023-10-03 03:02:02,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:04,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:02:10,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:02:12,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 03:02:12,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:02:12,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 03:02:14,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:02:14,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:02:16,200 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 03:02:16,201 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 03:02:16,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 03:02:16,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1114773.3333333333, ans=0.125 2023-10-03 03:02:19,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:02:21,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 03:02:21,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 03:02:23,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:02:23,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 03:02:26,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1114773.3333333333, ans=0.125 2023-10-03 03:02:27,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 03:02:29,563 INFO [train.py:1046] (3/4) Epoch 32, batch 2550, loss[loss=0.2029, simple_loss=0.2607, pruned_loss=0.0725, over 19132.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2392, pruned_loss=0.04173, over 4695466.05 frames. ], batch size: 388, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:02:30,925 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.835e+02 1.976e+02 2.166e+02 3.435e+02, threshold=3.953e+02, percent-clipped=0.0 2023-10-03 03:02:31,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:02:31,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:02:32,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:02:35,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:02:35,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 03:02:36,241 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.53 vs. limit=22.5 2023-10-03 03:02:36,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:02:39,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 03:02:40,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:02:43,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:46,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:02:46,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 03:02:46,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:02:47,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:02:47,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:02:49,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:02:49,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 03:02:49,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 03:02:49,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:02:49,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 03:02:58,098 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.07 vs. limit=12.0 2023-10-03 03:03:03,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:03:07,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:03:07,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:03:07,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:03:07,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1114973.3333333333, ans=0.125 2023-10-03 03:03:07,691 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.11 vs. limit=15.0 2023-10-03 03:03:09,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:03:15,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:03:19,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:03:19,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:03:19,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:03:21,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 03:03:21,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:03:25,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:03:25,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:03:29,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1115106.6666666667, ans=0.0 2023-10-03 03:03:30,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:03:30,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 03:03:30,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:03:32,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:03:32,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 03:03:34,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:03:36,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:03:37,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1115106.6666666667, ans=0.125 2023-10-03 03:03:41,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:03:41,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:03:42,500 INFO [train.py:1046] (3/4) Epoch 32, batch 2600, loss[loss=0.168, simple_loss=0.2491, pruned_loss=0.04346, over 23357.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2401, pruned_loss=0.04208, over 4700077.13 frames. ], batch size: 93, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:03:43,928 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 03:03:45,421 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 03:03:45,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:03:45,476 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 03:03:46,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 03:03:48,062 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 03:03:49,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:03:51,291 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 03:03:51,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 03:03:53,273 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 03:03:54,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:03:56,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 03:03:57,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 03:03:57,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 03:03:59,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 03:04:02,718 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 03:04:02,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 03:04:08,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:04:08,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:04:08,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:04:08,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 03:04:11,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:04:12,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1115306.6666666667, ans=0.0 2023-10-03 03:04:15,310 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 03:04:24,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:04:24,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:04:24,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 03:04:26,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:04:26,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:04:27,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 03:04:29,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:04:29,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:04:31,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:04:33,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1115373.3333333333, ans=0.0 2023-10-03 03:04:35,760 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 03:04:35,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:04:35,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:04:41,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:04:42,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:04:42,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 03:04:44,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:04:46,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:04:47,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:04:52,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 03:04:54,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:04:54,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1115440.0, ans=0.125 2023-10-03 03:04:55,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:04:56,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1115506.6666666667, ans=0.125 2023-10-03 03:04:57,808 INFO [train.py:1046] (3/4) Epoch 32, batch 2650, loss[loss=0.1672, simple_loss=0.2549, pruned_loss=0.03977, over 24616.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2407, pruned_loss=0.04203, over 4710735.07 frames. ], batch size: 68, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:04:58,680 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.08 vs. limit=6.0 2023-10-03 03:04:59,102 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.432e+02 1.872e+02 2.008e+02 2.203e+02 2.987e+02, threshold=4.015e+02, percent-clipped=0.0 2023-10-03 03:05:01,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 03:05:01,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:05:01,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:05:02,470 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 03:05:02,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:05:03,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:05:07,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:05:07,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:05:09,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:05:09,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1115506.6666666667, ans=0.125 2023-10-03 03:05:11,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 03:05:11,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:05:11,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1115573.3333333333, ans=0.125 2023-10-03 03:05:12,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:05:13,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 03:05:15,310 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 03:05:18,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:05:19,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 03:05:19,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:20,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 03:05:24,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:05:24,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:05:24,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:05:24,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:05:28,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 03:05:28,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 03:05:31,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:05:36,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 03:05:36,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:05:38,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:05:38,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:05:39,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:05:40,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:05:41,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:05:43,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:05:44,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:05:44,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1115706.6666666667, ans=0.04949747468305833 2023-10-03 03:05:45,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:05:47,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:05:48,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:48,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:05:50,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:51,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:05:51,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:05:54,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:05:56,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:05:56,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:05:56,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 03:06:01,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:06:02,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:06:04,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:06:05,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:05,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:06:05,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:07,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:06:07,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 03:06:10,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:06:11,717 INFO [train.py:1046] (3/4) Epoch 32, batch 2700, loss[loss=0.1661, simple_loss=0.2263, pruned_loss=0.05294, over 22536.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2418, pruned_loss=0.04229, over 4716818.17 frames. ], batch size: 322, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:06:11,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 03:06:14,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:06:14,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:14,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:16,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:06:16,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:06:16,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:06:16,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 03:06:16,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 03:06:17,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:06:20,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:06:20,839 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.85 vs. limit=15.0 2023-10-03 03:06:21,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:06:22,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:06:24,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1115906.6666666667, ans=0.0 2023-10-03 03:06:26,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:06:27,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 03:06:27,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:06:33,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1115906.6666666667, ans=0.125 2023-10-03 03:06:34,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:06:34,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:06:39,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:06:39,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:06:39,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:06:41,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:06:43,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:06:46,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:06:46,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:06:47,035 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.05 vs. limit=22.5 2023-10-03 03:06:47,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:06:48,589 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.99 vs. limit=15.0 2023-10-03 03:06:49,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:06:49,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:06:52,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1115973.3333333333, ans=0.0 2023-10-03 03:06:56,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1116040.0, ans=0.125 2023-10-03 03:07:00,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:07:00,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:07:04,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:07:04,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:07,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1116040.0, ans=0.125 2023-10-03 03:07:08,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:07:09,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:07:09,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:07:11,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:12,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:07:12,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:07:16,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:07:17,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:07:17,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:07:18,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 03:07:19,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1116106.6666666667, ans=0.125 2023-10-03 03:07:20,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:23,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:07:23,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 03:07:23,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1116106.6666666667, ans=0.2 2023-10-03 03:07:24,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 03:07:24,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:25,793 INFO [train.py:1046] (3/4) Epoch 32, batch 2750, loss[loss=0.161, simple_loss=0.2404, pruned_loss=0.04079, over 24657.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2408, pruned_loss=0.04217, over 4711033.17 frames. ], batch size: 65, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:07:27,190 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.923e+02 2.045e+02 2.291e+02 3.532e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 03:07:27,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:07:27,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:07:30,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:30,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:07:30,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1116173.3333333333, ans=0.05 2023-10-03 03:07:32,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:35,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:07:35,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 03:07:36,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:07:36,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:36,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 03:07:36,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:07:36,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:07:43,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 03:07:43,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:07:44,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:46,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:07:46,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:07:46,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:07:48,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:07:48,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:07:48,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:07:49,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1116240.0, ans=0.0 2023-10-03 03:07:51,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1116240.0, ans=0.125 2023-10-03 03:07:52,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:07:52,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:07:52,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:07:53,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:07:55,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:08:01,476 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:08:04,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:08:07,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:08:07,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:11,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:08:11,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:08:11,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:08:13,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1116373.3333333333, ans=0.0 2023-10-03 03:08:18,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:08:18,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:08:18,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 03:08:22,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:23,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 03:08:28,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 03:08:32,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:08:32,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 03:08:33,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:08:35,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:08:37,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 03:08:37,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:08:40,177 INFO [train.py:1046] (3/4) Epoch 32, batch 2800, loss[loss=0.1558, simple_loss=0.2311, pruned_loss=0.04026, over 19505.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2401, pruned_loss=0.042, over 4711546.22 frames. ], batch size: 42, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:08:40,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 03:08:40,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:08:40,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:08:41,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 03:08:41,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:08:41,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:44,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:08:44,581 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 03:08:44,582 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 03:08:50,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:08:51,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:08:51,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:08:54,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:08:57,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 03:08:57,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 03:08:59,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 03:09:01,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:09:01,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:09:01,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:09:03,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:09:05,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:09:05,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 03:09:05,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:09:13,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:09:15,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:09:17,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:09:17,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:09:17,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1116640.0, ans=0.125 2023-10-03 03:09:19,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:09:23,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:09:23,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 03:09:24,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:09:24,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:09:24,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:09:25,468 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.81 vs. limit=22.5 2023-10-03 03:09:28,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:09:29,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:09:32,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1116706.6666666667, ans=0.125 2023-10-03 03:09:33,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:09:36,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:09:36,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:09:36,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:09:36,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:09:38,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:09:39,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1116773.3333333333, ans=0.2 2023-10-03 03:09:40,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:09:40,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 03:09:40,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:09:40,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1116773.3333333333, ans=0.125 2023-10-03 03:09:40,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1116773.3333333333, ans=0.125 2023-10-03 03:09:42,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:09:42,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:09:43,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 03:09:43,608 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.44 vs. limit=6.0 2023-10-03 03:09:44,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:09:45,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:09:45,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:09:48,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 03:09:50,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1116773.3333333333, ans=0.0 2023-10-03 03:09:52,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:09:52,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:09:54,147 INFO [train.py:1046] (3/4) Epoch 32, batch 2850, loss[loss=0.1666, simple_loss=0.2368, pruned_loss=0.04816, over 23859.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2395, pruned_loss=0.04208, over 4701914.64 frames. ], batch size: 164, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:09:54,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:09:55,420 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.850e+02 1.983e+02 2.213e+02 2.652e+02, threshold=3.967e+02, percent-clipped=0.0 2023-10-03 03:09:56,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:00,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:10:00,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:10:01,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:10:02,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:10:04,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:10:05,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:10:05,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1116840.0, ans=0.1 2023-10-03 03:10:06,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 03:10:13,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 03:10:13,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:15,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 03:10:15,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1116906.6666666667, ans=0.04949747468305833 2023-10-03 03:10:16,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:17,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 03:10:17,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 03:10:21,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:30,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1116973.3333333333, ans=0.0 2023-10-03 03:10:31,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:10:32,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:10:32,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:10:34,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:10:34,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:10:34,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:10:36,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:10:36,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 03:10:39,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:10:39,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:10:41,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:10:41,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1117040.0, ans=0.125 2023-10-03 03:10:41,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.23 vs. limit=15.0 2023-10-03 03:10:43,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:45,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:10:45,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:10:47,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:47,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1117040.0, ans=0.125 2023-10-03 03:10:48,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:10:51,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:10:51,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:10:53,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:10:53,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1117106.6666666667, ans=0.125 2023-10-03 03:10:53,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1117106.6666666667, ans=0.125 2023-10-03 03:10:54,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:10:56,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1117106.6666666667, ans=0.125 2023-10-03 03:11:00,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:11:01,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 03:11:01,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 03:11:03,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:11:03,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:11:03,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 03:11:03,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1117106.6666666667, ans=0.125 2023-10-03 03:11:05,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:11:05,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:11:05,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:11:05,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:11:05,213 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 03:11:06,499 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 03:11:06,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:11:07,748 INFO [train.py:1046] (3/4) Epoch 32, batch 2900, loss[loss=0.153, simple_loss=0.2279, pruned_loss=0.039, over 23619.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2397, pruned_loss=0.04179, over 4693726.72 frames. ], batch size: 149, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:11:07,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:11:11,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 03:11:11,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:11:11,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:11:13,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 03:11:18,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:11:18,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 03:11:18,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 03:11:21,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:11:21,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:11:23,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:11:25,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:11:28,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:11:28,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:11:31,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:11:32,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 03:11:33,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:11:36,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:11:37,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 03:11:38,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 03:11:41,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:11:41,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 03:11:41,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:11:43,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:11:43,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 03:11:46,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:11:48,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:11:50,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:11:51,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:11:54,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 03:11:54,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 03:11:54,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:11:58,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:12:01,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 03:12:03,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:12:09,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:12:16,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:12:16,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:12:17,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 03:12:18,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1117440.0, ans=0.125 2023-10-03 03:12:22,464 INFO [train.py:1046] (3/4) Epoch 32, batch 2950, loss[loss=0.2081, simple_loss=0.2709, pruned_loss=0.07267, over 18917.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2404, pruned_loss=0.04235, over 4680067.00 frames. ], batch size: 388, lr: 3.19e-03, grad_scale: 16.0 2023-10-03 03:12:22,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:22,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 03:12:23,826 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.841e+02 2.014e+02 2.273e+02 4.138e+02, threshold=4.027e+02, percent-clipped=1.0 2023-10-03 03:12:23,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:12:23,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:12:25,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1117506.6666666667, ans=0.125 2023-10-03 03:12:28,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:12:31,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 03:12:31,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:12:33,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:34,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:12:34,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:12:37,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 03:12:37,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 03:12:38,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:12:38,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:12:44,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:12:46,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:12:47,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:12:47,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1117573.3333333333, ans=0.0 2023-10-03 03:12:49,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:12:52,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:12:52,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:12:53,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:54,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1117640.0, ans=0.125 2023-10-03 03:12:55,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:12:55,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:12:56,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 03:12:58,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1117640.0, ans=0.125 2023-10-03 03:13:02,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 03:13:02,380 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 03:13:02,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:13:04,570 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 03:13:05,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 03:13:05,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:13:07,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:13:07,316 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 03:13:07,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:13:10,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 03:13:11,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:13:11,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:13:11,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1117706.6666666667, ans=0.0 2023-10-03 03:13:13,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:13:14,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:13:14,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:13:15,785 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 03:13:15,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:13:17,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 03:13:20,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:13:22,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:13:22,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 03:13:22,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:13:24,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 03:13:25,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1117773.3333333333, ans=0.0 2023-10-03 03:13:26,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:13:28,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:13:28,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:13:29,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:13:29,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 03:13:32,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:13:32,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:13:32,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:13:34,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:13:34,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:13:35,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1117773.3333333333, ans=0.125 2023-10-03 03:13:36,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:13:37,485 INFO [train.py:1046] (3/4) Epoch 32, batch 3000, loss[loss=0.1612, simple_loss=0.2513, pruned_loss=0.03556, over 24462.00 frames. ], tot_loss[loss=0.1634, simple_loss=0.2418, pruned_loss=0.04253, over 4682104.23 frames. ], batch size: 69, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:13:37,486 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 03:13:49,430 INFO [train.py:1078] (3/4) Epoch 32, validation: loss=0.3583, simple_loss=0.2851, pruned_loss=0.2157, over 1125622.00 frames. 2023-10-03 03:13:49,430 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 03:13:49,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:13:49,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 03:13:50,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:13:51,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1117840.0, ans=0.125 2023-10-03 03:13:53,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:13:53,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:13:56,462 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 03:13:56,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 03:13:59,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:14:01,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:14:01,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 03:14:01,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:14:08,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:14:14,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1117906.6666666667, ans=0.125 2023-10-03 03:14:16,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:14:22,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 03:14:23,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:14:24,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:14:26,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:14:26,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:14:26,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1117973.3333333333, ans=0.2 2023-10-03 03:14:27,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:14:27,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 03:14:31,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 03:14:31,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1117973.3333333333, ans=0.2 2023-10-03 03:14:31,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1117973.3333333333, ans=0.1 2023-10-03 03:14:32,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:14:32,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:14:34,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:14:35,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:14:35,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:14:35,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:14:38,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:14:38,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:14:38,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:14:41,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:14:42,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 03:14:42,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:14:44,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:14:44,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:14:48,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:14:48,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:14:49,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 03:14:49,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 03:14:51,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:14:51,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 03:14:52,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:14:53,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 03:14:55,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:14:57,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:14:57,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 03:14:58,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 03:14:58,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 03:14:58,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1118106.6666666667, ans=0.1 2023-10-03 03:15:00,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:15:01,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:15:01,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:15:01,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:03,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:15:04,549 INFO [train.py:1046] (3/4) Epoch 32, batch 3050, loss[loss=0.1634, simple_loss=0.2549, pruned_loss=0.03596, over 24600.00 frames. ], tot_loss[loss=0.1648, simple_loss=0.2431, pruned_loss=0.04326, over 4684450.97 frames. ], batch size: 73, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:15:06,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 03:15:07,342 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.893e+02 2.072e+02 2.427e+02 3.731e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 03:15:07,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:15:10,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:15:10,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:15:15,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:17,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 03:15:22,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 03:15:23,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 03:15:24,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:15:27,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:15:30,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1118240.0, ans=0.125 2023-10-03 03:15:31,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:31,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:15:32,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:15:33,062 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:15:35,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:15:35,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:15:35,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:15:35,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:15:35,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:15:37,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:40,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:15:42,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:15:42,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 03:15:42,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:15:42,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:15:47,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:15:47,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:15:47,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:15:49,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:15:52,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:15:53,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:15:59,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:15:59,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:15:59,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:16:02,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:16:02,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:16:02,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:16:03,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 03:16:05,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:16:07,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:07,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1118440.0, ans=0.1 2023-10-03 03:16:08,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 03:16:09,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:16:15,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:16:16,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:16:18,174 INFO [train.py:1046] (3/4) Epoch 32, batch 3100, loss[loss=0.1641, simple_loss=0.2274, pruned_loss=0.05045, over 23744.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2419, pruned_loss=0.04295, over 4670854.63 frames. ], batch size: 232, lr: 3.19e-03, grad_scale: 8.0 2023-10-03 03:16:21,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:16:23,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 03:16:25,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 03:16:26,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 03:16:27,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:16:32,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:16:32,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:34,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 03:16:35,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1118573.3333333333, ans=0.1 2023-10-03 03:16:38,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:39,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1118573.3333333333, ans=0.125 2023-10-03 03:16:42,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 03:16:47,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:16:48,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:16:48,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:16:49,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:16:50,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 03:16:54,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:16:54,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 03:16:54,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:16:56,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:16:57,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 03:16:58,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:17:01,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:17:03,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 03:17:05,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 03:17:05,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:06,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:17:08,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:08,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:08,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:17:09,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:17:09,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:17:10,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:17:12,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:17:12,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:17:13,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1118706.6666666667, ans=0.1 2023-10-03 03:17:16,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:17:18,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 03:17:19,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:17:20,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 03:17:20,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:20,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:22,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 03:17:31,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 03:17:33,072 INFO [train.py:1046] (3/4) Epoch 32, batch 3150, loss[loss=0.1858, simple_loss=0.2649, pruned_loss=0.05336, over 23954.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2414, pruned_loss=0.04257, over 4682634.51 frames. ], batch size: 86, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:17:35,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:35,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:17:36,439 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.898e+02 2.081e+02 2.464e+02 4.773e+02, threshold=4.162e+02, percent-clipped=1.0 2023-10-03 03:17:36,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:17:36,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:17:37,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 03:17:37,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:37,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 03:17:40,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 03:17:42,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:45,338 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 03:17:46,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 03:17:48,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:17:48,226 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 03:17:49,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 03:17:52,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 03:17:52,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 03:17:52,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 03:17:52,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:52,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:17:52,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1118906.6666666667, ans=0.0 2023-10-03 03:17:53,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:17:55,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1118906.6666666667, ans=0.125 2023-10-03 03:17:56,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 03:17:58,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:58,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:17:58,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:17:59,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 03:18:03,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 03:18:04,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:18:06,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:18:07,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:18:07,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 03:18:10,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 03:18:11,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:18:11,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:18:11,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:18:12,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:18:12,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:18:14,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:18:14,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:18:14,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 03:18:16,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:18:16,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:17,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:18:17,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:18:19,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 03:18:19,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:18:20,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 03:18:20,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:22,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 03:18:24,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 03:18:25,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:18:25,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:18:27,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 03:18:28,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 03:18:28,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:18:31,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:18:33,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:33,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:18:37,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:18:38,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:40,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 03:18:40,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1119106.6666666667, ans=0.125 2023-10-03 03:18:45,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1119106.6666666667, ans=0.0 2023-10-03 03:18:46,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:18:46,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 03:18:47,546 INFO [train.py:1046] (3/4) Epoch 32, batch 3200, loss[loss=0.1877, simple_loss=0.2745, pruned_loss=0.05043, over 24388.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2398, pruned_loss=0.04225, over 4677153.57 frames. ], batch size: 77, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:18:50,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:18:51,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:18:51,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 03:18:52,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1119173.3333333333, ans=0.09899494936611666 2023-10-03 03:18:55,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:18:59,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:19:02,592 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.55 vs. limit=12.0 2023-10-03 03:19:03,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:19:06,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1119240.0, ans=0.125 2023-10-03 03:19:07,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1119240.0, ans=0.125 2023-10-03 03:19:08,046 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.02 vs. limit=15.0 2023-10-03 03:19:11,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:19:15,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1119306.6666666667, ans=0.0 2023-10-03 03:19:21,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 03:19:22,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:19:26,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 03:19:27,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:19:29,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:19:31,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:19:31,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:19:35,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 03:19:37,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 03:19:38,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 03:19:39,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 03:19:40,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1119373.3333333333, ans=0.0 2023-10-03 03:19:40,861 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=10.62 vs. limit=10.0 2023-10-03 03:19:42,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:19:48,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:19:48,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:19:48,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:19:50,358 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 03:19:50,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:19:53,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:19:55,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 03:19:55,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 03:19:56,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 03:19:58,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 03:19:59,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:20:02,519 INFO [train.py:1046] (3/4) Epoch 32, batch 3250, loss[loss=0.1523, simple_loss=0.2277, pruned_loss=0.03844, over 23307.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2397, pruned_loss=0.04223, over 4683159.68 frames. ], batch size: 119, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:20:04,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:20:04,430 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 03:20:04,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:20:04,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:05,812 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.990e+02 2.302e+02 2.522e+02 3.377e+02, threshold=4.604e+02, percent-clipped=0.0 2023-10-03 03:20:05,980 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 03:20:10,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:20:14,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:20:18,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:20:18,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 03:20:18,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1119573.3333333333, ans=0.0 2023-10-03 03:20:19,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:20:19,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:20:19,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:20:21,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:20:21,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:20:23,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:24,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:20:25,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:20:25,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:25,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:26,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:20:29,081 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.75 vs. limit=10.0 2023-10-03 03:20:29,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1119573.3333333333, ans=0.125 2023-10-03 03:20:30,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:20:30,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:20:33,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:20:33,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:20:35,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:20:37,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:20:37,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:20:40,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1119640.0, ans=0.125 2023-10-03 03:20:41,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 03:20:41,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1119640.0, ans=0.125 2023-10-03 03:20:42,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:20:42,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:20:43,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:20:44,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:20:49,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1119706.6666666667, ans=0.125 2023-10-03 03:20:50,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:20:52,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1119706.6666666667, ans=0.2 2023-10-03 03:20:58,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:20:58,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:20:58,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 03:20:58,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:20:58,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 03:20:58,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:02,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 03:21:02,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 03:21:02,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:21:03,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:21:05,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:21:05,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 03:21:05,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:21:05,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1119773.3333333333, ans=0.0 2023-10-03 03:21:09,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:21:09,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:21:11,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 03:21:11,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:21:12,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1119773.3333333333, ans=0.125 2023-10-03 03:21:14,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:21:14,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 03:21:14,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1119773.3333333333, ans=0.09899494936611666 2023-10-03 03:21:15,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:21:15,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 03:21:16,789 INFO [train.py:1046] (3/4) Epoch 32, batch 3300, loss[loss=0.173, simple_loss=0.261, pruned_loss=0.0425, over 24546.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2408, pruned_loss=0.04259, over 4684829.57 frames. ], batch size: 71, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:21:18,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 03:21:19,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 03:21:19,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:21:22,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:21:23,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:21:23,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:25,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 03:21:25,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:21:29,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:21:32,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:21:35,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 03:21:35,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:21:37,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:21:38,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:38,514 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 03:21:38,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1119906.6666666667, ans=0.125 2023-10-03 03:21:39,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:21:39,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:21:41,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:21:41,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:21:42,508 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 03:21:45,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:21:45,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:21:48,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:48,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 03:21:49,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 03:21:49,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:21:50,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:21:54,779 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 03:21:56,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1119973.3333333333, ans=0.125 2023-10-03 03:21:57,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 03:21:57,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:22:00,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 03:22:02,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:22:06,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 03:22:06,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:22:06,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1120040.0, ans=0.0 2023-10-03 03:22:07,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:08,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:22:08,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:22:08,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:22:09,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1120040.0, ans=0.125 2023-10-03 03:22:10,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:22:11,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:22:13,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:22:14,571 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 03:22:15,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 03:22:17,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:22:18,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:22:18,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:22:20,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:22:20,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:22:21,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:22:21,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:21,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:22:23,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:22:25,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:22:29,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 03:22:29,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:22:30,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:32,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:22:32,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1120173.3333333333, ans=0.125 2023-10-03 03:22:33,242 INFO [train.py:1046] (3/4) Epoch 32, batch 3350, loss[loss=0.1655, simple_loss=0.2377, pruned_loss=0.04662, over 23844.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2415, pruned_loss=0.04201, over 4701612.34 frames. ], batch size: 164, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:22:33,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:22:34,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:36,516 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.816e+02 1.964e+02 2.229e+02 3.119e+02, threshold=3.928e+02, percent-clipped=0.0 2023-10-03 03:22:36,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:22:36,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:22:39,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:22:40,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:22:42,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1120173.3333333333, ans=0.0 2023-10-03 03:22:43,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:22:46,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:48,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:22:48,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:49,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:22:49,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 03:22:51,261 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 03:22:52,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:22:54,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 03:22:54,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 03:22:55,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:22:56,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:22:56,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:22:58,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 03:22:58,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:22:58,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:23:01,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:23:02,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:04,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:23:04,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:23:04,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1120306.6666666667, ans=0.2 2023-10-03 03:23:07,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:09,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:10,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:14,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:23:16,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:23:17,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:17,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:20,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:20,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 03:23:20,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:23:20,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 03:23:20,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:23:23,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 03:23:23,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:25,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:23:32,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:32,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 03:23:32,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:23:34,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:23:36,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:23:41,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:23:43,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 03:23:44,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:23:44,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:23:47,658 INFO [train.py:1046] (3/4) Epoch 32, batch 3400, loss[loss=0.2286, simple_loss=0.2919, pruned_loss=0.08262, over 19714.00 frames. ], tot_loss[loss=0.1644, simple_loss=0.2429, pruned_loss=0.04293, over 4696476.40 frames. ], batch size: 389, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:23:47,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:23:47,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 03:23:47,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:23:47,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 03:23:49,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:23:50,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:23:51,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:23:52,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:23:52,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 03:23:58,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 03:23:58,881 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 03:23:58,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:24:02,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:24:02,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:24:03,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:24:04,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:24:08,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:24:10,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 03:24:14,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:24:19,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:24:19,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:24:20,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 03:24:27,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:24:28,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1120640.0, ans=0.2 2023-10-03 03:24:30,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1120706.6666666667, ans=0.125 2023-10-03 03:24:31,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 03:24:35,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:24:37,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:24:37,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 03:24:37,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:24:39,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:24:39,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:24:39,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:24:42,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:24:45,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:24:45,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:24:49,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1120773.3333333333, ans=0.1 2023-10-03 03:24:50,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:24:51,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 03:24:57,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:25:00,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 03:25:01,504 INFO [train.py:1046] (3/4) Epoch 32, batch 3450, loss[loss=0.1701, simple_loss=0.2606, pruned_loss=0.03978, over 23998.00 frames. ], tot_loss[loss=0.1651, simple_loss=0.2438, pruned_loss=0.04323, over 4693133.98 frames. ], batch size: 80, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:25:03,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 03:25:03,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:25:04,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:25:04,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 03:25:04,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:25:06,246 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.880e+02 2.016e+02 2.211e+02 2.960e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-03 03:25:09,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:25:14,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:25:16,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:25:17,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:25:17,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:25:20,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:25:26,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 03:25:29,075 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.96 vs. limit=15.0 2023-10-03 03:25:31,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 03:25:31,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:25:32,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:25:32,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:25:38,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 03:25:39,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:25:43,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:25:43,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:25:45,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:25:46,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:25:48,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 03:25:48,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:25:50,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:25:52,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:25:54,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 03:25:57,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:26:01,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1121106.6666666667, ans=0.125 2023-10-03 03:26:01,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1121106.6666666667, ans=0.125 2023-10-03 03:26:02,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:26:04,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:07,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:26:11,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:12,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:26:12,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:26:12,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:26:15,036 INFO [train.py:1046] (3/4) Epoch 32, batch 3500, loss[loss=0.1353, simple_loss=0.1982, pruned_loss=0.03613, over 23463.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2416, pruned_loss=0.04282, over 4684362.28 frames. ], batch size: 285, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:26:19,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:26:21,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:26:21,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 03:26:23,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:26:26,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 03:26:28,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:26:28,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 03:26:30,795 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.02 vs. limit=15.0 2023-10-03 03:26:33,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:26:34,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:26:36,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:26:36,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:26:37,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:26:37,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:39,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:26:39,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 03:26:40,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:40,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 03:26:42,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:26:45,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:47,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 03:26:47,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:26:50,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:26:51,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:26:52,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:53,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1121306.6666666667, ans=0.0 2023-10-03 03:26:54,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:26:54,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:26:55,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 03:26:55,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 03:26:57,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 03:26:57,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:26:58,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:26:59,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:26:59,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:27:03,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 03:27:04,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:27:08,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:27:08,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 03:27:08,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 03:27:08,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:27:12,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:27:12,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:27:15,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:27:16,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 03:27:16,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:27:19,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:27:19,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 03:27:22,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 03:27:23,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:27:25,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:27:25,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:27:25,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:27,951 INFO [train.py:1046] (3/4) Epoch 32, batch 3550, loss[loss=0.1607, simple_loss=0.2394, pruned_loss=0.04096, over 24624.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2407, pruned_loss=0.0424, over 4689966.14 frames. ], batch size: 60, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:27:28,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1121506.6666666667, ans=0.1 2023-10-03 03:27:29,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:27:32,573 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.881e+02 2.109e+02 2.515e+02 3.801e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-03 03:27:34,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1121506.6666666667, ans=10.0 2023-10-03 03:27:37,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:38,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 03:27:43,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:27:43,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:27:46,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:27:46,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:27:46,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:27:49,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:27:49,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:27:50,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:50,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 03:27:52,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:27:55,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1121573.3333333333, ans=0.125 2023-10-03 03:27:56,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:27:56,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:27:57,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:27:57,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:27:59,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:27:59,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 03:27:59,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:28:00,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:28:01,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 03:28:01,895 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.34 vs. limit=22.5 2023-10-03 03:28:05,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:06,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:28:06,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:09,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 03:28:11,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:28:13,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 03:28:15,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:28:16,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:28:16,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:28:18,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1121706.6666666667, ans=0.0 2023-10-03 03:28:19,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 03:28:19,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1121706.6666666667, ans=0.125 2023-10-03 03:28:20,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:28:26,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:28:26,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 03:28:27,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:28:29,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:28:31,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 03:28:36,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 03:28:38,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:28:38,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:28:41,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:28:41,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:28:42,748 INFO [train.py:1046] (3/4) Epoch 32, batch 3600, loss[loss=0.1528, simple_loss=0.2405, pruned_loss=0.0326, over 24474.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2401, pruned_loss=0.04222, over 4691150.98 frames. ], batch size: 66, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:28:42,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:28:48,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:28:49,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:51,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:28:52,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:28:52,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:28:52,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 03:28:57,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:28:57,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:29:00,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:29:02,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:29:04,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:29:04,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:29:04,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 03:29:04,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1121906.6666666667, ans=0.125 2023-10-03 03:29:05,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:29:06,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:29:08,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:29:09,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:12,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:29:12,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:29:14,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 03:29:16,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.59 vs. limit=15.0 2023-10-03 03:29:19,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1121973.3333333333, ans=0.125 2023-10-03 03:29:20,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:29:21,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:29:21,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 03:29:26,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:29:32,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:34,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:40,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:29:40,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:29:40,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 03:29:41,049 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:29:42,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 03:29:42,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 03:29:43,156 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.95 vs. limit=15.0 2023-10-03 03:29:45,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:29:45,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:29:48,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 03:29:48,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:29:50,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:29:50,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:29:51,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 03:29:52,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 03:29:52,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1122106.6666666667, ans=0.125 2023-10-03 03:29:54,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.73 vs. limit=15.0 2023-10-03 03:29:55,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:29:55,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1122173.3333333333, ans=0.125 2023-10-03 03:29:57,116 INFO [train.py:1046] (3/4) Epoch 32, batch 3650, loss[loss=0.1716, simple_loss=0.2527, pruned_loss=0.04529, over 24681.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2413, pruned_loss=0.04258, over 4702087.79 frames. ], batch size: 65, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:29:57,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 03:30:02,615 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.895e+02 2.042e+02 2.308e+02 4.121e+02, threshold=4.085e+02, percent-clipped=0.0 2023-10-03 03:30:02,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 03:30:02,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1122173.3333333333, ans=0.035 2023-10-03 03:30:04,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:30:07,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1122173.3333333333, ans=0.0 2023-10-03 03:30:08,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 03:30:09,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 03:30:10,363 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.55 vs. limit=15.0 2023-10-03 03:30:14,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:30:14,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:30:14,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:30:15,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1122240.0, ans=0.2 2023-10-03 03:30:17,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 03:30:17,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:30:19,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 03:30:19,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:30:19,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:30:21,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 03:30:22,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:30:22,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:30:22,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:30:25,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:30:28,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 03:30:28,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 03:30:30,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:30:31,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 03:30:32,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:30:32,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:30:37,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:30:39,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:30:39,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:30:41,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:30:43,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:30:45,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:30:46,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1122373.3333333333, ans=0.125 2023-10-03 03:30:46,720 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.24 vs. limit=22.5 2023-10-03 03:30:49,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:30:50,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:30:50,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:30:51,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:30:53,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:30:53,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:00,744 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 03:31:03,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:31:04,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:04,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:31:04,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:31:06,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:31:07,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:31:09,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 03:31:09,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:31:10,296 INFO [train.py:1046] (3/4) Epoch 32, batch 3700, loss[loss=0.1765, simple_loss=0.2451, pruned_loss=0.05392, over 22713.00 frames. ], tot_loss[loss=0.1638, simple_loss=0.2421, pruned_loss=0.04277, over 4714244.62 frames. ], batch size: 322, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:31:13,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:31:14,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:31:16,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:31:18,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:31:18,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 03:31:18,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:31:21,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:31:21,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:31:21,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1122506.6666666667, ans=0.2 2023-10-03 03:31:25,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:31:27,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=12.0 2023-10-03 03:31:28,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:31:28,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:31:29,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:31:29,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:31:29,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:31:32,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:31:34,472 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 03:31:41,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1122640.0, ans=0.125 2023-10-03 03:31:42,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:31:42,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:31:44,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:31:44,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 03:31:44,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:31:49,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:49,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 03:31:50,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:52,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:31:55,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:31:56,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:31:57,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 03:32:00,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:32:00,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 03:32:02,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:32:02,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 03:32:08,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:32:08,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:32:10,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:32:10,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 03:32:12,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:32:12,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:32:13,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:32:13,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:32:16,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:32:17,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 03:32:19,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 03:32:19,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:32:19,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:32:21,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:32:22,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1122773.3333333333, ans=0.125 2023-10-03 03:32:23,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:32:24,558 INFO [train.py:1046] (3/4) Epoch 32, batch 3750, loss[loss=0.1526, simple_loss=0.2329, pruned_loss=0.03616, over 24334.00 frames. ], tot_loss[loss=0.1639, simple_loss=0.2426, pruned_loss=0.04255, over 4728215.22 frames. ], batch size: 61, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:32:25,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:32:27,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:32:27,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:32:30,023 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.897e+02 2.092e+02 2.385e+02 3.379e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-03 03:32:30,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 03:32:31,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 03:32:33,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 03:32:34,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 03:32:34,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:32:35,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:32:37,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:32:39,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:32:41,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:32:44,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:32:46,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:32:47,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:32:49,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:32:51,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 03:32:51,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1122906.6666666667, ans=0.0 2023-10-03 03:32:52,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:32:54,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:32:54,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:32:58,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 03:33:01,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 03:33:02,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:33:02,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:33:05,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:33:10,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:11,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 03:33:12,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1123040.0, ans=0.0 2023-10-03 03:33:13,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1123040.0, ans=0.125 2023-10-03 03:33:14,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 03:33:16,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:19,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1123040.0, ans=0.0 2023-10-03 03:33:20,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:33:20,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1123040.0, ans=0.1 2023-10-03 03:33:21,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:33:25,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:33:29,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 03:33:31,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 03:33:32,485 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.94 vs. limit=15.0 2023-10-03 03:33:33,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:33:33,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:33:35,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:33:36,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1123106.6666666667, ans=0.0 2023-10-03 03:33:39,278 INFO [train.py:1046] (3/4) Epoch 32, batch 3800, loss[loss=0.166, simple_loss=0.238, pruned_loss=0.04695, over 23886.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2426, pruned_loss=0.04319, over 4709391.96 frames. ], batch size: 195, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:33:42,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:33:46,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:33:47,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 03:33:47,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 03:33:49,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:51,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:33:51,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 03:33:54,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 03:33:54,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:33:55,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:33:56,241 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.62 vs. limit=15.0 2023-10-03 03:33:57,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:33:58,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:33:58,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:33:59,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 03:34:01,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 03:34:02,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:34:05,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:34:08,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:34:08,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:34:10,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:34:11,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:34:12,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:34:14,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:34:18,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:34:18,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 03:34:22,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:34:26,128 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.45 vs. limit=12.0 2023-10-03 03:34:26,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:34:32,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:34:34,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 03:34:36,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 03:34:36,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:34:38,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:34:38,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:34:41,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 03:34:44,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 03:34:44,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 03:34:44,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:34:45,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:34:51,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:34:52,994 INFO [train.py:1046] (3/4) Epoch 32, batch 3850, loss[loss=0.1692, simple_loss=0.2545, pruned_loss=0.04189, over 24556.00 frames. ], tot_loss[loss=0.1636, simple_loss=0.2413, pruned_loss=0.04291, over 4704217.65 frames. ], batch size: 71, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:34:53,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:34:54,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1123506.6666666667, ans=0.1 2023-10-03 03:34:56,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1123506.6666666667, ans=0.1 2023-10-03 03:34:57,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:34:57,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 03:34:59,361 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.881e+02 2.039e+02 2.318e+02 3.209e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 03:34:59,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:35:00,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:35:05,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:35:08,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:35:11,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 03:35:11,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1123573.3333333333, ans=0.1 2023-10-03 03:35:12,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 03:35:16,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:18,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:35:21,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:35:21,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:35:25,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:26,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:35:27,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:35:27,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:35:29,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:35:30,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:35:32,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:32,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:35:33,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 03:35:33,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 03:35:33,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1123640.0, ans=0.125 2023-10-03 03:35:34,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:35:34,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:36,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:37,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:37,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 03:35:39,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 03:35:39,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1123706.6666666667, ans=0.125 2023-10-03 03:35:41,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:44,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 03:35:44,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 03:35:49,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:50,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:35:54,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:35:54,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 03:35:57,202 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.11 vs. limit=15.0 2023-10-03 03:35:58,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 03:36:01,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:02,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:03,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:36:03,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:36:04,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:05,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:05,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:36:05,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 03:36:05,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:36:06,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 03:36:08,146 INFO [train.py:1046] (3/4) Epoch 32, batch 3900, loss[loss=0.16, simple_loss=0.2359, pruned_loss=0.04208, over 24459.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2404, pruned_loss=0.04245, over 4698937.60 frames. ], batch size: 58, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:36:08,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:08,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:09,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:36:09,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:09,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:36:11,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:36:11,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:36:12,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:36:12,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 03:36:12,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:16,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:36:17,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:36:18,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:36:18,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1123840.0, ans=0.0 2023-10-03 03:36:19,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:36:22,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:36:22,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:24,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:36:25,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 03:36:25,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:36:27,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 03:36:27,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:36:29,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 03:36:30,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 03:36:33,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:36:34,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:36:34,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:36:34,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:36:37,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:36:39,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:36:42,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:36:42,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:36:43,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:36:48,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1123973.3333333333, ans=0.125 2023-10-03 03:36:49,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:36:49,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:36:53,938 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.57 vs. limit=22.5 2023-10-03 03:36:56,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 03:36:57,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:37:04,713 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:37:07,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:37:10,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:37:10,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 03:37:10,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 03:37:10,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:37:12,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 03:37:12,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1124106.6666666667, ans=0.125 2023-10-03 03:37:13,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:37:14,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 03:37:20,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:37:21,746 INFO [train.py:1046] (3/4) Epoch 32, batch 3950, loss[loss=0.1807, simple_loss=0.2613, pruned_loss=0.05005, over 23900.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2412, pruned_loss=0.04269, over 4696678.65 frames. ], batch size: 86, lr: 3.18e-03, grad_scale: 8.0 2023-10-03 03:37:21,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 03:37:21,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:37:25,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:37:26,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:37:28,014 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.850e+02 2.026e+02 2.281e+02 3.100e+02, threshold=4.052e+02, percent-clipped=0.0 2023-10-03 03:37:30,887 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 03:37:30,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:37:32,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 03:37:32,366 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 03:37:32,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:37:36,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:37:36,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:37:36,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:37:37,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 03:37:38,502 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.08 vs. limit=10.0 2023-10-03 03:37:40,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:37:41,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:37:41,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:37:42,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:37:43,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:37:49,491 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:37:56,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:37:56,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:38:01,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 03:38:07,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 03:38:07,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 03:38:07,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:38:08,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:38:13,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:38:15,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:38:15,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:38:15,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:38:15,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 03:38:19,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:38:21,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:38:26,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 03:38:34,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:38:35,761 INFO [train.py:1046] (3/4) Epoch 32, batch 4000, loss[loss=0.1786, simple_loss=0.2531, pruned_loss=0.05205, over 23623.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2412, pruned_loss=0.04221, over 4703685.03 frames. ], batch size: 256, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:38:38,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:38:43,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1124506.6666666667, ans=0.0 2023-10-03 03:38:44,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:38:44,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:38:44,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:38:45,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 03:38:46,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:38:47,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 03:38:47,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:38:47,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 03:38:50,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:38:55,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:38:55,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:38:55,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:38:55,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:38:55,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 03:38:55,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:38:56,942 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 03:38:58,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:38:58,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:39:02,475 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 03:39:02,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:39:02,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:39:08,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 03:39:08,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:39:10,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:39:12,345 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 03:39:12,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1124640.0, ans=0.125 2023-10-03 03:39:14,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:39:15,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 03:39:15,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:39:16,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:39:18,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:39:19,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:39:19,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:39:21,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:39:22,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 03:39:22,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:39:25,199 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 03:39:29,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:39:29,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1124706.6666666667, ans=0.1 2023-10-03 03:39:33,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 03:39:34,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:39:34,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:39:36,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:39:37,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:39:41,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:39:44,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 03:39:45,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 03:39:46,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:39:47,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:39:47,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:39:49,023 INFO [train.py:1046] (3/4) Epoch 32, batch 4050, loss[loss=0.1599, simple_loss=0.2367, pruned_loss=0.04159, over 23344.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2415, pruned_loss=0.04247, over 4691945.30 frames. ], batch size: 119, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:39:49,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:39:50,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:39:53,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:39:55,674 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.778e+02 1.970e+02 2.195e+02 3.325e+02, threshold=3.940e+02, percent-clipped=0.0 2023-10-03 03:39:56,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1124840.0, ans=0.2 2023-10-03 03:39:57,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:39:58,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 03:40:01,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:40:01,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:40:05,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:40:06,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:40:08,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 03:40:11,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 03:40:11,267 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 03:40:13,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:40:21,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 03:40:23,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:40:25,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:40:29,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:40:31,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:40:31,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:40:35,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:40:38,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 03:40:38,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:40:39,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:40:40,404 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.81 vs. limit=15.0 2023-10-03 03:40:40,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 03:40:43,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:40:47,417 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.47 vs. limit=15.0 2023-10-03 03:40:49,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 03:40:52,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:40:52,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:40:53,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 03:40:53,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 03:40:53,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:40:57,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:40:57,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:40:57,686 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.96 vs. limit=15.0 2023-10-03 03:40:58,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:41:03,127 INFO [train.py:1046] (3/4) Epoch 32, batch 4100, loss[loss=0.1514, simple_loss=0.2251, pruned_loss=0.03883, over 24292.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2413, pruned_loss=0.0421, over 4699736.18 frames. ], batch size: 56, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:41:03,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1125173.3333333333, ans=0.0 2023-10-03 03:41:06,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 03:41:07,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 03:41:09,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 03:41:09,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1125173.3333333333, ans=0.125 2023-10-03 03:41:10,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 03:41:10,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:41:11,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:11,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:11,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:41:14,636 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 03:41:17,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:41:17,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:41:17,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:41:18,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:41:24,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:41:24,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:41:24,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:41:24,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 03:41:26,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1125240.0, ans=0.0 2023-10-03 03:41:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:27,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:41:27,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:41:27,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:41:27,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 03:41:30,213 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.36 vs. limit=12.0 2023-10-03 03:41:31,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:41:31,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 03:41:33,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:41:36,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:41:36,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 03:41:37,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:41:37,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:41:37,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:41:40,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.69 vs. limit=10.0 2023-10-03 03:41:40,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 03:41:41,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:41:42,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:41:43,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 03:41:45,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:41:45,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:41:47,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:41:50,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:41:52,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:41:54,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:42:03,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:42:03,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:42:06,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1125440.0, ans=0.0 2023-10-03 03:42:07,422 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.39 vs. limit=15.0 2023-10-03 03:42:09,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:42:10,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:42:10,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1125440.0, ans=0.0 2023-10-03 03:42:16,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:42:16,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:42:17,481 INFO [train.py:1046] (3/4) Epoch 32, batch 4150, loss[loss=0.1461, simple_loss=0.2116, pruned_loss=0.04031, over 23507.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2416, pruned_loss=0.04232, over 4702959.88 frames. ], batch size: 285, lr: 3.18e-03, grad_scale: 16.0 2023-10-03 03:42:17,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:42:17,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:42:20,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 03:42:20,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:42:20,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 03:42:21,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 03:42:21,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 03:42:22,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1125506.6666666667, ans=0.0 2023-10-03 03:42:23,088 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.897e+02 2.094e+02 2.297e+02 3.189e+02, threshold=4.189e+02, percent-clipped=0.0 2023-10-03 03:42:23,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:42:27,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:42:27,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:42:31,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:42:31,964 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.02 vs. limit=6.0 2023-10-03 03:42:32,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:42:34,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:42:36,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 03:42:36,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:42:36,806 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.35 vs. limit=15.0 2023-10-03 03:42:37,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 03:42:40,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:42:43,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:42:44,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 03:42:47,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 03:42:47,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:42:49,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 03:42:49,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:42:49,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:42:50,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:42:51,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:42:55,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 03:42:55,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1125640.0, ans=0.1 2023-10-03 03:42:58,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:43:00,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:43:01,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 03:43:01,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:43:04,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 03:43:06,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:43:08,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:43:10,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:43:11,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 03:43:11,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:11,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 03:43:11,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 03:43:14,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 03:43:14,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:43:14,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:43:14,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:43:17,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 03:43:17,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:43:17,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 03:43:17,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:43:17,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1125773.3333333333, ans=0.125 2023-10-03 03:43:17,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=1125773.3333333333, ans=15.0 2023-10-03 03:43:18,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:43:18,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 03:43:20,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 03:43:21,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1125773.3333333333, ans=0.0 2023-10-03 03:43:24,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:43:27,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 03:43:30,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:43:31,725 INFO [train.py:1046] (3/4) Epoch 32, batch 4200, loss[loss=0.1622, simple_loss=0.2518, pruned_loss=0.0363, over 24475.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2407, pruned_loss=0.04195, over 4711118.33 frames. ], batch size: 69, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:43:31,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:43:33,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:43:35,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:43:35,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:43:37,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 03:43:39,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 03:43:41,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:44,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:43:46,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:43:49,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 03:43:51,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:43:51,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:51,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1125906.6666666667, ans=0.2 2023-10-03 03:43:52,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 03:43:52,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:43:53,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:43:55,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:43:55,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:43:56,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:43:59,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 03:43:59,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:44:00,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1125973.3333333333, ans=0.125 2023-10-03 03:44:04,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 03:44:04,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1125973.3333333333, ans=0.0 2023-10-03 03:44:06,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:44:08,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:44:09,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:44:12,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:44:12,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 03:44:12,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:44:12,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1125973.3333333333, ans=0.125 2023-10-03 03:44:14,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:44:19,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 03:44:20,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:44:25,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:44:27,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 03:44:29,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:44:31,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1126106.6666666667, ans=0.125 2023-10-03 03:44:35,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:44:35,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1126106.6666666667, ans=0.125 2023-10-03 03:44:36,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1126106.6666666667, ans=0.0 2023-10-03 03:44:37,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:44:39,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 03:44:41,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 03:44:42,567 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.46 vs. limit=15.0 2023-10-03 03:44:45,807 INFO [train.py:1046] (3/4) Epoch 32, batch 4250, loss[loss=0.1588, simple_loss=0.2241, pruned_loss=0.04673, over 22741.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2394, pruned_loss=0.04143, over 4714734.85 frames. ], batch size: 322, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:44:47,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 03:44:47,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 03:44:50,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:44:51,398 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.847e+02 2.009e+02 2.181e+02 2.689e+02, threshold=4.019e+02, percent-clipped=0.0 2023-10-03 03:44:52,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:44:54,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 03:44:54,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:44:56,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:44:59,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:45:04,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:04,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1126240.0, ans=0.1 2023-10-03 03:45:06,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:09,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:45:09,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:45:10,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:12,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:13,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:15,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:45:16,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:45:16,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1126306.6666666667, ans=0.125 2023-10-03 03:45:17,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 03:45:20,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1126306.6666666667, ans=0.2 2023-10-03 03:45:21,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 03:45:21,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:23,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:45:23,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:45:24,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:45:24,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:24,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:45:26,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1126306.6666666667, ans=0.2 2023-10-03 03:45:28,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 03:45:30,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 03:45:35,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:45:36,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:45:37,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 03:45:37,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:45:37,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 03:45:39,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:45:40,796 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.21 vs. limit=10.0 2023-10-03 03:45:42,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:45:43,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:43,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:45:45,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 03:45:46,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 03:45:47,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:45:52,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:45:55,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:45:55,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:45:57,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:45:57,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1126506.6666666667, ans=0.0 2023-10-03 03:45:58,285 INFO [train.py:1046] (3/4) Epoch 32, batch 4300, loss[loss=0.1714, simple_loss=0.2548, pruned_loss=0.04404, over 23438.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2389, pruned_loss=0.0412, over 4705125.46 frames. ], batch size: 93, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:45:58,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:45:59,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:46:01,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:46:01,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 03:46:04,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:46:07,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1126506.6666666667, ans=0.2 2023-10-03 03:46:08,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:46:08,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:46:08,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1126506.6666666667, ans=0.1 2023-10-03 03:46:13,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:46:19,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:46:19,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 03:46:20,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:46:22,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:46:22,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:46:22,405 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 03:46:25,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:46:26,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:46:29,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 03:46:29,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:46:29,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 03:46:32,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:46:34,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:46:36,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1126640.0, ans=0.125 2023-10-03 03:46:37,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:46:37,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:46:39,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:46:39,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:46:41,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:46:42,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 03:46:42,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 03:46:45,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:46:47,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:46:47,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:46:47,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1126706.6666666667, ans=0.125 2023-10-03 03:46:48,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:46:48,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:46:48,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 03:46:48,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 03:46:49,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 03:46:51,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:46:51,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 03:46:51,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 03:46:55,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:46:56,800 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 03:46:56,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:46:59,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:46:59,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:47:02,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 03:47:02,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1126773.3333333333, ans=0.125 2023-10-03 03:47:03,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 03:47:03,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:03,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:47:03,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:47:03,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1126773.3333333333, ans=0.04949747468305833 2023-10-03 03:47:04,242 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.32 vs. limit=10.0 2023-10-03 03:47:05,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:47:06,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:47:10,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:47:12,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:12,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:47:13,585 INFO [train.py:1046] (3/4) Epoch 32, batch 4350, loss[loss=0.1607, simple_loss=0.2526, pruned_loss=0.0344, over 24659.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2401, pruned_loss=0.04168, over 4701720.86 frames. ], batch size: 73, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:47:16,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 03:47:16,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 03:47:19,714 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.807e+02 2.015e+02 2.247e+02 3.972e+02, threshold=4.030e+02, percent-clipped=0.0 2023-10-03 03:47:21,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:47:22,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:47:25,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:47:25,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:47:28,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1126906.6666666667, ans=0.125 2023-10-03 03:47:29,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:47:29,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1126906.6666666667, ans=0.1 2023-10-03 03:47:32,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:47:35,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1126906.6666666667, ans=0.0 2023-10-03 03:47:36,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:47:36,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:47:39,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:47:42,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:47:44,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:47:49,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 03:47:49,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:47:49,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:53,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:47:55,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 03:47:57,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:47:59,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:47:59,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1127040.0, ans=0.0 2023-10-03 03:48:01,856 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 03:48:03,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:04,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:48:06,389 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 03:48:07,749 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 03:48:07,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:48:07,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:09,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:48:09,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:11,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:48:11,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:48:14,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 03:48:14,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:14,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:48:14,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:16,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 03:48:16,392 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 03:48:16,396 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 03:48:17,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 03:48:20,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:48:20,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:48:20,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:21,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:48:24,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 03:48:24,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1127106.6666666667, ans=0.125 2023-10-03 03:48:27,283 INFO [train.py:1046] (3/4) Epoch 32, batch 4400, loss[loss=0.1576, simple_loss=0.2446, pruned_loss=0.03527, over 24684.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2408, pruned_loss=0.04179, over 4716923.44 frames. ], batch size: 65, lr: 3.17e-03, grad_scale: 32.0 2023-10-03 03:48:27,351 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 03:48:27,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:30,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:48:30,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:31,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:48:33,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 03:48:33,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 03:48:33,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 03:48:33,129 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 03:48:34,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 03:48:34,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:48:37,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 03:48:39,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:39,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:39,839 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 03:48:42,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:42,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 03:48:44,265 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 03:48:47,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 03:48:48,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 03:48:48,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 03:48:49,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:50,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:51,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:48:51,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1127240.0, ans=0.125 2023-10-03 03:48:52,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:48:54,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 03:48:54,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 03:48:55,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:57,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 03:48:57,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:48:58,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:48:58,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:48:59,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 03:49:00,043 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 03:49:03,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:49:10,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:49:13,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 03:49:16,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:49:20,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:49:21,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:49:21,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 03:49:22,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:49:23,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:49:23,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:49:23,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 03:49:23,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1127373.3333333333, ans=0.125 2023-10-03 03:49:25,988 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 03:49:27,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 03:49:29,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 03:49:29,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 03:49:31,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:49:31,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 03:49:32,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:49:35,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:49:35,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1127440.0, ans=0.1 2023-10-03 03:49:38,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 03:49:41,426 INFO [train.py:1046] (3/4) Epoch 32, batch 4450, loss[loss=0.1671, simple_loss=0.2531, pruned_loss=0.04053, over 24687.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2407, pruned_loss=0.04141, over 4732065.67 frames. ], batch size: 73, lr: 3.17e-03, grad_scale: 32.0 2023-10-03 03:49:41,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:49:45,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:49:45,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 03:49:46,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1127506.6666666667, ans=0.0 2023-10-03 03:49:47,471 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.837e+02 2.024e+02 2.337e+02 3.195e+02, threshold=4.048e+02, percent-clipped=0.0 2023-10-03 03:49:50,336 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.34 vs. limit=15.0 2023-10-03 03:49:52,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:49:53,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:49:55,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1127573.3333333333, ans=0.125 2023-10-03 03:49:57,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:49:59,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:49:59,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1127573.3333333333, ans=0.125 2023-10-03 03:49:59,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1127573.3333333333, ans=0.1 2023-10-03 03:50:02,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:50:02,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:50:02,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 03:50:02,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:50:02,850 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.22 vs. limit=10.0 2023-10-03 03:50:03,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:50:03,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:50:03,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 03:50:03,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1127573.3333333333, ans=0.1 2023-10-03 03:50:05,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=1127573.3333333333, ans=15.0 2023-10-03 03:50:06,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 03:50:08,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1127573.3333333333, ans=0.125 2023-10-03 03:50:09,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:10,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:12,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:50:12,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:50:14,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:50:19,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 03:50:19,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 03:50:19,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 03:50:19,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:50:21,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1127640.0, ans=0.0 2023-10-03 03:50:22,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:50:23,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 03:50:23,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1127640.0, ans=0.0 2023-10-03 03:50:26,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 03:50:26,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1127706.6666666667, ans=0.0 2023-10-03 03:50:31,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:31,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 03:50:31,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:50:31,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:50:31,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:50:31,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:50:31,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1127706.6666666667, ans=0.1 2023-10-03 03:50:33,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:50:34,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1127706.6666666667, ans=0.1 2023-10-03 03:50:35,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 03:50:37,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 03:50:38,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 03:50:38,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:50:40,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:50:42,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:50:42,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:50:45,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:50:48,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 03:50:49,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:50:54,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:50:54,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1127840.0, ans=0.0 2023-10-03 03:50:55,312 INFO [train.py:1046] (3/4) Epoch 32, batch 4500, loss[loss=0.1523, simple_loss=0.2355, pruned_loss=0.03452, over 24503.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2408, pruned_loss=0.04168, over 4719245.49 frames. ], batch size: 63, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:50:56,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 03:50:56,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 03:50:58,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:51:03,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:51:04,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:51:05,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 03:51:06,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:51:06,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:51:06,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:51:08,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1127906.6666666667, ans=0.125 2023-10-03 03:51:15,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:51:17,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:51:17,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1127906.6666666667, ans=0.125 2023-10-03 03:51:20,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:51:20,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 03:51:22,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:51:29,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:51:33,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:51:36,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:51:38,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 03:51:38,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 03:51:40,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:51:41,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:51:43,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:51:43,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:51:45,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:51:45,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1128040.0, ans=0.0 2023-10-03 03:51:46,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 03:51:46,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 03:51:46,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:51:51,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:51:51,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 03:51:55,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:51:56,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 03:51:56,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:51:58,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 03:51:59,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 03:51:59,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 03:52:02,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 03:52:05,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 03:52:05,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1128106.6666666667, ans=0.07 2023-10-03 03:52:06,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:52:08,139 INFO [train.py:1046] (3/4) Epoch 32, batch 4550, loss[loss=0.1525, simple_loss=0.2314, pruned_loss=0.03679, over 24602.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2402, pruned_loss=0.04173, over 4719694.37 frames. ], batch size: 60, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:52:10,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:52:12,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:52:14,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:52:15,515 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.941e+02 2.112e+02 2.362e+02 4.046e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-03 03:52:19,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:52:22,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:52:24,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:52:24,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:52:24,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:25,896 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.26 vs. limit=15.0 2023-10-03 03:52:26,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:52:26,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:52:30,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:52:31,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1128240.0, ans=0.0 2023-10-03 03:52:35,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 03:52:35,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 03:52:36,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 03:52:39,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 03:52:40,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 03:52:40,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:52:43,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 03:52:45,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:52:48,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:48,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:48,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 03:52:50,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 03:52:52,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:52:55,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:52:55,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:52:56,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:52:57,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 03:52:57,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 03:52:59,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:52:59,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 03:53:00,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 03:53:00,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:53:02,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:02,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:53:03,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:53:03,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:53:05,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 03:53:05,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 03:53:06,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:53:06,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 03:53:07,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 03:53:07,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 03:53:07,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 03:53:09,746 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.21 vs. limit=6.0 2023-10-03 03:53:10,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:53:10,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:53:14,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:53:15,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:53:16,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 03:53:16,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:53:19,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 03:53:21,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:22,889 INFO [train.py:1046] (3/4) Epoch 32, batch 4600, loss[loss=0.1518, simple_loss=0.2197, pruned_loss=0.042, over 22826.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2392, pruned_loss=0.04157, over 4714222.95 frames. ], batch size: 322, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:53:22,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:53:25,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:53:27,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:53:27,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1128506.6666666667, ans=0.1 2023-10-03 03:53:28,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:53:29,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 03:53:31,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:53:34,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1128506.6666666667, ans=0.125 2023-10-03 03:53:35,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:53:36,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:53:38,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:45,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 03:53:45,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:49,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:53:52,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:53:52,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:53:58,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 03:53:58,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 03:53:58,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1128640.0, ans=0.125 2023-10-03 03:53:59,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:54:03,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:03,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:54:06,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 03:54:06,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1128706.6666666667, ans=0.05 2023-10-03 03:54:07,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 03:54:10,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 03:54:13,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:13,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1128706.6666666667, ans=0.125 2023-10-03 03:54:16,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:54:18,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:19,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 03:54:19,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:20,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 03:54:20,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:20,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:54:21,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:23,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:54:24,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:54:24,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 03:54:26,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 03:54:26,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 03:54:26,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:54:28,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:54:29,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:54:30,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:54:32,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1128773.3333333333, ans=0.1 2023-10-03 03:54:36,615 INFO [train.py:1046] (3/4) Epoch 32, batch 4650, loss[loss=0.1681, simple_loss=0.2414, pruned_loss=0.04741, over 23826.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2389, pruned_loss=0.04106, over 4723899.30 frames. ], batch size: 164, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:54:38,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:54:40,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:54:41,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:42,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:54:43,537 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.926e+02 2.147e+02 2.476e+02 3.690e+02, threshold=4.294e+02, percent-clipped=0.0 2023-10-03 03:54:43,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:54:43,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:54:44,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:54:48,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 03:54:49,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:54:51,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 03:54:51,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1128906.6666666667, ans=0.2 2023-10-03 03:54:53,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:54:55,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 03:54:55,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:54:55,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 03:54:55,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 03:54:55,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:54:57,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 03:54:59,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 03:55:01,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:02,501 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 03:55:02,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1128906.6666666667, ans=0.125 2023-10-03 03:55:02,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1128906.6666666667, ans=0.025 2023-10-03 03:55:05,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:06,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 03:55:10,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:55:10,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:55:11,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 03:55:13,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:55:16,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 03:55:20,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:55:25,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:55:26,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:28,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:55:28,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 03:55:33,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 03:55:33,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 03:55:33,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 03:55:33,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 03:55:36,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:55:43,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:55:43,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:55:43,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 03:55:43,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:55:44,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:55:44,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:55:45,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 03:55:47,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 03:55:47,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:55:49,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:55:50,382 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.06 vs. limit=22.5 2023-10-03 03:55:50,650 INFO [train.py:1046] (3/4) Epoch 32, batch 4700, loss[loss=0.1726, simple_loss=0.254, pruned_loss=0.04561, over 23646.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2394, pruned_loss=0.04084, over 4735561.02 frames. ], batch size: 85, lr: 3.17e-03, grad_scale: 8.0 2023-10-03 03:55:52,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:55:52,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:55:52,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 03:55:53,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 03:55:53,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1129173.3333333333, ans=0.0 2023-10-03 03:55:55,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 03:55:55,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 03:56:04,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:06,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:56:06,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:56:07,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:56:09,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 03:56:10,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1129240.0, ans=0.0 2023-10-03 03:56:13,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 03:56:13,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 03:56:17,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:17,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:56:18,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:56:20,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1129306.6666666667, ans=0.125 2023-10-03 03:56:22,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:27,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:56:27,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 03:56:31,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:56:36,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 03:56:37,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 03:56:38,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:39,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1129373.3333333333, ans=0.125 2023-10-03 03:56:41,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 03:56:42,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:56:47,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:56:47,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 03:56:49,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:49,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:56:50,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1129440.0, ans=0.125 2023-10-03 03:56:53,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:56:53,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:56:54,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 03:56:55,672 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 03:56:57,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:56:58,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:58,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:56:58,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 03:57:00,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:57:03,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 03:57:04,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1129506.6666666667, ans=0.0 2023-10-03 03:57:05,060 INFO [train.py:1046] (3/4) Epoch 32, batch 4750, loss[loss=0.1679, simple_loss=0.2479, pruned_loss=0.04397, over 23323.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2404, pruned_loss=0.04127, over 4735108.21 frames. ], batch size: 93, lr: 3.17e-03, grad_scale: 8.0 2023-10-03 03:57:07,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:57:09,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:14,007 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.934e+02 2.171e+02 2.465e+02 4.386e+02, threshold=4.342e+02, percent-clipped=1.0 2023-10-03 03:57:14,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:14,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:57:15,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 03:57:15,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:57:18,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 03:57:19,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 03:57:19,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:57:20,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:57:24,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 03:57:29,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 03:57:30,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 03:57:30,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:57:33,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:57:33,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:57:35,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:35,291 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 03:57:35,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 03:57:42,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.79 vs. limit=6.0 2023-10-03 03:57:42,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 03:57:43,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:57:44,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.40 vs. limit=6.0 2023-10-03 03:57:45,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:57:46,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:57:46,861 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 03:57:46,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:57:47,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1129640.0, ans=0.2 2023-10-03 03:57:51,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 03:57:52,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 03:57:55,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 03:57:55,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 03:57:56,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:57:57,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:57:58,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:57:58,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 03:57:58,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 03:58:01,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 03:58:05,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:09,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:58:09,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 03:58:09,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:58:09,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.37 vs. limit=12.0 2023-10-03 03:58:10,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:13,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 03:58:13,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:14,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1129773.3333333333, ans=0.0 2023-10-03 03:58:15,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 03:58:18,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:58:18,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 03:58:18,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 03:58:19,446 INFO [train.py:1046] (3/4) Epoch 32, batch 4800, loss[loss=0.154, simple_loss=0.2476, pruned_loss=0.03018, over 24682.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2411, pruned_loss=0.04158, over 4717893.09 frames. ], batch size: 73, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:58:19,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 03:58:24,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 03:58:24,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:58:24,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 03:58:27,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1129840.0, ans=0.125 2023-10-03 03:58:28,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:29,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:31,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1129840.0, ans=0.125 2023-10-03 03:58:35,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 03:58:36,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:58:37,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:39,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 03:58:40,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 03:58:40,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 03:58:42,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 03:58:46,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:58:47,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:47,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 03:58:48,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1129973.3333333333, ans=0.1 2023-10-03 03:58:49,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:49,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 03:58:49,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:49,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:58:52,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:58:54,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:56,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 03:58:56,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 03:58:56,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 03:58:56,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:58:58,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 03:58:58,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 03:58:59,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:59:01,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 03:59:01,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 03:59:01,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:59:01,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 03:59:03,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 03:59:04,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 03:59:06,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1130040.0, ans=0.125 2023-10-03 03:59:07,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:59:07,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1130040.0, ans=0.0 2023-10-03 03:59:08,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:10,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:16,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 03:59:16,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:59:17,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:17,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 03:59:18,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:59:23,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 03:59:23,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 03:59:23,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:24,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 03:59:24,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 03:59:26,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 03:59:26,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1130106.6666666667, ans=0.125 2023-10-03 03:59:29,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:29,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:29,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 03:59:30,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 03:59:33,569 INFO [train.py:1046] (3/4) Epoch 32, batch 4850, loss[loss=0.1424, simple_loss=0.2063, pruned_loss=0.03926, over 23534.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2416, pruned_loss=0.04225, over 4700790.03 frames. ], batch size: 285, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 03:59:33,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 03:59:33,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:59:33,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 03:59:35,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 03:59:35,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:38,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 03:59:39,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1130173.3333333333, ans=0.0 2023-10-03 03:59:42,445 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.847e+02 2.113e+02 2.344e+02 3.781e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 03:59:44,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 03:59:45,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1130173.3333333333, ans=0.125 2023-10-03 03:59:47,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:48,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1130240.0, ans=0.125 2023-10-03 03:59:51,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 03:59:52,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 03:59:52,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 03:59:55,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 03:59:57,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 03:59:58,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 03:59:58,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 04:00:02,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:00:06,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:00:06,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:00:06,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:00:06,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 04:00:09,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:00:09,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:12,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:12,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 04:00:13,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 04:00:15,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:00:22,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:00:22,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 04:00:24,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:00:24,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:00:25,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:00:27,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 04:00:27,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:27,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 04:00:28,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:00:30,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:00:30,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 04:00:35,617 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-10-03 04:00:37,975 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.03 vs. limit=22.5 2023-10-03 04:00:39,102 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.02 vs. limit=15.0 2023-10-03 04:00:39,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:00:43,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:00:43,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:00:48,767 INFO [train.py:1046] (3/4) Epoch 32, batch 4900, loss[loss=0.1595, simple_loss=0.2414, pruned_loss=0.03881, over 12877.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.2411, pruned_loss=0.04192, over 4709653.27 frames. ], batch size: 27, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:00:50,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 04:00:50,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:00:54,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:00:54,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1130506.6666666667, ans=0.1 2023-10-03 04:00:57,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:00:57,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:00:57,646 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.11 vs. limit=15.0 2023-10-03 04:00:58,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 04:01:01,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1130573.3333333333, ans=0.125 2023-10-03 04:01:05,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 04:01:09,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 04:01:10,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 04:01:10,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:01:10,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:01:10,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:01:11,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:01:11,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:01:11,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 04:01:13,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 04:01:15,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:01:15,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:01:17,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:01:19,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:01:19,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:01:21,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1130640.0, ans=0.1 2023-10-03 04:01:22,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:01:22,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 04:01:24,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:01:25,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:01:25,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 04:01:25,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 04:01:26,058 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.27 vs. limit=6.0 2023-10-03 04:01:29,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 04:01:30,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1130640.0, ans=0.125 2023-10-03 04:01:32,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:01:34,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:01:34,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:01:35,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:01:35,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 04:01:36,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:01:37,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 04:01:38,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:01:41,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:01:43,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:01:46,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 04:01:46,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:01:46,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 04:01:46,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 04:01:48,470 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.94 vs. limit=15.0 2023-10-03 04:01:53,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:01:55,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:01:55,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 04:01:55,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 04:01:55,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:01:57,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:02:00,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:02:01,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:02:01,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:02:01,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 04:02:02,563 INFO [train.py:1046] (3/4) Epoch 32, batch 4950, loss[loss=0.1658, simple_loss=0.2533, pruned_loss=0.03919, over 24471.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2408, pruned_loss=0.04191, over 4707605.94 frames. ], batch size: 69, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:02:02,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:02:07,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:02:07,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 04:02:08,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 04:02:08,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 04:02:10,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:02:10,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 04:02:10,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:10,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:02:11,966 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.926e+02 2.133e+02 2.555e+02 3.988e+02, threshold=4.267e+02, percent-clipped=0.0 2023-10-03 04:02:12,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:02:12,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:13,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:02:13,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:02:16,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:02:18,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:02:19,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:19,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:02:19,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1130906.6666666667, ans=0.1 2023-10-03 04:02:22,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:02:29,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:31,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:02:32,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:32,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:35,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:02:36,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 04:02:37,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 04:02:39,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:41,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:02:41,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:02:44,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:02:44,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:02:44,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:02:45,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:02:47,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1131040.0, ans=0.125 2023-10-03 04:02:48,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:02:50,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:02:53,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:02:53,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:02:53,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 04:02:53,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:02:56,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:03:00,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:03:00,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1131106.6666666667, ans=0.0 2023-10-03 04:03:01,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:03:01,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:03:02,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:03:02,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:03:03,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:03:04,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:03:06,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:03:06,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:03:06,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 04:03:08,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1131106.6666666667, ans=0.125 2023-10-03 04:03:11,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:03:15,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 04:03:15,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:03:17,016 INFO [train.py:1046] (3/4) Epoch 32, batch 5000, loss[loss=0.1598, simple_loss=0.2363, pruned_loss=0.0417, over 23688.00 frames. ], tot_loss[loss=0.1625, simple_loss=0.241, pruned_loss=0.04201, over 4711354.28 frames. ], batch size: 232, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:03:21,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:03:21,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:03:23,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 04:03:24,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 04:03:26,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:03:27,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 04:03:28,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:03:28,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:03:28,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 04:03:30,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:03:30,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:03:31,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 04:03:31,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:03:31,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:03:34,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 04:03:36,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 04:03:36,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:03:36,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 04:03:38,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:03:38,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:40,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:03:40,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 04:03:40,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 04:03:41,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 04:03:41,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:03:43,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:44,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 04:03:44,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:03:45,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:47,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:03:47,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 04:03:48,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 04:03:48,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:03:48,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:03:54,594 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 04:03:58,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:03:58,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:03:58,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:01,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 04:04:01,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:04:01,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:04:01,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:04:03,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 04:04:04,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1131373.3333333333, ans=0.125 2023-10-03 04:04:05,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:04:08,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:04:09,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:04:14,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 04:04:14,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1131373.3333333333, ans=0.125 2023-10-03 04:04:16,123 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.61 vs. limit=6.0 2023-10-03 04:04:18,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:23,805 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.16 vs. limit=15.0 2023-10-03 04:04:25,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:04:27,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:27,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:04:27,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:04:28,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:04:28,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:04:29,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:30,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1131506.6666666667, ans=0.2 2023-10-03 04:04:31,093 INFO [train.py:1046] (3/4) Epoch 32, batch 5050, loss[loss=0.1677, simple_loss=0.251, pruned_loss=0.04225, over 23257.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2409, pruned_loss=0.04225, over 4694087.49 frames. ], batch size: 93, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:04:32,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:04:34,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 04:04:34,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:04:36,949 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.05 vs. limit=12.0 2023-10-03 04:04:37,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:04:39,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:04:39,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 04:04:40,633 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.847e+02 2.039e+02 2.357e+02 3.411e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 04:04:40,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:04:40,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:04:43,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:04:43,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:04:44,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:04:46,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1131573.3333333333, ans=0.0 2023-10-03 04:04:49,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1131573.3333333333, ans=0.2 2023-10-03 04:04:50,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 04:04:51,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:04:53,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:04:53,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 04:04:53,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:04:55,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:04:55,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:04:56,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:04:56,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 04:04:56,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 04:04:57,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:05:00,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:05:03,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:05:05,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 04:05:06,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:05:10,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 04:05:13,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:05:13,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:05:13,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:05:13,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1131640.0, ans=0.125 2023-10-03 04:05:14,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:05:17,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:05:18,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:05:20,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:20,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:05:20,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:05:21,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 04:05:21,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:05:23,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:05:26,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:05:26,647 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 04:05:27,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:05:28,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:05:29,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:29,397 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 04:05:32,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:05:32,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 04:05:32,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:36,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:05:37,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1131773.3333333333, ans=0.125 2023-10-03 04:05:38,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:05:38,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 04:05:38,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 04:05:39,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1131773.3333333333, ans=0.2 2023-10-03 04:05:42,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:05:42,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:05:42,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:05:44,723 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 04:05:45,214 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.46 vs. limit=15.0 2023-10-03 04:05:45,976 INFO [train.py:1046] (3/4) Epoch 32, batch 5100, loss[loss=0.1715, simple_loss=0.2482, pruned_loss=0.04741, over 23766.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2412, pruned_loss=0.04208, over 4706254.80 frames. ], batch size: 164, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:05:46,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:05:48,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 04:05:48,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 04:05:50,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:05:51,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1131840.0, ans=0.0 2023-10-03 04:05:52,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:05:53,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1131840.0, ans=15.0 2023-10-03 04:05:54,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:05:54,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1131840.0, ans=0.1 2023-10-03 04:05:55,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 04:05:55,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 04:05:59,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:05:59,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:06:02,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:06:05,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 04:06:07,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:06:08,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:06:08,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 04:06:11,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:11,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:11,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 04:06:14,648 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 04:06:14,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:14,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 04:06:14,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 04:06:20,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:06:27,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:06:30,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 04:06:30,514 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 04:06:30,522 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 04:06:33,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 04:06:33,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:06:36,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 04:06:41,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 04:06:44,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 04:06:44,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1132106.6666666667, ans=0.025 2023-10-03 04:06:46,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:06:48,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 04:06:53,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:06:53,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 04:06:58,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:06:58,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:06:58,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:07:00,024 INFO [train.py:1046] (3/4) Epoch 32, batch 5150, loss[loss=0.138, simple_loss=0.2244, pruned_loss=0.02581, over 24439.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2416, pruned_loss=0.04223, over 4721965.72 frames. ], batch size: 63, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:07:00,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:07:00,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:07:01,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:07:01,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 04:07:01,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 04:07:02,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 04:07:02,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:07:02,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 04:07:04,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:07:04,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 04:07:04,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1132173.3333333333, ans=0.1 2023-10-03 04:07:06,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:07:07,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:07:09,080 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.942e+02 2.192e+02 2.524e+02 4.905e+02, threshold=4.384e+02, percent-clipped=1.0 2023-10-03 04:07:09,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1132173.3333333333, ans=0.015 2023-10-03 04:07:10,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 04:07:12,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 04:07:12,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:07:12,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:07:12,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1132173.3333333333, ans=0.125 2023-10-03 04:07:15,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:07:15,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:07:15,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:07:16,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:07:16,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:07:17,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 04:07:18,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:07:18,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:07:20,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:07:22,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 04:07:24,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:07:29,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:07:31,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 04:07:31,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1132306.6666666667, ans=0.125 2023-10-03 04:07:33,016 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.15 vs. limit=6.0 2023-10-03 04:07:35,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:07:41,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:07:43,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:07:43,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1132373.3333333333, ans=0.125 2023-10-03 04:07:45,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1132373.3333333333, ans=0.2 2023-10-03 04:07:47,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:07:47,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:07:50,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 04:07:52,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:07:54,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:07:54,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:07:56,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1132373.3333333333, ans=0.2 2023-10-03 04:07:57,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:07:57,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:07:59,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 04:08:02,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1132440.0, ans=0.125 2023-10-03 04:08:04,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:08:05,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:08:08,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:08:08,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:08:10,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:08:10,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:08:10,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:08:10,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:08:14,901 INFO [train.py:1046] (3/4) Epoch 32, batch 5200, loss[loss=0.1682, simple_loss=0.2445, pruned_loss=0.04594, over 24676.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2415, pruned_loss=0.04189, over 4724557.74 frames. ], batch size: 65, lr: 3.17e-03, grad_scale: 32.0 2023-10-03 04:08:15,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:08:16,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:08:19,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:08:24,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 04:08:25,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:08:25,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1132506.6666666667, ans=0.1 2023-10-03 04:08:26,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:08:28,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:08:29,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:08:29,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:08:29,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1132573.3333333333, ans=0.125 2023-10-03 04:08:32,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 04:08:35,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:08:35,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:08:36,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 04:08:39,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:08:40,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:08:41,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 04:08:41,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 04:08:44,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 04:08:45,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:08:45,697 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 04:08:45,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:08:47,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:08:47,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:08:47,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1132640.0, ans=0.07 2023-10-03 04:08:49,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 04:08:49,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:08:51,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:08:53,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 04:08:55,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 04:08:55,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 04:09:00,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-10-03 04:09:00,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 04:09:00,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:09:06,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:09:06,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:09:07,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 04:09:07,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:09:08,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.43 vs. limit=12.0 2023-10-03 04:09:08,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:09:08,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:09:09,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:09:13,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:09:13,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:09:18,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:09:18,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:09:18,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:09:24,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:09:25,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 04:09:25,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:09:25,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:09:25,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1132773.3333333333, ans=0.125 2023-10-03 04:09:27,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:09:28,320 INFO [train.py:1046] (3/4) Epoch 32, batch 5250, loss[loss=0.1506, simple_loss=0.2302, pruned_loss=0.03556, over 24652.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2409, pruned_loss=0.04174, over 4727855.90 frames. ], batch size: 65, lr: 3.17e-03, grad_scale: 16.0 2023-10-03 04:09:28,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:09:28,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:09:30,707 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.33 vs. limit=15.0 2023-10-03 04:09:31,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:09:31,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1132840.0, ans=0.125 2023-10-03 04:09:34,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:09:34,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:09:35,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:09:38,124 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.839e+02 2.059e+02 2.239e+02 2.945e+02, threshold=4.119e+02, percent-clipped=0.0 2023-10-03 04:09:39,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:09:41,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:09:44,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:09:46,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:09:48,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 04:09:49,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:09:49,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:10:11,572 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.05 vs. limit=22.5 2023-10-03 04:10:16,874 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.22 vs. limit=15.0 2023-10-03 04:10:17,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1133040.0, ans=0.1 2023-10-03 04:10:27,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1133106.6666666667, ans=0.0 2023-10-03 04:10:36,643 INFO [train.py:1046] (3/4) Epoch 32, batch 5300, loss[loss=0.1573, simple_loss=0.2285, pruned_loss=0.04306, over 23671.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2394, pruned_loss=0.04139, over 4739333.18 frames. ], batch size: 232, lr: 3.16e-03, grad_scale: 16.0 2023-10-03 04:10:45,673 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.75 vs. limit=15.0 2023-10-03 04:10:51,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:10:51,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 04:10:51,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 04:10:51,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:10:51,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:51,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:51,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:51,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:10:51,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:10:51,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:10:51,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:10:52,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:10:52,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 04:10:52,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 04:10:52,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 04:10:52,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:10:52,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 04:10:52,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 04:10:52,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:53,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:10:53,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:10:53,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:10:53,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:10:53,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:10:54,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:10:54,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:54,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:10:54,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:10:54,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:10:54,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:54,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:10:54,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 04:10:54,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:10:55,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:10:55,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 04:10:55,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 04:10:55,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:10:55,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:10:55,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 04:10:55,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 04:10:55,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:10:55,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:10:55,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:10:56,492 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 04:10:56,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 04:10:56,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:10:56,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:10:56,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 04:10:56,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 04:10:56,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 04:10:56,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:10:59,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1133253.3333333333, ans=0.1 2023-10-03 04:11:03,412 INFO [train.py:1046] (3/4) Epoch 33, batch 0, loss[loss=0.1684, simple_loss=0.2451, pruned_loss=0.04583, over 23761.00 frames. ], tot_loss[loss=0.1684, simple_loss=0.2451, pruned_loss=0.04583, over 23761.00 frames. ], batch size: 232, lr: 3.12e-03, grad_scale: 32.0 2023-10-03 04:11:03,412 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 04:11:15,270 INFO [train.py:1078] (3/4) Epoch 33, validation: loss=0.326, simple_loss=0.2728, pruned_loss=0.1896, over 1125622.00 frames. 2023-10-03 04:11:15,270 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 04:11:16,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 04:11:17,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:11:18,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:11:20,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1133253.3333333333, ans=0.125 2023-10-03 04:11:22,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:11:22,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:11:24,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:24,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 04:11:25,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 04:11:27,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:27,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:31,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:11:31,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:11:32,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:11:32,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:11:34,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 04:11:35,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:11:42,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:11:42,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:11:42,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1133320.0, ans=0.125 2023-10-03 04:11:45,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 04:11:48,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:11:48,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:11:50,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1133386.6666666667, ans=0.1 2023-10-03 04:11:51,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:11:55,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:11:59,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:11:59,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1133453.3333333333, ans=10.0 2023-10-03 04:12:05,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 04:12:05,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1133453.3333333333, ans=0.1 2023-10-03 04:12:08,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 04:12:09,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:12:09,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:12:11,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:12:11,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:12:13,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 04:12:14,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:12:15,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1133520.0, ans=0.1 2023-10-03 04:12:16,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:12:21,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:12:21,887 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.48 vs. limit=15.0 2023-10-03 04:12:22,666 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 04:12:23,924 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.812e+02 1.985e+02 2.280e+02 3.382e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-03 04:12:24,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:12:25,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1133520.0, ans=0.1 2023-10-03 04:12:26,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:12:28,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1133586.6666666667, ans=0.125 2023-10-03 04:12:29,398 INFO [train.py:1046] (3/4) Epoch 33, batch 50, loss[loss=0.1715, simple_loss=0.2465, pruned_loss=0.04823, over 23385.00 frames. ], tot_loss[loss=0.1643, simple_loss=0.2439, pruned_loss=0.04234, over 1072087.34 frames. ], batch size: 119, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:12:29,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:12:29,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 04:12:29,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:12:30,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:12:32,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:12:32,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:12:34,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:12:35,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1133586.6666666667, ans=0.1 2023-10-03 04:12:37,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 04:12:37,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:12:38,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1133586.6666666667, ans=0.0 2023-10-03 04:12:44,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:12:47,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 04:12:48,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 04:12:49,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1133653.3333333333, ans=0.2 2023-10-03 04:12:50,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:12:52,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:12:52,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:12:53,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:12:54,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:12:54,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:12:54,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:12:55,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1133653.3333333333, ans=10.0 2023-10-03 04:13:00,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1133720.0, ans=0.125 2023-10-03 04:13:02,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:13:02,773 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.53 vs. limit=15.0 2023-10-03 04:13:03,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:13:03,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:13:04,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 04:13:05,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1133720.0, ans=0.125 2023-10-03 04:13:05,791 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.53 vs. limit=10.0 2023-10-03 04:13:06,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:13:07,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:13:07,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 04:13:09,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:13:10,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 04:13:18,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:13:20,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:13:20,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:13:21,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:13:21,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:13:25,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 04:13:25,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 04:13:26,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:13:27,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:13:29,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:13:29,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:13:30,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 04:13:30,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 04:13:32,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 04:13:33,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:13:34,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:13:34,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 04:13:34,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 04:13:37,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:13:37,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:13:40,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:13:40,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:13:41,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:13:43,219 INFO [train.py:1046] (3/4) Epoch 33, batch 100, loss[loss=0.1733, simple_loss=0.2394, pruned_loss=0.05357, over 23796.00 frames. ], tot_loss[loss=0.1658, simple_loss=0.2452, pruned_loss=0.04322, over 1886835.13 frames. ], batch size: 164, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:13:44,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:13:46,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1133920.0, ans=0.125 2023-10-03 04:13:48,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:13:50,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 04:13:50,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:13:55,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:13:55,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:13:56,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:13:56,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:13:56,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:13:56,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1133920.0, ans=0.125 2023-10-03 04:13:57,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 04:14:00,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:14:00,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:14:00,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:00,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:14:04,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 04:14:04,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:14:05,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:07,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:14:08,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:14:12,731 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 04:14:12,760 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 04:14:14,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:14:14,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:14:17,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:14:19,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:14:20,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:26,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:28,023 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 04:14:29,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 04:14:32,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1134120.0, ans=10.0 2023-10-03 04:14:33,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:14:36,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:14:37,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:40,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:14:41,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:14:43,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:14:43,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1134186.6666666667, ans=0.2 2023-10-03 04:14:44,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:46,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:48,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:14:48,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:14:48,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:14:49,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 04:14:49,437 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 04:14:50,693 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.866e+02 1.991e+02 2.230e+02 3.082e+02, threshold=3.982e+02, percent-clipped=0.0 2023-10-03 04:14:50,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:14:52,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:14:53,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:14:53,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:14:53,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 04:14:53,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 04:14:53,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:14:53,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:14:55,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:14:57,221 INFO [train.py:1046] (3/4) Epoch 33, batch 150, loss[loss=0.1502, simple_loss=0.2266, pruned_loss=0.03686, over 19937.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2439, pruned_loss=0.04218, over 2516788.41 frames. ], batch size: 43, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:14:57,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:14:57,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:14:57,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:14:59,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:03,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:15:03,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:15:03,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:06,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:15:07,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:07,958 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:15:09,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:15:10,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:13,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 04:15:13,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 04:15:13,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 04:15:16,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1134320.0, ans=0.1 2023-10-03 04:15:17,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:15:17,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:15:17,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:15:18,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:15:18,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:15:20,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:21,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:15:24,572 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 04:15:24,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:15:25,578 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.21 vs. limit=22.5 2023-10-03 04:15:30,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:15:34,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1134386.6666666667, ans=0.125 2023-10-03 04:15:35,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:15:36,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 04:15:40,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:15:40,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:15:40,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:15:42,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:15:43,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:15:45,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:15:46,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:46,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 04:15:46,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1134453.3333333333, ans=0.0 2023-10-03 04:15:52,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:52,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:15:52,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1134453.3333333333, ans=0.0 2023-10-03 04:15:54,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:15:54,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:15:54,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1134520.0, ans=0.0 2023-10-03 04:15:57,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:15:58,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 04:16:00,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:16:01,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:16:02,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:16:03,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:16:04,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 04:16:04,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:16:04,868 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 04:16:07,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:16:10,291 INFO [train.py:1046] (3/4) Epoch 33, batch 200, loss[loss=0.1649, simple_loss=0.249, pruned_loss=0.04037, over 24439.00 frames. ], tot_loss[loss=0.1645, simple_loss=0.2442, pruned_loss=0.04238, over 3010580.12 frames. ], batch size: 66, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:16:10,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:16:10,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:16:13,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 04:16:14,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:16:14,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1134586.6666666667, ans=0.035 2023-10-03 04:16:15,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:19,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 04:16:21,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:16:23,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:23,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1134653.3333333333, ans=0.125 2023-10-03 04:16:25,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:16:28,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:16:28,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:16:28,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:46,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:16:47,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:16:47,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:16:49,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:16:50,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 04:16:50,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:16:53,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:16:53,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1134786.6666666667, ans=0.1 2023-10-03 04:16:54,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:16:56,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:16:56,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:16:58,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 04:16:58,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 04:16:58,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:16:59,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1134786.6666666667, ans=0.125 2023-10-03 04:17:02,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:17:06,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:17:10,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1134853.3333333333, ans=0.0 2023-10-03 04:17:12,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:14,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:17:15,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1134853.3333333333, ans=0.125 2023-10-03 04:17:17,482 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.10 vs. limit=22.5 2023-10-03 04:17:18,228 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.915e+02 2.133e+02 2.434e+02 3.393e+02, threshold=4.265e+02, percent-clipped=0.0 2023-10-03 04:17:19,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:23,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 04:17:23,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:17:23,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:17:23,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:17:24,355 INFO [train.py:1046] (3/4) Epoch 33, batch 250, loss[loss=0.1553, simple_loss=0.2246, pruned_loss=0.04301, over 23797.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.242, pruned_loss=0.04189, over 3392904.89 frames. ], batch size: 195, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:17:26,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:17:26,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 04:17:27,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:17:27,658 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 04:17:29,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:29,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:17:29,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:30,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:17:32,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:17:33,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:17:34,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:17:37,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:17:37,876 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.40 vs. limit=15.0 2023-10-03 04:17:48,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1134986.6666666667, ans=0.04949747468305833 2023-10-03 04:17:51,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:17:53,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:17:54,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:17:54,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1135053.3333333333, ans=0.125 2023-10-03 04:17:54,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1135053.3333333333, ans=0.125 2023-10-03 04:17:59,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:18:00,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:18:01,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:18:02,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:18:03,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:18:03,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:18:04,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:18:04,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:18:07,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 04:18:07,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:18:09,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:18:10,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:18:10,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:18:12,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:18:13,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:18:13,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:18:15,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:18:15,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:18:17,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:18:21,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:18:23,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:18:23,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1135186.6666666667, ans=0.025 2023-10-03 04:18:25,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:18:26,028 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=8.771e-02 2023-10-03 04:18:32,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:18:32,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:18:36,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 04:18:37,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:18:37,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:18:38,776 INFO [train.py:1046] (3/4) Epoch 33, batch 300, loss[loss=0.1738, simple_loss=0.2388, pruned_loss=0.05439, over 23819.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2398, pruned_loss=0.0415, over 3682774.44 frames. ], batch size: 164, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:18:38,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 04:18:38,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:18:42,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:18:42,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 04:18:46,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:18:46,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1135253.3333333333, ans=0.0 2023-10-03 04:18:48,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:18:49,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff3.min_abs, batch_count=1135253.3333333333, ans=0.2 2023-10-03 04:18:51,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:18:52,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 04:18:52,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:18:55,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:18:55,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 04:18:55,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:19:00,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:19:04,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:19:04,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 04:19:04,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1135320.0, ans=0.1 2023-10-03 04:19:06,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1135320.0, ans=0.04949747468305833 2023-10-03 04:19:06,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1135320.0, ans=0.125 2023-10-03 04:19:08,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 04:19:08,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:10,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:19:12,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:12,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 04:19:13,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:19:14,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:19:14,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:19:16,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:19:19,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:19:19,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 04:19:20,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:19:22,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:25,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 04:19:26,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:19:30,491 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.45 vs. limit=15.0 2023-10-03 04:19:31,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:19:31,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1135453.3333333333, ans=0.125 2023-10-03 04:19:33,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:19:33,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 04:19:37,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:37,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:19:40,786 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.74 vs. limit=15.0 2023-10-03 04:19:41,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:42,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:19:42,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 04:19:44,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:19:44,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:19:45,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 04:19:46,841 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.972e+02 2.281e+02 2.681e+02 3.966e+02, threshold=4.562e+02, percent-clipped=0.0 2023-10-03 04:19:46,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:19:47,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:19:49,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:19:49,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:19:51,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:19:52,674 INFO [train.py:1046] (3/4) Epoch 33, batch 350, loss[loss=0.1563, simple_loss=0.2472, pruned_loss=0.03265, over 24570.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2387, pruned_loss=0.04143, over 3914361.11 frames. ], batch size: 71, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:19:55,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:19:55,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 04:19:58,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:19:58,527 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:20:03,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:20:04,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:04,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:07,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 04:20:09,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:20:09,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 04:20:12,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:12,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 04:20:13,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:20:17,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 04:20:18,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:20:22,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:20:22,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:20:22,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:20:23,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:20:23,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:20:25,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:25,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:20:26,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:20:26,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:26,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1135720.0, ans=0.125 2023-10-03 04:20:34,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:20:34,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:20:35,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:20:35,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:40,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 04:20:40,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:20:42,443 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.49 vs. limit=22.5 2023-10-03 04:20:44,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:20:44,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:20:44,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:20:47,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 04:20:50,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:20:50,231 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 04:20:51,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 04:20:51,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:20:53,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:20:53,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 04:20:53,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1135853.3333333333, ans=0.1 2023-10-03 04:20:58,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:20:59,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:21:00,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:21:02,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:21:02,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:21:05,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:21:06,773 INFO [train.py:1046] (3/4) Epoch 33, batch 400, loss[loss=0.1627, simple_loss=0.252, pruned_loss=0.0367, over 24513.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2383, pruned_loss=0.04149, over 4077570.87 frames. ], batch size: 66, lr: 3.11e-03, grad_scale: 32.0 2023-10-03 04:21:08,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:21:09,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1135920.0, ans=0.125 2023-10-03 04:21:11,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:21:11,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 04:21:11,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:21:11,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:21:13,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:21:13,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:15,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:21:17,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:17,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 04:21:18,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 04:21:18,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:21:20,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 04:21:20,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:25,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:21:25,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:21:25,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 04:21:26,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:21:26,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:21:26,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:21:28,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:21:29,290 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 04:21:30,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 04:21:32,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=1135986.6666666667, ans=0.1 2023-10-03 04:21:34,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:21:34,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1135986.6666666667, ans=0.1 2023-10-03 04:21:35,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:21:37,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 04:21:38,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 04:21:40,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:21:42,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:21:51,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 04:21:54,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:21:55,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 04:21:57,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1136120.0, ans=0.2 2023-10-03 04:21:58,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:21:59,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1136120.0, ans=0.0 2023-10-03 04:22:00,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:22:00,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 04:22:04,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:22:07,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:22:09,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:22:09,520 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:22:11,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:22:12,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 04:22:12,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1136186.6666666667, ans=0.125 2023-10-03 04:22:14,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:22:14,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1136186.6666666667, ans=0.125 2023-10-03 04:22:15,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 04:22:16,716 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.810e+02 1.929e+02 2.053e+02 2.839e+02, threshold=3.858e+02, percent-clipped=0.0 2023-10-03 04:22:16,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:22:16,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:22:19,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 04:22:20,938 INFO [train.py:1046] (3/4) Epoch 33, batch 450, loss[loss=0.1598, simple_loss=0.2422, pruned_loss=0.03868, over 23190.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2403, pruned_loss=0.04172, over 4221943.05 frames. ], batch size: 105, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:22:21,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:22:22,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:22:22,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:22:23,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 04:22:23,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:22:25,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:22:25,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:22:25,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 04:22:25,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:22:25,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.15 vs. limit=15.0 2023-10-03 04:22:26,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1136253.3333333333, ans=0.125 2023-10-03 04:22:27,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:22:29,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:22:39,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:22:39,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:22:42,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 04:22:43,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 04:22:46,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:22:49,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:22:50,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:22:53,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:22:53,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:22:56,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 04:22:57,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 04:22:58,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 04:22:59,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:22:59,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:23:01,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:23:02,820 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 04:23:02,829 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 04:23:04,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:23:05,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:23:07,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 04:23:10,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:23:10,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:23:10,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 04:23:12,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 04:23:12,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1136453.3333333333, ans=0.125 2023-10-03 04:23:15,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:23:17,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:23:17,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:23:17,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1136453.3333333333, ans=0.125 2023-10-03 04:23:19,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 04:23:21,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:23:23,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 04:23:24,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 04:23:26,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:23:30,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:23:31,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:23:31,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1136520.0, ans=0.1 2023-10-03 04:23:33,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1136586.6666666667, ans=0.0 2023-10-03 04:23:35,414 INFO [train.py:1046] (3/4) Epoch 33, batch 500, loss[loss=0.1655, simple_loss=0.2349, pruned_loss=0.04799, over 23905.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.241, pruned_loss=0.04178, over 4332502.03 frames. ], batch size: 150, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:23:35,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:23:35,511 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 04:23:38,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:23:39,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:23:41,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:23:41,243 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 04:23:42,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=1136586.6666666667, ans=22.5 2023-10-03 04:23:43,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 04:23:43,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:23:46,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:23:50,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 04:23:51,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:23:53,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:23:53,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:23:53,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:23:55,578 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.88 vs. limit=15.0 2023-10-03 04:24:02,632 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.81 vs. limit=22.5 2023-10-03 04:24:03,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:03,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:24:04,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:24:04,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:04,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 04:24:04,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:24:09,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:24:09,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:24:11,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:24:11,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:11,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 04:24:15,613 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 04:24:17,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:24:19,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:20,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:20,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:21,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:24:23,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 04:24:24,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1136786.6666666667, ans=0.0 2023-10-03 04:24:26,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:24:27,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:24:31,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:24:34,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:24:39,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:24:42,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 04:24:42,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:24:42,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:24:45,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 04:24:46,562 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.881e+02 2.075e+02 2.361e+02 3.441e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-03 04:24:46,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:24:48,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:24:50,008 INFO [train.py:1046] (3/4) Epoch 33, batch 550, loss[loss=0.1623, simple_loss=0.2489, pruned_loss=0.03788, over 24425.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2416, pruned_loss=0.04157, over 4418465.19 frames. ], batch size: 69, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:24:52,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 04:24:55,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 04:24:55,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:24:55,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 04:24:57,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:24:57,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:24:58,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:58,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:24:58,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:24:59,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:25:01,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:25:02,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 04:25:04,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:25:06,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:06,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:25:07,138 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:25:10,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:25:10,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:25:10,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1136986.6666666667, ans=0.1 2023-10-03 04:25:14,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 04:25:14,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 04:25:16,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1136986.6666666667, ans=0.125 2023-10-03 04:25:18,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:25:21,282 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.08 vs. limit=22.5 2023-10-03 04:25:22,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:25:22,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:25:23,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:25:26,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:26,501 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 04:25:26,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:25:27,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 04:25:31,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:25:31,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:25:31,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:25:33,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:34,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 04:25:36,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 04:25:36,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1137120.0, ans=0.125 2023-10-03 04:25:37,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:25:37,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:25:38,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:25:38,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:25:43,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:25:45,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:25:46,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:25:46,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:48,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 04:25:49,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:25:51,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:25:52,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:25:53,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:25:54,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:25:55,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 04:25:59,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 04:26:02,186 INFO [train.py:1046] (3/4) Epoch 33, batch 600, loss[loss=0.1607, simple_loss=0.2257, pruned_loss=0.04787, over 23850.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2421, pruned_loss=0.04226, over 4466469.08 frames. ], batch size: 179, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:26:03,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 04:26:06,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:26:06,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:26:06,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:26:08,657 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.43 vs. limit=22.5 2023-10-03 04:26:11,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1137253.3333333333, ans=0.0 2023-10-03 04:26:13,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:26:14,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:26:17,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 04:26:19,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:26:20,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:26:22,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:26:23,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 04:26:23,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:26:26,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1137320.0, ans=0.125 2023-10-03 04:26:29,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 04:26:32,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:26:32,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:26:33,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:26:33,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1137386.6666666667, ans=0.1 2023-10-03 04:26:37,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:26:37,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:26:39,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:26:43,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1137386.6666666667, ans=0.0 2023-10-03 04:26:46,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1137453.3333333333, ans=0.125 2023-10-03 04:26:47,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:26:52,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:26:52,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:26:52,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:26:57,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 04:27:02,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:27:03,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:27:06,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 04:27:06,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:27:08,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 04:27:11,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:27:11,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:27:14,270 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.854e+02 2.113e+02 2.384e+02 3.554e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 04:27:14,678 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:27:17,647 INFO [train.py:1046] (3/4) Epoch 33, batch 650, loss[loss=0.1732, simple_loss=0.2631, pruned_loss=0.04161, over 24375.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2412, pruned_loss=0.04258, over 4500081.06 frames. ], batch size: 77, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:27:17,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 04:27:20,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:27:21,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:27:23,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:27:24,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:27,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 04:27:28,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:27:34,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:27:34,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:27:37,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:27:40,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 04:27:42,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:27:42,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:27:46,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:27:46,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 04:27:48,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:27:48,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:48,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:27:49,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:52,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:27:55,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:27:55,266 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 04:27:55,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:27:55,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:27:59,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:27:59,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:27:59,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:28:01,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:28:02,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 04:28:02,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:28:03,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:28:03,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:28:03,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:28:06,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:28:06,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 04:28:08,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 04:28:08,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:09,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:28:09,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:28:09,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:28:11,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:28:17,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:19,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:28:21,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:28:22,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:28:22,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:28:24,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:28:27,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1137853.3333333333, ans=0.2 2023-10-03 04:28:31,199 INFO [train.py:1046] (3/4) Epoch 33, batch 700, loss[loss=0.1733, simple_loss=0.2602, pruned_loss=0.04325, over 23972.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.24, pruned_loss=0.0421, over 4547957.66 frames. ], batch size: 80, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:28:31,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:28:31,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:28:31,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:28:31,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:28:34,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1137920.0, ans=0.125 2023-10-03 04:28:34,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1137920.0, ans=0.125 2023-10-03 04:28:36,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 04:28:36,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 04:28:38,983 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.33 vs. limit=6.0 2023-10-03 04:28:40,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 04:28:41,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:43,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1137920.0, ans=0.125 2023-10-03 04:28:44,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:28:44,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 04:28:49,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:28:52,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:28:54,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:55,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:28:55,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:28:58,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:28:59,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 04:28:59,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:29:02,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 04:29:04,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1138053.3333333333, ans=0.125 2023-10-03 04:29:05,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 04:29:08,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:29:08,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:29:10,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:29:14,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:29:16,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 04:29:19,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:29:19,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:29:19,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 04:29:23,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:29:25,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:29:28,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:29:31,991 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.52 vs. limit=22.5 2023-10-03 04:29:33,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:29:33,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 04:29:36,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1138186.6666666667, ans=0.1 2023-10-03 04:29:37,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 04:29:37,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 04:29:41,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:29:41,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:29:43,032 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.966e+02 2.295e+02 2.647e+02 3.706e+02, threshold=4.591e+02, percent-clipped=0.0 2023-10-03 04:29:43,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:29:44,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:29:45,851 INFO [train.py:1046] (3/4) Epoch 33, batch 750, loss[loss=0.1668, simple_loss=0.2539, pruned_loss=0.03988, over 23753.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2391, pruned_loss=0.04131, over 4587586.77 frames. ], batch size: 85, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:29:45,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 04:29:49,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 04:29:49,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 04:29:49,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 04:29:50,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 04:29:50,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 04:29:52,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:29:53,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 04:29:53,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:29:55,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:29:56,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:29:58,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:29:58,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:29:59,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:30:02,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:30:02,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:30:05,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:30:05,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1138320.0, ans=0.025 2023-10-03 04:30:06,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:30:07,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:30:07,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 04:30:08,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:30:11,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:30:12,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:30:13,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:30:14,594 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.58 vs. limit=15.0 2023-10-03 04:30:15,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 04:30:15,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:30:18,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 04:30:18,454 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 04:30:19,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 04:30:19,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:30:19,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 04:30:21,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:30:23,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1138386.6666666667, ans=0.125 2023-10-03 04:30:29,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:30:29,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:30:29,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:30:31,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:30:33,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:30:33,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 04:30:34,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:30:34,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 04:30:34,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:30:38,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:30:39,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 04:30:40,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:30:43,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:30:46,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:30:46,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:30:49,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:30:49,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1138520.0, ans=0.1 2023-10-03 04:30:51,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1138520.0, ans=0.0 2023-10-03 04:30:52,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 04:30:52,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:30:54,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:30:56,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:30:56,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:31:00,095 INFO [train.py:1046] (3/4) Epoch 33, batch 800, loss[loss=0.1841, simple_loss=0.2557, pruned_loss=0.05623, over 23837.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2397, pruned_loss=0.04117, over 4621590.74 frames. ], batch size: 164, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:31:00,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:31:00,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:31:05,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:31:05,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:07,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:31:07,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:31:10,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:10,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:11,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:14,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:31:16,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:31:19,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 04:31:19,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:20,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:31:20,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:31:20,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:31:21,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1138653.3333333333, ans=0.125 2023-10-03 04:31:22,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 04:31:22,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:31:22,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 04:31:25,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:26,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:31:29,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:31:29,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:31:31,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:32,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:36,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:31:37,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:31:37,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 04:31:39,012 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 04:31:39,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 04:31:39,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:31:40,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:31:41,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:31:41,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:31:42,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=14.49 vs. limit=15.0 2023-10-03 04:31:46,584 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 04:31:47,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 04:31:49,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:31:49,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1138786.6666666667, ans=0.2 2023-10-03 04:31:50,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:31:54,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:31:58,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:31:58,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1138853.3333333333, ans=0.125 2023-10-03 04:31:59,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 04:32:00,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:32:04,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 04:32:07,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1138853.3333333333, ans=0.0 2023-10-03 04:32:10,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:32:11,447 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.861e+02 2.064e+02 2.320e+02 3.416e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-03 04:32:12,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:32:12,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 04:32:14,289 INFO [train.py:1046] (3/4) Epoch 33, batch 850, loss[loss=0.1471, simple_loss=0.2395, pruned_loss=0.02738, over 24445.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2404, pruned_loss=0.04152, over 4640270.68 frames. ], batch size: 69, lr: 3.11e-03, grad_scale: 16.0 2023-10-03 04:32:14,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:32:14,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:32:15,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 04:32:15,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:32:17,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:32:17,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:17,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1138920.0, ans=0.125 2023-10-03 04:32:19,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:32:20,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:32:21,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 04:32:22,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 04:32:22,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 04:32:23,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:32:23,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:32:25,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:26,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:32:26,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:32:28,372 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 04:32:30,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:32:30,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:32:30,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 04:32:34,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 04:32:37,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:32:38,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 04:32:41,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 04:32:41,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 04:32:44,328 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 04:32:44,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:32:46,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:32:46,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 04:32:49,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:51,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:32:51,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 04:32:53,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:32:55,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:32:55,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:32:55,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:32:55,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1139053.3333333333, ans=0.125 2023-10-03 04:32:56,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:32:57,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 04:32:58,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 04:33:02,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:33:02,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:33:02,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:33:04,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:33:04,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:33:06,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:33:06,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:33:08,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:33:08,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:33:10,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:33:18,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:33:22,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:33:22,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 04:33:22,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:33:22,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:33:24,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 04:33:28,983 INFO [train.py:1046] (3/4) Epoch 33, batch 900, loss[loss=0.159, simple_loss=0.2527, pruned_loss=0.0326, over 24632.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2417, pruned_loss=0.04223, over 4651737.11 frames. ], batch size: 68, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:33:29,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:33:33,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:33:33,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 04:33:36,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:33:36,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 04:33:37,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 04:33:37,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:33:38,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:33:39,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:33:39,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:33:46,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1139320.0, ans=0.1 2023-10-03 04:33:47,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:33:47,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:33:48,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:33:49,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1139320.0, ans=0.125 2023-10-03 04:33:52,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:33:52,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1139320.0, ans=0.1 2023-10-03 04:33:56,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 04:33:57,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1139386.6666666667, ans=0.2 2023-10-03 04:33:58,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:34:00,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1139386.6666666667, ans=0.125 2023-10-03 04:34:06,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:34:06,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:34:07,749 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 04:34:07,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 04:34:07,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1139386.6666666667, ans=0.0 2023-10-03 04:34:09,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1139386.6666666667, ans=0.0 2023-10-03 04:34:12,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:34:12,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:34:14,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:34:20,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:34:20,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:34:21,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 04:34:21,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:34:23,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1139453.3333333333, ans=0.0 2023-10-03 04:34:25,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 04:34:27,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:34:27,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:34:30,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:34:30,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:34:34,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 04:34:34,679 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 04:34:37,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 04:34:37,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 04:34:40,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:34:43,677 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.900e+02 2.085e+02 2.299e+02 3.582e+02, threshold=4.170e+02, percent-clipped=0.0 2023-10-03 04:34:43,705 INFO [train.py:1046] (3/4) Epoch 33, batch 950, loss[loss=0.1682, simple_loss=0.2416, pruned_loss=0.04741, over 23906.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.242, pruned_loss=0.04189, over 4675544.03 frames. ], batch size: 196, lr: 3.11e-03, grad_scale: 4.0 2023-10-03 04:34:45,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 04:34:50,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:34:52,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:34:54,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:34:54,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:34:55,807 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 04:35:00,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:35:01,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:35:01,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:35:03,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:35:03,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 04:35:03,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:35:04,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:06,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 04:35:07,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:35:08,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.33 vs. limit=15.0 2023-10-03 04:35:12,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:12,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:35:12,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:35:12,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1139720.0, ans=0.0 2023-10-03 04:35:13,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 04:35:16,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 04:35:17,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:35:19,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:35:24,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:35:24,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:35:26,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 04:35:28,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 04:35:28,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:35:30,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:35:30,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:30,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:35:34,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 04:35:34,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:35:38,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:35:38,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:38,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 04:35:38,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:35:38,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:35:39,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 04:35:43,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:35:44,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:35:46,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1139853.3333333333, ans=0.0 2023-10-03 04:35:50,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:35:50,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 04:35:51,303 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.11 vs. limit=12.0 2023-10-03 04:35:52,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 04:35:55,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:35:58,278 INFO [train.py:1046] (3/4) Epoch 33, batch 1000, loss[loss=0.1593, simple_loss=0.2376, pruned_loss=0.04043, over 23475.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.241, pruned_loss=0.04175, over 4688696.30 frames. ], batch size: 105, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:36:00,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 04:36:01,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:02,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-10-03 04:36:05,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:36:07,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 04:36:07,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 04:36:11,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:36:11,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:36:14,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:36:17,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 04:36:19,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 04:36:22,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 04:36:22,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:36:24,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 04:36:24,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1139986.6666666667, ans=0.125 2023-10-03 04:36:25,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 04:36:25,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 04:36:28,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:36:28,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:37,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:36:37,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:36:38,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:38,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:36:40,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 04:36:40,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:36:41,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:36:41,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:36:43,005 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 04:36:43,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1140120.0, ans=0.07 2023-10-03 04:36:45,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 04:36:47,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 04:36:49,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 04:36:52,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:36:52,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.32 vs. limit=15.0 2023-10-03 04:36:58,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:36:58,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:36:59,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:37:00,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:37:01,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 04:37:03,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:37:03,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 04:37:04,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 04:37:04,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1140186.6666666667, ans=0.0 2023-10-03 04:37:05,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:37:05,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:37:08,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:37:10,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:37:12,726 INFO [train.py:1046] (3/4) Epoch 33, batch 1050, loss[loss=0.1531, simple_loss=0.2258, pruned_loss=0.04018, over 23408.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2393, pruned_loss=0.04159, over 4697238.54 frames. ], batch size: 285, lr: 3.11e-03, grad_scale: 4.0 2023-10-03 04:37:12,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:37:14,199 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.886e+02 2.130e+02 2.502e+02 4.211e+02, threshold=4.261e+02, percent-clipped=1.0 2023-10-03 04:37:14,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:37:15,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:37:19,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:37:19,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:37:20,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:37:23,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:37:25,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:37:26,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1140320.0, ans=0.2 2023-10-03 04:37:28,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:37:28,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:37:28,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:37:29,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:37:29,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 04:37:31,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:37:31,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 04:37:33,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:37:33,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 04:37:33,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:37:38,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:37:39,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:37:39,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:37:42,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 04:37:42,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 04:37:42,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:37:46,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 04:37:50,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 04:37:52,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:37:55,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 04:37:58,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 04:37:58,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:37:58,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:38:02,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:38:05,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 04:38:05,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1140453.3333333333, ans=0.125 2023-10-03 04:38:08,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 04:38:08,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 04:38:08,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:38:08,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:38:11,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 04:38:15,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:38:18,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:38:18,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:38:19,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:38:19,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:38:22,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:38:22,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 04:38:24,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:38:24,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 04:38:25,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 04:38:26,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:38:27,224 INFO [train.py:1046] (3/4) Epoch 33, batch 1100, loss[loss=0.1763, simple_loss=0.2487, pruned_loss=0.05192, over 23775.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2384, pruned_loss=0.04121, over 4691314.91 frames. ], batch size: 179, lr: 3.11e-03, grad_scale: 8.0 2023-10-03 04:38:30,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:38:34,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:38:35,984 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.53 vs. limit=22.5 2023-10-03 04:38:39,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:38:40,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:38:40,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:38:41,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 04:38:43,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:38:44,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 04:38:47,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:38:50,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:38:50,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 04:38:51,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 04:38:52,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn1.whiten.whitening_limit, batch_count=1140653.3333333333, ans=22.5 2023-10-03 04:38:53,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:38:53,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:38:56,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:38:57,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 04:38:57,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1140720.0, ans=0.125 2023-10-03 04:39:03,332 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.37 vs. limit=6.0 2023-10-03 04:39:04,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:39:07,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 04:39:07,674 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 04:39:07,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:09,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:10,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:39:10,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:39:11,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 04:39:13,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:39:13,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:39:13,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:39:14,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:14,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 04:39:16,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1140786.6666666667, ans=0.125 2023-10-03 04:39:19,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1140786.6666666667, ans=0.1 2023-10-03 04:39:20,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:39:20,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 04:39:23,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:39:27,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:39:28,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 04:39:28,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 04:39:30,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:39:33,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:39:33,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:39:37,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 04:39:38,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:39:38,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:39:39,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 04:39:39,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:39:39,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 04:39:41,277 INFO [train.py:1046] (3/4) Epoch 33, batch 1150, loss[loss=0.157, simple_loss=0.2468, pruned_loss=0.03355, over 24665.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2394, pruned_loss=0.04156, over 4689620.59 frames. ], batch size: 65, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:39:41,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:39:41,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:39:41,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:39:42,685 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.826e+02 2.024e+02 2.251e+02 4.261e+02, threshold=4.048e+02, percent-clipped=0.0 2023-10-03 04:39:45,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:39:46,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:39:48,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:39:48,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:39:49,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 04:39:49,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:39:54,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 04:39:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:39:54,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:39:58,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1140986.6666666667, ans=0.0 2023-10-03 04:40:00,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 04:40:03,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:40:04,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1140986.6666666667, ans=0.125 2023-10-03 04:40:07,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:40:07,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:40:07,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 04:40:07,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:40:09,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:40:10,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 04:40:11,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:40:13,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:40:19,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1141053.3333333333, ans=0.0 2023-10-03 04:40:20,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:40:20,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1141053.3333333333, ans=0.125 2023-10-03 04:40:27,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:40:27,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 04:40:29,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:40:31,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:40:34,896 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 04:40:36,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:40:43,118 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 04:40:47,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:40:48,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:40:48,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:40:48,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:40:51,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:40:51,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1141186.6666666667, ans=0.0 2023-10-03 04:40:54,715 INFO [train.py:1046] (3/4) Epoch 33, batch 1200, loss[loss=0.1488, simple_loss=0.2272, pruned_loss=0.03524, over 21214.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2395, pruned_loss=0.04177, over 4672712.43 frames. ], batch size: 46, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:40:56,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:40:56,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:40:57,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:40:57,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:40:57,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:40:59,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:41:01,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:41:04,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:41:04,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:41:04,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1141253.3333333333, ans=0.0 2023-10-03 04:41:06,660 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 04:41:09,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 04:41:13,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:41:15,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:41:17,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:41:18,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1141320.0, ans=0.125 2023-10-03 04:41:19,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1141320.0, ans=0.125 2023-10-03 04:41:21,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:41:21,754 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 04:41:21,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:41:29,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 04:41:29,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:41:29,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1141386.6666666667, ans=0.2 2023-10-03 04:41:30,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 04:41:30,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:41:34,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 04:41:39,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 04:41:39,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:41:41,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:41:42,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:41:42,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:41:44,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:41:44,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:41:45,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:41:46,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 04:41:46,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:41:46,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:41:48,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:41:50,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:41:50,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:41:54,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:41:56,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1141520.0, ans=0.0 2023-10-03 04:41:58,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:41:58,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1141520.0, ans=0.0 2023-10-03 04:42:00,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 04:42:03,673 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 04:42:06,913 INFO [train.py:1046] (3/4) Epoch 33, batch 1250, loss[loss=0.1706, simple_loss=0.2512, pruned_loss=0.04501, over 24021.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2407, pruned_loss=0.04203, over 4679908.43 frames. ], batch size: 80, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:42:06,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:42:08,774 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.967e+02 2.213e+02 2.630e+02 3.265e+02, threshold=4.425e+02, percent-clipped=0.0 2023-10-03 04:42:08,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:42:09,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1141586.6666666667, ans=0.04949747468305833 2023-10-03 04:42:11,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:42:12,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:42:14,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 04:42:17,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:42:19,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:42:19,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 04:42:20,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:42:21,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:42:26,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 04:42:27,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:42:27,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:42:27,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:42:30,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:42:30,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1141653.3333333333, ans=0.1 2023-10-03 04:42:33,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 04:42:33,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:42:33,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:42:36,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:42:36,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:42:38,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:42:40,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:42:44,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1141720.0, ans=0.125 2023-10-03 04:42:45,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 04:42:46,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:42:48,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:42:49,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 04:42:49,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:42:51,176 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 04:42:51,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:42:51,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:42:53,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:42:57,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:42:58,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:42:59,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 04:42:59,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 04:42:59,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 04:43:05,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:43:05,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 04:43:05,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:43:08,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 04:43:08,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:43:10,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 04:43:10,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 04:43:10,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:43:12,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 04:43:14,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:43:15,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 04:43:16,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:43:18,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:43:18,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:43:18,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1141853.3333333333, ans=0.0 2023-10-03 04:43:19,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 04:43:21,094 INFO [train.py:1046] (3/4) Epoch 33, batch 1300, loss[loss=0.1617, simple_loss=0.2387, pruned_loss=0.04231, over 24368.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2409, pruned_loss=0.04197, over 4685828.81 frames. ], batch size: 77, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:43:22,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:43:22,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 04:43:28,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:43:28,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1141920.0, ans=0.07 2023-10-03 04:43:28,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1141920.0, ans=0.1 2023-10-03 04:43:29,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 04:43:31,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:43:32,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:43:34,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:43:34,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 04:43:36,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1141986.6666666667, ans=0.1 2023-10-03 04:43:38,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1141986.6666666667, ans=0.1 2023-10-03 04:43:40,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:43:42,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:43:44,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 04:43:46,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:43:50,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:43:52,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:43:53,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:43:54,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:43:56,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:43:56,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 04:43:56,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 04:44:01,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:44:01,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:44:02,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 04:44:02,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 04:44:03,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:44:07,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:44:07,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 04:44:08,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:44:08,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 04:44:11,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:44:15,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:44:15,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:44:19,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 04:44:20,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 04:44:20,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 04:44:25,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:44:27,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 04:44:29,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:44:35,328 INFO [train.py:1046] (3/4) Epoch 33, batch 1350, loss[loss=0.1607, simple_loss=0.2264, pruned_loss=0.04753, over 23567.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2399, pruned_loss=0.04179, over 4692319.66 frames. ], batch size: 256, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:44:35,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 04:44:36,746 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.912e+02 2.066e+02 2.352e+02 3.364e+02, threshold=4.132e+02, percent-clipped=0.0 2023-10-03 04:44:38,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:44:43,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:44:43,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1142253.3333333333, ans=0.125 2023-10-03 04:44:46,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:44:46,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:44:48,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:44:48,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:44:52,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:44:53,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 04:44:54,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:44:54,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:44:57,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 04:44:59,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:45:00,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:45:00,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 04:45:02,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 04:45:03,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 04:45:04,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:45:04,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 04:45:17,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:45:27,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:45:27,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:45:27,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 04:45:31,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:45:32,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 04:45:32,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 04:45:33,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:45:35,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:45:36,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1142520.0, ans=0.0 2023-10-03 04:45:38,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 04:45:39,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:45:43,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 04:45:45,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 04:45:49,690 INFO [train.py:1046] (3/4) Epoch 33, batch 1400, loss[loss=0.134, simple_loss=0.1851, pruned_loss=0.04144, over 18751.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2382, pruned_loss=0.04144, over 4680096.82 frames. ], batch size: 388, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:45:49,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 04:45:51,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1142586.6666666667, ans=0.09899494936611666 2023-10-03 04:45:52,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:45:53,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:45:55,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:45:59,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 04:46:01,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 04:46:10,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:46:13,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:46:13,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:46:15,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 04:46:19,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:46:22,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 04:46:27,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1142720.0, ans=22.5 2023-10-03 04:46:31,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:46:31,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:46:33,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 04:46:35,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:46:35,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:46:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:46:37,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:46:38,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:46:38,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:46:40,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:46:41,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 04:46:41,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1142786.6666666667, ans=0.0 2023-10-03 04:46:43,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:46:45,306 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.75 vs. limit=15.0 2023-10-03 04:46:47,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:46:50,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:46:57,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 04:46:57,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 04:46:59,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:47:01,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 04:47:03,146 INFO [train.py:1046] (3/4) Epoch 33, batch 1450, loss[loss=0.1463, simple_loss=0.2238, pruned_loss=0.03436, over 24434.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2381, pruned_loss=0.04118, over 4688801.60 frames. ], batch size: 58, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:47:03,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:47:04,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:47:05,801 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.841e+02 1.972e+02 2.239e+02 2.970e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-03 04:47:07,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:47:09,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:47:09,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:09,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 04:47:11,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1142920.0, ans=0.125 2023-10-03 04:47:13,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:47:13,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 04:47:15,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1142920.0, ans=0.0 2023-10-03 04:47:16,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:47:16,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 04:47:17,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 04:47:18,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 04:47:19,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:19,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:19,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 04:47:21,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:47:22,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:47:22,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 04:47:22,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:24,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:47:25,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:28,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:31,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:47:31,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:47:32,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:47:32,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:32,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1143053.3333333333, ans=0.0 2023-10-03 04:47:34,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:47:34,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:47:35,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:47:35,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:47:41,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 04:47:43,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:47:46,193 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 04:47:46,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1143120.0, ans=0.2 2023-10-03 04:47:47,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:47:48,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:47:50,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:47:52,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 04:47:56,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:47:57,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 04:47:59,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 04:47:59,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:48:00,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1143186.6666666667, ans=0.0 2023-10-03 04:48:03,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:48:03,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:48:05,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 04:48:07,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 04:48:07,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 04:48:09,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1143186.6666666667, ans=0.125 2023-10-03 04:48:11,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:48:12,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 04:48:17,073 INFO [train.py:1046] (3/4) Epoch 33, batch 1500, loss[loss=0.1759, simple_loss=0.253, pruned_loss=0.04941, over 23324.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2392, pruned_loss=0.04159, over 4695059.27 frames. ], batch size: 105, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:48:22,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 04:48:24,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:48:24,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:48:25,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:48:25,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:48:27,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:48:27,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 04:48:29,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 04:48:30,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:48:30,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:48:31,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:48:33,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:48:33,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:48:39,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:48:39,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 04:48:39,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:48:39,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:48:40,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:48:43,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 04:48:47,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 04:48:48,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:48:49,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 04:48:51,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:48:52,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:48:52,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:48:52,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:48:55,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 04:48:55,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:48:55,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1143386.6666666667, ans=0.125 2023-10-03 04:48:57,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:48:57,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 04:48:57,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:49:01,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:49:01,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 04:49:06,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 04:49:06,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:49:11,416 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 04:49:11,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:11,479 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 04:49:12,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:49:14,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1143453.3333333333, ans=0.1 2023-10-03 04:49:15,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:49:15,627 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 04:49:17,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 04:49:19,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 04:49:20,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1143520.0, ans=0.125 2023-10-03 04:49:21,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:24,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:49:24,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:25,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:49:25,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:49:27,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:49:28,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 04:49:29,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 04:49:30,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:49:30,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 04:49:31,407 INFO [train.py:1046] (3/4) Epoch 33, batch 1550, loss[loss=0.143, simple_loss=0.2236, pruned_loss=0.03122, over 24624.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2405, pruned_loss=0.04192, over 4702453.12 frames. ], batch size: 60, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:49:31,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 04:49:34,159 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 1.973e+02 2.319e+02 2.706e+02 3.781e+02, threshold=4.639e+02, percent-clipped=0.0 2023-10-03 04:49:34,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:49:34,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1143586.6666666667, ans=0.025 2023-10-03 04:49:35,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:49:35,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1143586.6666666667, ans=0.125 2023-10-03 04:49:36,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:49:36,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:49:38,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:49:38,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:49:42,954 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 04:49:42,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:49:44,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:49:44,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 04:49:45,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1143653.3333333333, ans=10.0 2023-10-03 04:49:47,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 04:49:48,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 04:49:49,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:49:49,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 04:49:51,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 04:49:51,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 04:49:51,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:49:51,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:49:54,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:49:57,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 04:49:57,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 04:49:57,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1143653.3333333333, ans=0.0 2023-10-03 04:49:59,559 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.58 vs. limit=12.0 2023-10-03 04:50:03,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:50:07,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:50:07,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 04:50:07,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:50:08,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 04:50:09,274 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.99 vs. limit=15.0 2023-10-03 04:50:15,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 04:50:15,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:50:18,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:50:18,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.whiten.whitening_limit, batch_count=1143786.6666666667, ans=12.0 2023-10-03 04:50:19,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:50:19,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:50:19,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 04:50:19,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:50:22,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:50:22,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:50:22,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 04:50:22,531 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 04:50:25,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:50:31,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 04:50:36,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:50:38,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:50:38,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 04:50:38,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1143853.3333333333, ans=0.0 2023-10-03 04:50:41,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:50:41,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:50:41,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:50:41,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:50:42,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:50:45,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:50:45,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 04:50:46,766 INFO [train.py:1046] (3/4) Epoch 33, batch 1600, loss[loss=0.1842, simple_loss=0.2604, pruned_loss=0.05399, over 23741.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2413, pruned_loss=0.04232, over 4704950.21 frames. ], batch size: 212, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:50:46,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 04:50:48,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 04:50:49,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1143920.0, ans=0.125 2023-10-03 04:50:50,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:50:52,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 04:50:52,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:50:54,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:50:59,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:51:03,243 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.93 vs. limit=15.0 2023-10-03 04:51:04,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 04:51:04,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1143986.6666666667, ans=0.125 2023-10-03 04:51:06,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:51:08,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 04:51:08,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:10,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 04:51:14,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 04:51:21,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:51:21,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 04:51:22,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:51:22,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:51:22,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:51:24,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 04:51:26,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 04:51:30,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:51:30,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:30,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:31,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 04:51:34,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:51:35,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 04:51:37,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:51:37,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.02 vs. limit=10.0 2023-10-03 04:51:43,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:51:45,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:51:46,319 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.32 vs. limit=22.5 2023-10-03 04:51:46,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 04:51:46,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:51:48,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 04:51:51,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1144186.6666666667, ans=0.125 2023-10-03 04:51:53,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:51:54,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:51:56,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:51:56,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 04:51:57,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 04:51:57,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 04:51:57,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 04:52:00,268 INFO [train.py:1046] (3/4) Epoch 33, batch 1650, loss[loss=0.1303, simple_loss=0.2086, pruned_loss=0.02594, over 24304.00 frames. ], tot_loss[loss=0.1635, simple_loss=0.242, pruned_loss=0.0425, over 4681908.43 frames. ], batch size: 56, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:52:01,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:52:01,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:52:01,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:52:01,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:52:03,801 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.880e+02 2.031e+02 2.196e+02 3.045e+02, threshold=4.062e+02, percent-clipped=0.0 2023-10-03 04:52:05,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:52:06,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 04:52:08,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:52:08,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:52:08,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:52:08,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:52:10,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 04:52:10,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 04:52:16,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:52:18,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 04:52:21,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1144320.0, ans=0.125 2023-10-03 04:52:26,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 04:52:26,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:29,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 04:52:32,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:52:36,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:52:36,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:52:36,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:52:37,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:52:37,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:42,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:52:42,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:43,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:52:43,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:52:44,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:52:45,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 04:52:48,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 04:52:49,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 04:52:51,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:52:51,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 04:52:52,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 04:52:52,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 04:52:52,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:52:53,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:52:54,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:52:55,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:52:55,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 04:52:58,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:53:00,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:53:00,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:53:04,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 04:53:09,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:53:09,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:53:09,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 04:53:09,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:53:09,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:53:09,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:53:11,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1144520.0, ans=0.1 2023-10-03 04:53:14,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 04:53:14,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:53:14,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 04:53:16,246 INFO [train.py:1046] (3/4) Epoch 33, batch 1700, loss[loss=0.1609, simple_loss=0.254, pruned_loss=0.03388, over 24688.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2407, pruned_loss=0.04199, over 4692281.93 frames. ], batch size: 73, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:53:18,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 04:53:26,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:53:27,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1144586.6666666667, ans=0.5 2023-10-03 04:53:28,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:53:29,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1144653.3333333333, ans=0.5 2023-10-03 04:53:34,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:53:34,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:53:35,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:53:35,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:53:37,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 04:53:40,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:53:40,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:53:42,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 04:53:42,594 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.30 vs. limit=22.5 2023-10-03 04:53:44,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 04:53:46,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 04:53:46,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 04:53:47,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:53:47,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1144720.0, ans=10.0 2023-10-03 04:53:49,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 04:53:50,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:54:00,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:54:01,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:03,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:54:04,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 04:54:04,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 04:54:04,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:54:05,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:54:05,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 04:54:06,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1144786.6666666667, ans=0.05 2023-10-03 04:54:07,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:54:07,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:54:07,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:54:07,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:54:11,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:54:11,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:54:11,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:13,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:54:13,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:54:18,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:54:19,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 04:54:21,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1144853.3333333333, ans=0.1 2023-10-03 04:54:22,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:54:24,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:54:26,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 04:54:30,278 INFO [train.py:1046] (3/4) Epoch 33, batch 1750, loss[loss=0.1591, simple_loss=0.2522, pruned_loss=0.03295, over 24426.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2387, pruned_loss=0.04118, over 4693911.49 frames. ], batch size: 69, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:54:30,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:32,923 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.900e+02 2.115e+02 2.470e+02 3.706e+02, threshold=4.230e+02, percent-clipped=0.0 2023-10-03 04:54:33,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:54:33,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 04:54:34,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 04:54:34,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:54:37,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:54:37,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:54:39,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1144920.0, ans=0.1 2023-10-03 04:54:42,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 04:54:43,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1144986.6666666667, ans=0.1 2023-10-03 04:54:44,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:54:47,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 04:54:47,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:54:49,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:54:53,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 04:54:53,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 04:54:55,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:54:56,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 04:55:03,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:55:04,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:55:04,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:55:05,048 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.86 vs. limit=22.5 2023-10-03 04:55:07,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:07,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:55:10,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:55:10,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1145053.3333333333, ans=0.125 2023-10-03 04:55:11,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:13,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:55:13,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:55:15,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 04:55:18,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 04:55:19,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 04:55:21,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:55:22,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:55:24,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:55:28,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:55:28,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 04:55:29,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:29,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:55:35,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:55:37,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:55:38,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1145186.6666666667, ans=0.125 2023-10-03 04:55:39,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:55:40,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 04:55:40,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:55:40,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 04:55:40,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:55:40,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 04:55:40,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:55:42,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:55:43,422 INFO [train.py:1046] (3/4) Epoch 33, batch 1800, loss[loss=0.157, simple_loss=0.2494, pruned_loss=0.03231, over 24449.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2381, pruned_loss=0.04091, over 4687102.79 frames. ], batch size: 66, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 04:55:45,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 04:55:45,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:55:45,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1145253.3333333333, ans=0.1 2023-10-03 04:55:48,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 04:55:50,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:55:52,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 04:55:52,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1145253.3333333333, ans=0.125 2023-10-03 04:55:53,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:55:56,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:55:57,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1145320.0, ans=0.2 2023-10-03 04:55:59,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:55:59,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:56:00,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:56:03,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 04:56:03,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 04:56:03,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:04,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1145320.0, ans=0.125 2023-10-03 04:56:07,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:10,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 04:56:12,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 04:56:12,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 04:56:13,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:56:13,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:56:13,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:56:15,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 04:56:24,639 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 04:56:26,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:56:26,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:28,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 04:56:28,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 04:56:28,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 04:56:29,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:56:31,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 04:56:35,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 04:56:40,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:56:42,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 04:56:42,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:56:42,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:56:43,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 04:56:44,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 04:56:45,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1145520.0, ans=0.125 2023-10-03 04:56:46,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:56:46,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:56:49,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1145520.0, ans=0.0 2023-10-03 04:56:50,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 04:56:50,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:56:52,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:56:52,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 04:56:52,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:54,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:56:54,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 04:56:57,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:56:57,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:56:59,224 INFO [train.py:1046] (3/4) Epoch 33, batch 1850, loss[loss=0.1666, simple_loss=0.2458, pruned_loss=0.04369, over 24321.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2395, pruned_loss=0.04158, over 4696183.45 frames. ], batch size: 61, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:56:59,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1145586.6666666667, ans=0.0 2023-10-03 04:57:00,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:57:00,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:57:03,504 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.866e+02 2.062e+02 2.280e+02 4.556e+02, threshold=4.123e+02, percent-clipped=1.0 2023-10-03 04:57:04,352 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.44 vs. limit=15.0 2023-10-03 04:57:05,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1145586.6666666667, ans=0.0 2023-10-03 04:57:06,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:57:06,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 04:57:08,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 04:57:11,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 04:57:16,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:57:16,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 04:57:16,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 04:57:26,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 04:57:26,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 04:57:30,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:57:30,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:57:35,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 04:57:35,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:57:35,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 04:57:38,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 04:57:41,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 04:57:42,161 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.22 vs. limit=10.0 2023-10-03 04:57:42,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:57:45,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 04:57:45,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:57:45,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 04:57:45,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:57:48,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:57:50,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 04:57:52,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 04:57:53,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:57:56,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 04:57:58,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 04:57:58,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 04:57:58,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 04:57:59,935 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 04:58:00,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1145853.3333333333, ans=0.05 2023-10-03 04:58:01,346 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 04:58:01,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1145853.3333333333, ans=0.1 2023-10-03 04:58:04,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 04:58:04,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:58:04,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:58:04,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:04,233 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 04:58:04,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:58:04,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:05,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 04:58:07,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:58:08,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:58:09,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 04:58:11,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:11,151 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 04:58:11,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 04:58:12,489 INFO [train.py:1046] (3/4) Epoch 33, batch 1900, loss[loss=0.1682, simple_loss=0.2424, pruned_loss=0.04702, over 23911.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2403, pruned_loss=0.04146, over 4709177.73 frames. ], batch size: 195, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:58:12,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:58:12,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1145920.0, ans=0.05 2023-10-03 04:58:16,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:58:21,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 04:58:23,141 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 04:58:23,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 04:58:25,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 04:58:26,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:58:26,545 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 04:58:26,569 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 04:58:26,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1145986.6666666667, ans=0.125 2023-10-03 04:58:30,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 04:58:32,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 04:58:35,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1145986.6666666667, ans=0.0 2023-10-03 04:58:36,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 04:58:38,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 04:58:46,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 04:58:49,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 04:58:49,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:58:49,359 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 04:58:49,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1146053.3333333333, ans=0.0 2023-10-03 04:58:51,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 04:58:51,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 04:58:52,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 04:58:52,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:58:58,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 04:59:00,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 04:59:05,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:59:05,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 04:59:06,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 04:59:10,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 04:59:10,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:59:14,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 04:59:14,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 04:59:16,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 04:59:16,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 04:59:19,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 04:59:19,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 04:59:19,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 04:59:23,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:59:23,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:59:25,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 04:59:25,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 04:59:25,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 04:59:26,644 INFO [train.py:1046] (3/4) Epoch 33, batch 1950, loss[loss=0.1661, simple_loss=0.2613, pruned_loss=0.03546, over 24491.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2415, pruned_loss=0.04183, over 4708369.03 frames. ], batch size: 69, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 04:59:26,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 04:59:30,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:59:32,077 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.922e+02 2.146e+02 2.746e+02 4.413e+02, threshold=4.292e+02, percent-clipped=1.0 2023-10-03 04:59:33,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 04:59:33,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:33,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 04:59:35,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 04:59:36,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 04:59:36,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:36,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:41,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 04:59:41,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:59:41,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:59:43,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 04:59:46,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 04:59:46,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 04:59:46,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 04:59:46,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:59:47,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1146320.0, ans=0.0 2023-10-03 04:59:50,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 04:59:54,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 04:59:54,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 04:59:54,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 04:59:54,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 04:59:55,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 04:59:55,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 04:59:55,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 04:59:57,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1146386.6666666667, ans=0.05 2023-10-03 05:00:03,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:00:04,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:00:07,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:00:12,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:00:12,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:00:12,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 05:00:13,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:00:17,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:00:19,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:00:19,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:00:26,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:26,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:29,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:32,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:00:33,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1146520.0, ans=0.07 2023-10-03 05:00:35,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:00:35,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:00:37,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 05:00:37,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:00:39,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:00:40,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 05:00:41,737 INFO [train.py:1046] (3/4) Epoch 33, batch 2000, loss[loss=0.1486, simple_loss=0.2177, pruned_loss=0.0397, over 23493.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2416, pruned_loss=0.0422, over 4713223.16 frames. ], batch size: 256, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 05:00:41,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:00:43,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1146586.6666666667, ans=0.0 2023-10-03 05:00:45,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:00:46,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:00:46,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:00:48,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:00:49,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:00:51,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 05:00:51,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:00:53,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:00:54,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1146586.6666666667, ans=0.125 2023-10-03 05:00:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 05:00:58,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:00:58,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1146653.3333333333, ans=0.0 2023-10-03 05:01:02,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:01:04,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:01:05,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 05:01:05,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:08,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:08,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:10,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 05:01:11,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:01:13,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 05:01:13,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:01:15,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:01:16,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 05:01:16,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:18,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:01:18,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:01:19,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 05:01:21,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1146720.0, ans=0.125 2023-10-03 05:01:22,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 05:01:22,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:01:22,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:27,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:01:27,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:01:27,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:01:27,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1146786.6666666667, ans=0.0 2023-10-03 05:01:29,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:01:30,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:01:32,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:01:32,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:01:32,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:01:34,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:36,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:01:37,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1146786.6666666667, ans=0.125 2023-10-03 05:01:38,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 05:01:44,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:01:45,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:48,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:48,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:01:52,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:55,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:01:55,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:01:55,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:01:56,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:01:58,185 INFO [train.py:1046] (3/4) Epoch 33, batch 2050, loss[loss=0.1493, simple_loss=0.2312, pruned_loss=0.03367, over 24459.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2411, pruned_loss=0.04213, over 4710178.71 frames. ], batch size: 58, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 05:01:59,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:01:59,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:02:02,866 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.906e+02 2.037e+02 2.269e+02 3.118e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-03 05:02:02,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:02:03,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:02:09,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:02:10,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1146920.0, ans=0.0 2023-10-03 05:02:11,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:02:11,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:02:13,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:02:15,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 05:02:15,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:02:15,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:02:17,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:02:19,373 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.55 vs. limit=15.0 2023-10-03 05:02:25,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1146986.6666666667, ans=0.125 2023-10-03 05:02:25,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1146986.6666666667, ans=0.125 2023-10-03 05:02:27,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.08 vs. limit=10.0 2023-10-03 05:02:27,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:02:27,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:02:29,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 05:02:30,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:02:32,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 05:02:32,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:02:35,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:02:38,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:02:39,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:02:40,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:02:41,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:02:43,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:02:43,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:02:46,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:02:48,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:02:49,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:02:50,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:02:53,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:02:58,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:02:59,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 05:03:02,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:03:04,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:03:05,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:03:07,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 05:03:10,472 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 05:03:10,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:10,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:03:11,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:03:13,695 INFO [train.py:1046] (3/4) Epoch 33, batch 2100, loss[loss=0.1766, simple_loss=0.2545, pruned_loss=0.04932, over 23920.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2398, pruned_loss=0.04174, over 4704787.11 frames. ], batch size: 80, lr: 3.10e-03, grad_scale: 16.0 2023-10-03 05:03:13,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:03:13,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 05:03:14,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1147253.3333333333, ans=0.0 2023-10-03 05:03:15,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 05:03:16,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:03:19,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:03:19,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:03:22,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:23,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:03:23,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 05:03:25,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:03:25,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 05:03:25,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 05:03:27,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:03:27,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:03:27,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 05:03:27,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 05:03:35,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 05:03:35,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:03:37,916 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.56 vs. limit=10.0 2023-10-03 05:03:38,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:03:38,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:03:40,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:03:41,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 05:03:43,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:03:43,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 05:03:44,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 05:03:46,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:46,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 05:03:46,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 05:03:47,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 05:03:49,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:03:49,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:03:51,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1147386.6666666667, ans=0.125 2023-10-03 05:03:52,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:03:53,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:03:55,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:03:56,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:03:56,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 05:03:56,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:03:56,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:03:57,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:03:57,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 05:03:59,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 05:04:01,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 05:04:04,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:04:07,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:04:07,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 05:04:13,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:04:14,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:04:15,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:04:15,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:04:15,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 05:04:16,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:04:18,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:04:18,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:04:20,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:04:20,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:22,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 05:04:23,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 05:04:23,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:04:25,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:04:25,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:04:25,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1147520.0, ans=0.0 2023-10-03 05:04:26,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:04:26,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:04:27,908 INFO [train.py:1046] (3/4) Epoch 33, batch 2150, loss[loss=0.1745, simple_loss=0.2475, pruned_loss=0.05071, over 23802.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2393, pruned_loss=0.04172, over 4707236.62 frames. ], batch size: 150, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 05:04:32,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 05:04:33,937 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.872e+02 2.028e+02 2.280e+02 3.324e+02, threshold=4.056e+02, percent-clipped=0.0 2023-10-03 05:04:34,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:04:34,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1147586.6666666667, ans=0.0 2023-10-03 05:04:35,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:37,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:04:37,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:38,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:04:40,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:40,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:04:40,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:04:44,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:44,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 05:04:49,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:04:50,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:04:52,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:52,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:04:52,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:04:53,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:04:53,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:04:53,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:04:54,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:04:54,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 05:04:56,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:04:57,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:04:58,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:05:00,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:05:02,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:05:05,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:05:06,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:05:06,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1147720.0, ans=0.04949747468305833 2023-10-03 05:05:08,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:05:08,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 05:05:08,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:05:09,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:05:10,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:12,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:05:12,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:05:13,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:15,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:15,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 05:05:17,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 05:05:17,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:05:19,171 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 05:05:19,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:19,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:05:20,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 05:05:20,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:05:20,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 05:05:20,678 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 05:05:20,678 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 05:05:20,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 05:05:21,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1147786.6666666667, ans=0.125 2023-10-03 05:05:22,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:22,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:05:23,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:05:24,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:25,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:05:26,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:26,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:37,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:05:37,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 05:05:41,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:05:42,626 INFO [train.py:1046] (3/4) Epoch 33, batch 2200, loss[loss=0.1581, simple_loss=0.2475, pruned_loss=0.03441, over 24425.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2397, pruned_loss=0.04168, over 4717328.88 frames. ], batch size: 69, lr: 3.10e-03, grad_scale: 8.0 2023-10-03 05:05:42,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1147920.0, ans=0.0 2023-10-03 05:05:45,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:05:47,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:05:47,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:05:48,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:05:51,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.17 vs. limit=15.0 2023-10-03 05:05:51,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:05:51,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:05:51,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 05:05:56,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1147986.6666666667, ans=0.07 2023-10-03 05:05:58,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 05:06:00,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:06:04,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 05:06:08,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:08,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:06:09,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:06:13,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:06:13,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 05:06:16,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:06:17,561 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.01 vs. limit=15.0 2023-10-03 05:06:18,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:20,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 05:06:22,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:06:24,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:06:26,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:06:27,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:06:28,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 05:06:30,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:06:31,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 05:06:34,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:06:34,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:06:34,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:06:37,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:06:37,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:06:38,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:06:38,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:06:40,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 05:06:41,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:06:43,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:06:46,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 05:06:46,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:06:47,167 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-10-03 05:06:47,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:06:49,795 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 05:06:51,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:06:52,525 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 05:06:54,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:06:54,404 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 05:06:54,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1148186.6666666667, ans=0.1 2023-10-03 05:06:55,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:56,982 INFO [train.py:1046] (3/4) Epoch 33, batch 2250, loss[loss=0.1774, simple_loss=0.2493, pruned_loss=0.05275, over 23769.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2408, pruned_loss=0.04186, over 4724425.02 frames. ], batch size: 164, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:06:57,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:06:58,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:06:59,856 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 05:07:01,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:07:02,536 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.842e+02 2.039e+02 2.201e+02 2.888e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 05:07:02,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:07:08,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:07:11,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:07:12,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:07:12,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1148320.0, ans=0.0 2023-10-03 05:07:14,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:07:16,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:07:18,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 05:07:18,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:07:18,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:07:21,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 05:07:23,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:07:23,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:07:26,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:07:29,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1148386.6666666667, ans=0.125 2023-10-03 05:07:30,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:07:32,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:07:32,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:07:33,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 05:07:34,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:07:36,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:07:39,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:07:41,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:07:42,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:07:42,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:07:45,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:07:46,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:07:51,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:07:54,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:08:00,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:08:00,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:08:02,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:08:06,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:08:09,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:08:09,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 05:08:09,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:08:09,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:08:10,315 INFO [train.py:1046] (3/4) Epoch 33, batch 2300, loss[loss=0.1477, simple_loss=0.2328, pruned_loss=0.03135, over 24467.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.241, pruned_loss=0.04142, over 4741489.96 frames. ], batch size: 63, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:08:13,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 05:08:13,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1148586.6666666667, ans=0.0 2023-10-03 05:08:15,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:08:15,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:08:15,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1148586.6666666667, ans=0.125 2023-10-03 05:08:16,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1148586.6666666667, ans=0.125 2023-10-03 05:08:21,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:08:22,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:08:22,808 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 05:08:26,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:08:34,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:08:34,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 05:08:34,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:08:36,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:08:36,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 05:08:36,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:08:37,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:08:38,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:08:41,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:08:44,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:08:47,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:08:52,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:08:54,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:08:54,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1148786.6666666667, ans=0.1 2023-10-03 05:08:57,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:08:58,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:09:01,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:09:01,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:09:02,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:09:02,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 05:09:05,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:09:05,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:09:07,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:09:07,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:09:08,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1148786.6666666667, ans=0.09899494936611666 2023-10-03 05:09:09,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:09:10,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 05:09:10,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:09:10,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 05:09:10,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:09:10,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:09:11,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 05:09:17,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:09:22,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:09:25,781 INFO [train.py:1046] (3/4) Epoch 33, batch 2350, loss[loss=0.1768, simple_loss=0.2569, pruned_loss=0.04832, over 23353.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2417, pruned_loss=0.04187, over 4740022.14 frames. ], batch size: 93, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:09:27,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:09:27,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:09:27,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:09:28,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:09:28,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:09:28,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:09:28,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 05:09:31,608 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.933e+02 2.128e+02 2.511e+02 4.744e+02, threshold=4.255e+02, percent-clipped=2.0 2023-10-03 05:09:34,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:09:34,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 05:09:40,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 05:09:43,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:09:43,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1148986.6666666667, ans=0.0 2023-10-03 05:09:46,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:09:46,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:09:46,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:09:46,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:09:47,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 05:09:51,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:09:54,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 05:09:57,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:10:00,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:10:00,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:10:02,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:10:03,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1149053.3333333333, ans=0.125 2023-10-03 05:10:04,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 05:10:05,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:10:07,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:10:07,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:10:08,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:10:10,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1149120.0, ans=0.125 2023-10-03 05:10:11,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:10:14,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 05:10:14,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:10:16,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:10:16,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:10:18,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 05:10:18,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:10:22,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 05:10:22,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:10:25,174 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.48 vs. limit=15.0 2023-10-03 05:10:27,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 05:10:31,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 05:10:31,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:10:31,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:10:31,348 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 05:10:31,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 05:10:34,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 05:10:36,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:10:39,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.62 vs. limit=22.5 2023-10-03 05:10:39,598 INFO [train.py:1046] (3/4) Epoch 33, batch 2400, loss[loss=0.172, simple_loss=0.239, pruned_loss=0.05245, over 23766.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2415, pruned_loss=0.04224, over 4718315.22 frames. ], batch size: 179, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:10:41,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:10:45,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:10:47,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:10:49,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 05:10:49,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 05:10:55,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:10:55,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:10:58,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 05:10:59,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:10:59,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:01,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 05:11:04,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:05,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 05:11:09,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:11:14,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 05:11:14,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1149386.6666666667, ans=0.07 2023-10-03 05:11:15,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1149386.6666666667, ans=0.0 2023-10-03 05:11:18,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:11:18,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:24,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:11:24,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 05:11:26,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:11:29,480 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:11:30,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:11:33,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:11:34,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1149453.3333333333, ans=0.125 2023-10-03 05:11:36,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:11:37,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:11:37,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 05:11:37,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:11:37,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:11:38,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:11:38,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:11:43,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:11:43,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:11:43,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 05:11:46,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 05:11:48,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:11:49,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:11:49,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 05:11:49,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 05:11:51,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 05:11:51,196 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 05:11:52,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 05:11:52,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:11:54,466 INFO [train.py:1046] (3/4) Epoch 33, batch 2450, loss[loss=0.1562, simple_loss=0.2481, pruned_loss=0.0322, over 24684.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2402, pruned_loss=0.04169, over 4711161.58 frames. ], batch size: 68, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:11:54,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:54,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:11:54,651 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 05:11:56,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:11:57,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:11:58,621 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.31 vs. limit=6.0 2023-10-03 05:11:59,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:11:59,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:12:01,673 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.804e+02 1.981e+02 2.295e+02 3.038e+02, threshold=3.963e+02, percent-clipped=0.0 2023-10-03 05:12:03,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:03,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:12:04,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 05:12:08,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:12:08,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:11,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:12:11,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:12:11,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:12:13,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 05:12:16,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:18,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:12:18,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:12:21,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:12:21,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:12:24,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:12:25,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:12:26,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 05:12:27,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:12:33,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:12:36,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:12:36,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:12:36,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:12:37,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:12:38,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:12:38,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 05:12:42,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:12:42,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:12:45,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:12:46,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:12:52,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:12:52,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 05:12:53,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.47 vs. limit=15.0 2023-10-03 05:12:54,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:12:55,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:12:55,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 05:12:55,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:12:56,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:12:56,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1149853.3333333333, ans=0.1 2023-10-03 05:12:59,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:13:01,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=1149853.3333333333, ans=6.0 2023-10-03 05:13:01,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:13:03,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:13:07,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 05:13:07,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:13:08,716 INFO [train.py:1046] (3/4) Epoch 33, batch 2500, loss[loss=0.142, simple_loss=0.2177, pruned_loss=0.03311, over 23719.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2391, pruned_loss=0.04146, over 4718382.26 frames. ], batch size: 149, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:13:14,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:13:22,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:13:22,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:13:23,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:13:23,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 05:13:29,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:13:29,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1149986.6666666667, ans=0.2 2023-10-03 05:13:30,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:13:32,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 05:13:32,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:13:34,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 05:13:35,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.54 vs. limit=15.0 2023-10-03 05:13:35,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:13:35,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:13:35,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 05:13:35,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:13:37,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 05:13:37,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:13:40,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1150053.3333333333, ans=0.1 2023-10-03 05:13:41,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:13:42,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:13:44,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1150053.3333333333, ans=0.0 2023-10-03 05:13:45,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:13:45,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 05:13:47,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:13:49,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:13:51,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:13:57,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:13:58,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:14:02,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:14:07,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 05:14:07,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:14:07,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:14:08,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:14:08,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:14:09,922 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 05:14:09,923 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 05:14:09,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 05:14:12,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:14:14,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 05:14:14,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 05:14:14,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1150186.6666666667, ans=0.0 2023-10-03 05:14:14,831 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.55 vs. limit=15.0 2023-10-03 05:14:16,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:14:16,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 05:14:19,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 05:14:22,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:14:23,575 INFO [train.py:1046] (3/4) Epoch 33, batch 2550, loss[loss=0.1691, simple_loss=0.2531, pruned_loss=0.04254, over 23653.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2392, pruned_loss=0.04135, over 4721806.28 frames. ], batch size: 149, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:14:23,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1150253.3333333333, ans=0.125 2023-10-03 05:14:25,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:14:25,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:14:28,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:14:29,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 05:14:29,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:14:31,060 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.976e+02 2.259e+02 2.608e+02 3.805e+02, threshold=4.518e+02, percent-clipped=0.0 2023-10-03 05:14:33,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 05:14:35,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:14:38,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:14:39,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:14:40,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 05:14:40,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:14:40,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:14:40,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:14:43,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:14:43,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 05:14:44,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:14:44,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:14:44,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 05:14:57,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:15:02,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:15:02,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:15:02,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:15:04,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:15:08,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=1150453.3333333333, ans=0.05 2023-10-03 05:15:10,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:15:13,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:15:13,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:15:13,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:15:14,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 05:15:15,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:15:16,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:15:16,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:15:22,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:15:22,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 05:15:22,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:15:22,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:15:22,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:15:24,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:15:27,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:15:32,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1150520.0, ans=0.1 2023-10-03 05:15:34,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:15:35,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:15:37,244 INFO [train.py:1046] (3/4) Epoch 33, batch 2600, loss[loss=0.1468, simple_loss=0.2225, pruned_loss=0.03556, over 23605.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2395, pruned_loss=0.0414, over 4725276.40 frames. ], batch size: 149, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:15:37,398 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 05:15:40,321 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 05:15:40,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:15:41,664 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 05:15:41,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 05:15:42,975 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 05:15:45,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:15:45,780 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 05:15:48,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 05:15:50,380 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 05:15:51,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:15:53,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 05:15:55,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 05:15:56,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:15:58,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 05:15:58,347 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 05:16:00,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 05:16:00,523 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:16:06,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:06,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:16:06,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:16:06,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 05:16:07,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:16:08,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1150720.0, ans=0.2 2023-10-03 05:16:14,758 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 05:16:20,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:16:20,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:22,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 05:16:22,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:16:22,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:16:22,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1150786.6666666667, ans=0.1 2023-10-03 05:16:23,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 05:16:25,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:16:25,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:16:27,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:16:31,890 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 05:16:33,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:16:33,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:16:37,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:16:38,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1150853.3333333333, ans=0.125 2023-10-03 05:16:40,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:16:40,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 05:16:41,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:16:43,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:16:43,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1150853.3333333333, ans=0.0 2023-10-03 05:16:44,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:16:48,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 05:16:50,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:51,609 INFO [train.py:1046] (3/4) Epoch 33, batch 2650, loss[loss=0.1537, simple_loss=0.2358, pruned_loss=0.03577, over 24660.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2402, pruned_loss=0.04143, over 4734865.86 frames. ], batch size: 65, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:16:51,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:16:55,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 05:16:55,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:16:57,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:16:58,431 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 05:16:58,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:16:59,707 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.879e+02 2.016e+02 2.278e+02 3.478e+02, threshold=4.033e+02, percent-clipped=0.0 2023-10-03 05:17:00,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:17:02,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:17:04,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:17:05,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:17:07,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 05:17:07,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:17:08,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:17:10,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1150986.6666666667, ans=0.125 2023-10-03 05:17:11,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 05:17:14,332 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 05:17:15,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:17:17,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 05:17:17,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:18,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 05:17:21,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:17:21,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:17:21,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:17:22,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:17:26,806 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.89 vs. limit=6.0 2023-10-03 05:17:28,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 05:17:29,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 05:17:31,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:17:33,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1151053.3333333333, ans=0.1 2023-10-03 05:17:34,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1151053.3333333333, ans=0.0 2023-10-03 05:17:36,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 05:17:36,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:17:36,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:17:38,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:17:38,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:17:38,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:17:40,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:17:41,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:17:42,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:17:44,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:17:45,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:17:46,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:48,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:17:48,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:51,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:17:51,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:17:53,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:17:55,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:17:55,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:17:55,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 05:17:58,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:18:00,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:18:01,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:18:03,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:04,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:18:04,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:06,235 INFO [train.py:1046] (3/4) Epoch 33, batch 2700, loss[loss=0.1743, simple_loss=0.254, pruned_loss=0.04731, over 24338.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2416, pruned_loss=0.04209, over 4726258.15 frames. ], batch size: 77, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:18:06,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:18:06,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 05:18:10,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:18:13,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 05:18:15,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:18:15,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:15,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:16,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:18:16,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:18:16,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:18:16,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:18:17,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 05:18:18,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:18:19,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:18:20,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:18:22,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:18:23,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1151320.0, ans=0.1 2023-10-03 05:18:24,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:18:24,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1151320.0, ans=0.015 2023-10-03 05:18:26,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 05:18:26,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1151320.0, ans=0.0 2023-10-03 05:18:27,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:18:30,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:18:30,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:18:36,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:18:36,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:18:37,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1151386.6666666667, ans=0.0 2023-10-03 05:18:38,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:18:38,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:18:41,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:18:42,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:18:42,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:18:43,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:18:48,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:18:48,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:18:55,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:18:55,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:18:59,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:18:59,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:03,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:19:04,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:19:04,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:19:06,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:08,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:19:09,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:19:11,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:19:12,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:19:12,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:19:15,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 05:19:16,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:18,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:19:19,330 INFO [train.py:1046] (3/4) Epoch 33, batch 2750, loss[loss=0.1457, simple_loss=0.2111, pruned_loss=0.04013, over 23387.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2414, pruned_loss=0.04219, over 4711206.82 frames. ], batch size: 285, lr: 3.09e-03, grad_scale: 8.0 2023-10-03 05:19:19,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 05:19:19,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 05:19:20,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:23,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:19:23,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:19:26,307 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.013e+02 2.192e+02 2.661e+02 5.400e+02, threshold=4.383e+02, percent-clipped=1.0 2023-10-03 05:19:26,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:26,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:19:26,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:30,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:19:30,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:19:31,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:19:31,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:31,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 05:19:31,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:19:31,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:19:33,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1151653.3333333333, ans=0.125 2023-10-03 05:19:36,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1151653.3333333333, ans=0.125 2023-10-03 05:19:39,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 05:19:41,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:19:41,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:42,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:19:42,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:19:43,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:19:45,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:19:45,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:19:46,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:19:49,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1151720.0, ans=0.0 2023-10-03 05:19:50,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:19:50,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:19:50,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:19:51,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:19:53,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:19:58,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:20:01,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:20:01,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:05,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:20:05,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:20:05,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:20:11,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:20:11,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:20:11,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 05:20:13,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1151786.6666666667, ans=0.09899494936611666 2023-10-03 05:20:15,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:18,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 05:20:22,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 05:20:24,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:20:25,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 05:20:25,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:20:28,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:20:28,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 05:20:28,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:20:31,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 05:20:31,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:20:33,188 INFO [train.py:1046] (3/4) Epoch 33, batch 2800, loss[loss=0.1723, simple_loss=0.2416, pruned_loss=0.05151, over 23803.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2407, pruned_loss=0.04192, over 4709722.78 frames. ], batch size: 212, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:20:33,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:20:33,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 05:20:33,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:20:33,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:36,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:20:36,547 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 05:20:36,548 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 05:20:41,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:20:41,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:20:42,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:20:45,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:20:46,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 05:20:46,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1151986.6666666667, ans=0.125 2023-10-03 05:20:49,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 05:20:51,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 05:20:52,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:20:52,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:20:52,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:20:56,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:20:58,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:20:58,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:20:58,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=12.69 vs. limit=15.0 2023-10-03 05:20:59,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:21:03,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1152053.3333333333, ans=0.04949747468305833 2023-10-03 05:21:08,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:21:11,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:21:11,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:13,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:21:14,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:21:19,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:21:19,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 05:21:19,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:21:20,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:21:20,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:21:24,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:21:24,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:25,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.32 vs. limit=6.0 2023-10-03 05:21:26,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:21:29,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:21:29,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:29,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:21:31,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:21:31,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:21:31,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:21:31,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 05:21:31,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:21:32,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:21:34,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:21:34,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 05:21:35,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:21:35,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:21:36,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:21:36,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1152186.6666666667, ans=0.0 2023-10-03 05:21:38,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 05:21:44,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:21:44,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:21:46,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:21:47,508 INFO [train.py:1046] (3/4) Epoch 33, batch 2850, loss[loss=0.1614, simple_loss=0.2362, pruned_loss=0.04328, over 23367.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2391, pruned_loss=0.04145, over 4703363.13 frames. ], batch size: 119, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:21:47,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:21:51,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:21:51,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:21:51,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:21:53,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1152253.3333333333, ans=0.125 2023-10-03 05:21:54,430 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.804e+02 2.039e+02 2.498e+02 3.555e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 05:21:54,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:21:54,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:21:57,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:21:57,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 05:22:05,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 05:22:05,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:22:06,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 05:22:07,363 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.05 vs. limit=12.0 2023-10-03 05:22:08,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:10,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 05:22:10,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 05:22:12,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:25,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:22:27,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:22:27,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:22:28,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:22:28,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:22:28,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:22:30,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:22:30,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 05:22:33,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:22:33,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:22:35,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:22:35,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:36,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:22:36,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:22:38,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:22:38,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.65 vs. limit=12.0 2023-10-03 05:22:41,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:22:42,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:22:44,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:22:45,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:22:47,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:22:51,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1152520.0, ans=0.0 2023-10-03 05:22:52,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:22:54,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 05:22:54,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 05:22:55,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:22:55,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:22:55,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 05:22:57,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:22:57,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:22:57,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:22:57,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:22:57,315 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 05:22:57,356 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 05:22:57,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:22:58,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:23:01,886 INFO [train.py:1046] (3/4) Epoch 33, batch 2900, loss[loss=0.162, simple_loss=0.2435, pruned_loss=0.04029, over 24663.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2396, pruned_loss=0.04192, over 4704743.34 frames. ], batch size: 65, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:23:02,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1152586.6666666667, ans=0.0 2023-10-03 05:23:04,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:23:04,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:23:06,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:23:06,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 05:23:10,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:23:10,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 05:23:12,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 05:23:13,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:23:13,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:23:15,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:23:17,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:23:21,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:23:21,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:23:24,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:23:24,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 05:23:24,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:23:27,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:23:28,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 05:23:28,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1152653.3333333333, ans=0.125 2023-10-03 05:23:29,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 05:23:31,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:23:32,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 05:23:32,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:23:34,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:23:34,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 05:23:38,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:23:39,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:23:42,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:23:46,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:23:48,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 05:23:48,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 05:23:48,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:23:50,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1152786.6666666667, ans=0.125 2023-10-03 05:23:52,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:23:54,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 05:23:55,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:23:59,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:24:07,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:24:07,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:24:08,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 05:24:09,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1152853.3333333333, ans=0.5 2023-10-03 05:24:11,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:11,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 05:24:11,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:24:13,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:24:15,902 INFO [train.py:1046] (3/4) Epoch 33, batch 2950, loss[loss=0.1473, simple_loss=0.2282, pruned_loss=0.03323, over 23598.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2399, pruned_loss=0.04175, over 4701275.18 frames. ], batch size: 149, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:24:18,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:24:20,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 05:24:20,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:24:20,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:22,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:24:23,327 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.788e+02 1.939e+02 2.108e+02 3.552e+02, threshold=3.878e+02, percent-clipped=0.0 2023-10-03 05:24:23,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:24:24,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 05:24:26,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 05:24:26,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:24:26,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:24:31,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:24:33,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:24:36,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:24:37,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:24:39,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:24:39,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:24:41,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:42,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:24:42,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:24:45,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 05:24:46,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1153053.3333333333, ans=0.0 2023-10-03 05:24:46,728 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.16 vs. limit=6.0 2023-10-03 05:24:50,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.85 vs. limit=12.0 2023-10-03 05:24:51,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 05:24:51,779 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 05:24:53,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:24:54,874 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:24:56,006 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 05:24:56,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 05:24:57,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:24:57,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:24:57,348 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 05:24:57,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:24:58,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 05:25:00,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:25:01,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:25:03,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:25:04,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:25:04,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:25:04,357 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 05:25:04,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:25:05,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 05:25:08,407 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.30 vs. limit=15.0 2023-10-03 05:25:11,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:25:13,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:25:13,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 05:25:13,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:25:14,057 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.54 vs. limit=15.0 2023-10-03 05:25:14,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 05:25:17,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:25:19,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1153186.6666666667, ans=0.125 2023-10-03 05:25:20,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:25:20,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:25:22,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:25:22,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 05:25:22,818 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.81 vs. limit=15.0 2023-10-03 05:25:25,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:25:25,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:25:25,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:25:26,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:25:26,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:25:27,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:25:29,295 INFO [train.py:1046] (3/4) Epoch 33, batch 3000, loss[loss=0.1693, simple_loss=0.2498, pruned_loss=0.04443, over 23238.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2406, pruned_loss=0.04161, over 4701019.76 frames. ], batch size: 93, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:25:29,296 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 05:25:41,087 INFO [train.py:1078] (3/4) Epoch 33, validation: loss=0.3581, simple_loss=0.2789, pruned_loss=0.2187, over 1125622.00 frames. 2023-10-03 05:25:41,088 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 05:25:41,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:25:41,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 05:25:42,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1153253.3333333333, ans=0.0 2023-10-03 05:25:43,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:25:45,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:25:46,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:25:49,002 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 05:25:49,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 05:25:52,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:25:53,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:25:53,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 05:25:53,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:25:56,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1153320.0, ans=0.125 2023-10-03 05:25:59,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:26:11,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:26:19,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 05:26:19,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:26:21,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:26:21,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:26:21,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:26:22,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1153386.6666666667, ans=0.0 2023-10-03 05:26:25,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:26:25,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 05:26:26,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 05:26:27,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:26:29,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:26:29,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1153453.3333333333, ans=0.125 2023-10-03 05:26:30,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:26:32,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:26:32,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:32,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:26:36,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:26:36,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:26:36,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:26:39,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:26:41,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 05:26:41,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:26:41,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:26:42,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:26:42,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1153520.0, ans=0.95 2023-10-03 05:26:46,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:46,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:47,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 05:26:47,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 05:26:48,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:26:48,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 05:26:48,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:26:50,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 05:26:52,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1153520.0, ans=0.125 2023-10-03 05:26:53,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:26:53,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:26:55,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 05:26:55,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 05:26:55,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:26:56,576 INFO [train.py:1046] (3/4) Epoch 33, batch 3050, loss[loss=0.157, simple_loss=0.2337, pruned_loss=0.04016, over 24602.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2419, pruned_loss=0.04192, over 4717270.80 frames. ], batch size: 60, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:26:56,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:26:58,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:26:58,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:26:58,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:26:58,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:27:00,288 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.85 vs. limit=22.5 2023-10-03 05:27:00,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 05:27:02,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:27:03,730 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 1.901e+02 2.048e+02 2.289e+02 3.124e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 05:27:05,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:27:05,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:27:08,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:12,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 05:27:19,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 05:27:21,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 05:27:21,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:27:23,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1153653.3333333333, ans=0.1 2023-10-03 05:27:24,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:27:25,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1153720.0, ans=0.07 2023-10-03 05:27:27,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:27,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:27:27,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:27:30,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:27:30,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:27:30,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:27:31,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:27:31,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:27:33,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:34,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:27:40,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:27:40,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 05:27:40,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:27:40,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:27:43,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:27:44,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:27:44,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:27:44,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:27:48,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:27:50,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:27:54,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:27:55,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:27:55,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:27:57,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:27:57,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:27:59,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:28:00,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 05:28:01,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:28:01,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:01,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 05:28:03,422 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:28:04,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:28:09,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:28:10,401 INFO [train.py:1046] (3/4) Epoch 33, batch 3100, loss[loss=0.1405, simple_loss=0.2196, pruned_loss=0.0307, over 24602.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2407, pruned_loss=0.0415, over 4722376.46 frames. ], batch size: 60, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:28:10,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:28:14,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:28:15,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 05:28:16,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1153920.0, ans=0.125 2023-10-03 05:28:19,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 05:28:19,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 05:28:21,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:28:23,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:28:23,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:26,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 05:28:30,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:34,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1153986.6666666667, ans=0.125 2023-10-03 05:28:35,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 05:28:40,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:28:40,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:28:41,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:28:41,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:28:41,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 05:28:44,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:28:44,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 05:28:44,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:28:46,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:47,603 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.06 vs. limit=15.0 2023-10-03 05:28:47,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 05:28:49,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:28:52,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.08 vs. limit=15.0 2023-10-03 05:28:53,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:28:55,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 05:28:55,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 05:28:56,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:28:56,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1154120.0, ans=0.125 2023-10-03 05:28:57,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:28:59,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:28:59,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:28:59,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:29:00,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:29:00,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:29:03,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:29:03,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:29:03,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:29:03,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 05:29:07,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:29:08,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 05:29:11,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:29:11,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 05:29:12,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:12,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:29:12,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 05:29:13,378 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.68 vs. limit=22.5 2023-10-03 05:29:13,509 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=15.0 2023-10-03 05:29:21,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1154186.6666666667, ans=0.1 2023-10-03 05:29:23,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 05:29:24,786 INFO [train.py:1046] (3/4) Epoch 33, batch 3150, loss[loss=0.161, simple_loss=0.2308, pruned_loss=0.04557, over 23743.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2396, pruned_loss=0.04138, over 4717813.56 frames. ], batch size: 179, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:29:26,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:27,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:29:29,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:29:29,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:29:29,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 05:29:30,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:31,692 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.823e+02 1.977e+02 2.153e+02 2.836e+02, threshold=3.954e+02, percent-clipped=0.0 2023-10-03 05:29:31,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 05:29:33,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 05:29:34,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:38,069 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 05:29:39,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 05:29:39,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:29:40,891 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 05:29:41,200 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:29:42,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 05:29:42,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 05:29:43,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 05:29:43,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 05:29:43,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:43,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:29:43,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:29:44,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=1154320.0, ans=10.0 2023-10-03 05:29:45,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 05:29:46,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:46,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:29:48,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:29:49,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:29:54,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 05:29:54,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1154386.6666666667, ans=0.1 2023-10-03 05:29:55,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:29:57,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:29:58,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:29:58,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 05:30:01,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 05:30:01,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:30:03,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 05:30:03,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:30:04,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:30:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:30:05,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1154386.6666666667, ans=0.125 2023-10-03 05:30:06,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:30:06,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:30:08,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 05:30:09,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:30:09,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:10,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:30:10,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:30:12,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 05:30:12,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:30:15,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 05:30:15,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:16,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 05:30:18,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 05:30:18,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:30:19,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:30:21,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 05:30:22,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 05:30:22,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:30:24,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:30:25,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:25,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:30:31,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:30:32,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:35,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 05:30:39,160 INFO [train.py:1046] (3/4) Epoch 33, batch 3200, loss[loss=0.1728, simple_loss=0.2616, pruned_loss=0.04202, over 24660.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2377, pruned_loss=0.04067, over 4705902.46 frames. ], batch size: 68, lr: 3.09e-03, grad_scale: 32.0 2023-10-03 05:30:40,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:30:40,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 05:30:43,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:30:45,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:30:45,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 05:30:46,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:30:50,461 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.65 vs. limit=15.0 2023-10-03 05:30:50,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:30:55,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:31:02,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1154653.3333333333, ans=0.09899494936611666 2023-10-03 05:31:04,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:31:13,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 05:31:14,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:31:16,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 05:31:17,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:31:20,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:31:22,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:31:23,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:31:26,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 05:31:28,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 05:31:29,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 05:31:32,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 05:31:35,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:31:38,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1154853.3333333333, ans=0.0 2023-10-03 05:31:41,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:31:41,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:31:41,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:31:42,664 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 05:31:42,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:31:45,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:31:47,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 05:31:47,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 05:31:49,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 05:31:50,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1154853.3333333333, ans=0.125 2023-10-03 05:31:51,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 05:31:51,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:31:53,017 INFO [train.py:1046] (3/4) Epoch 33, batch 3250, loss[loss=0.1648, simple_loss=0.2484, pruned_loss=0.04065, over 24075.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2387, pruned_loss=0.04083, over 4715220.16 frames. ], batch size: 80, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:31:54,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:31:54,721 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 05:31:56,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:31:56,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:31:56,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1154920.0, ans=0.0 2023-10-03 05:31:57,385 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 05:32:00,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:32:01,521 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.860e+02 2.002e+02 2.246e+02 3.741e+02, threshold=4.004e+02, percent-clipped=0.0 2023-10-03 05:32:02,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:32:07,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1154986.6666666667, ans=0.125 2023-10-03 05:32:09,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:32:09,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 05:32:10,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:32:11,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:32:11,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:32:12,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:32:14,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:32:15,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:15,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:32:15,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:32:17,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:17,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:17,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:32:19,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:32:20,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1154986.6666666667, ans=0.125 2023-10-03 05:32:21,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1155053.3333333333, ans=0.2 2023-10-03 05:32:22,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:32:24,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:32:24,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:32:27,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:32:27,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:32:27,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:32:31,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 05:32:31,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:32:31,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:32:34,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:32:34,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:32:40,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:32:43,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1155120.0, ans=0.0 2023-10-03 05:32:49,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:32:49,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:32:49,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 05:32:49,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:32:49,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 05:32:50,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:32:52,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 05:32:53,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 05:32:53,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:32:55,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:32:56,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:32:57,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 05:32:57,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:33:01,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:33:01,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:33:03,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 05:33:03,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:06,448 INFO [train.py:1046] (3/4) Epoch 33, batch 3300, loss[loss=0.1628, simple_loss=0.2325, pruned_loss=0.04654, over 23790.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2392, pruned_loss=0.0413, over 4702351.73 frames. ], batch size: 179, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:33:06,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:33:06,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 05:33:09,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:33:09,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 05:33:12,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 05:33:12,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 05:33:12,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:33:15,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:33:15,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1155253.3333333333, ans=0.125 2023-10-03 05:33:17,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:33:17,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:20,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 05:33:20,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:33:22,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:22,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:33:25,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff2.min_abs, batch_count=1155320.0, ans=0.1 2023-10-03 05:33:27,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 05:33:28,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:33:28,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:29,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:31,331 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 05:33:31,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:33:32,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:33:32,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:33:32,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:33:34,143 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 05:33:36,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:33:36,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:33:38,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:38,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 05:33:41,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 05:33:41,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:33:42,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:33:44,203 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 05:33:45,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 05:33:47,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:33:49,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 05:33:52,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:33:53,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:33:53,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:33:53,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1155453.3333333333, ans=0.0 2023-10-03 05:33:56,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:33:58,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:33:58,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:33:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:34:01,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:34:01,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:34:02,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:34:02,562 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 05:34:03,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 05:34:05,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:34:06,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:34:06,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:34:07,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:34:07,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:34:09,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:34:09,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:09,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:34:11,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:34:12,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:34:14,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 05:34:15,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:15,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:18,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:34:18,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:34:18,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:34:21,349 INFO [train.py:1046] (3/4) Epoch 33, batch 3350, loss[loss=0.1586, simple_loss=0.2458, pruned_loss=0.0357, over 24500.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2405, pruned_loss=0.04143, over 4706467.58 frames. ], batch size: 66, lr: 3.09e-03, grad_scale: 16.0 2023-10-03 05:34:21,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:34:21,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:24,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:34:27,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:28,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:34:30,134 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.952e+02 2.092e+02 2.361e+02 3.355e+02, threshold=4.183e+02, percent-clipped=0.0 2023-10-03 05:34:30,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:31,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:34:33,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:34:34,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:34:35,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 05:34:37,145 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 05:34:38,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:34:41,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 05:34:41,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 05:34:43,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:34:43,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:34:43,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:34:44,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 05:34:44,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:44,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:34:46,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:48,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:48,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:34:49,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:34:53,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:34:57,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:34:57,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:34:57,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1155720.0, ans=0.0 2023-10-03 05:35:01,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:35:02,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:35:04,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:35:04,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:06,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:09,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 05:35:09,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:35:09,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 05:35:09,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:35:10,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1155786.6666666667, ans=0.1 2023-10-03 05:35:11,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 05:35:12,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:35:12,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1155786.6666666667, ans=0.0 2023-10-03 05:35:12,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1155786.6666666667, ans=0.1 2023-10-03 05:35:14,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:35:21,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:22,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 05:35:22,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:35:22,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:35:24,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:35:28,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.56 vs. limit=22.5 2023-10-03 05:35:29,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:35:31,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 05:35:31,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:35:32,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:35:33,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:35:33,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 05:35:33,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:35:33,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 05:35:35,247 INFO [train.py:1046] (3/4) Epoch 33, batch 3400, loss[loss=0.1417, simple_loss=0.2173, pruned_loss=0.03303, over 20225.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2408, pruned_loss=0.04127, over 4704384.91 frames. ], batch size: 44, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:35:36,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:35:36,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:35:36,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:35:37,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:35:39,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 05:35:44,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 05:35:44,070 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 05:35:44,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:35:47,603 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.69 vs. limit=6.0 2023-10-03 05:35:48,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:35:48,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:35:48,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:35:48,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1155986.6666666667, ans=0.0 2023-10-03 05:35:49,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:35:56,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:35:58,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 05:36:01,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1155986.6666666667, ans=0.125 2023-10-03 05:36:04,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:36:04,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:36:05,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:36:07,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 05:36:12,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:36:14,361 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:36:15,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 05:36:21,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:36:22,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:36:22,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 05:36:22,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:36:22,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:36:24,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:36:25,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:36:27,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:36:32,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:36:32,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:36:36,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:36:37,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 05:36:42,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1156186.6666666667, ans=0.0 2023-10-03 05:36:43,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:36:44,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 05:36:48,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 05:36:49,750 INFO [train.py:1046] (3/4) Epoch 33, batch 3450, loss[loss=0.1747, simple_loss=0.2337, pruned_loss=0.05783, over 19902.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2407, pruned_loss=0.04128, over 4713684.68 frames. ], batch size: 388, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:36:49,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:36:50,437 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=22.5 2023-10-03 05:36:52,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:36:52,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 05:36:52,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:36:57,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:36:59,361 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.812e+02 1.985e+02 2.188e+02 3.320e+02, threshold=3.971e+02, percent-clipped=0.0 2023-10-03 05:37:00,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:37:02,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:37:03,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:37:03,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:37:06,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:37:13,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 05:37:13,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1156320.0, ans=0.125 2023-10-03 05:37:17,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 05:37:18,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 05:37:19,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:37:20,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:37:21,308 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.87 vs. limit=15.0 2023-10-03 05:37:22,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1156386.6666666667, ans=0.0 2023-10-03 05:37:26,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 05:37:26,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:37:29,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1156386.6666666667, ans=0.1 2023-10-03 05:37:31,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:37:31,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:37:32,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:37:32,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1156453.3333333333, ans=0.0 2023-10-03 05:37:33,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:37:35,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 05:37:35,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:37:36,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:37:38,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.21 vs. limit=15.0 2023-10-03 05:37:39,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:37:41,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 05:37:44,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:37:49,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:37:52,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:37:53,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:38:00,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:00,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:38:00,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:38:00,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:38:03,160 INFO [train.py:1046] (3/4) Epoch 33, batch 3500, loss[loss=0.1496, simple_loss=0.2321, pruned_loss=0.03356, over 24652.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.24, pruned_loss=0.04111, over 4718885.27 frames. ], batch size: 65, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:38:05,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:38:06,713 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.22 vs. limit=22.5 2023-10-03 05:38:08,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:38:08,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 05:38:11,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 05:38:14,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 05:38:16,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:38:16,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 05:38:20,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:38:20,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:38:21,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:38:21,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:38:23,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:38:23,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:23,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:38:24,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 05:38:27,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:27,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:38:29,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1156653.3333333333, ans=0.125 2023-10-03 05:38:30,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:38:34,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:35,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 05:38:35,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:38:38,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:38:38,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:38:39,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:40,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:38:40,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:38:42,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 05:38:44,397 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.77 vs. limit=22.5 2023-10-03 05:38:45,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 05:38:45,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 05:38:45,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:38:47,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:38:48,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:38:48,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:38:52,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:38:52,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:38:52,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1156786.6666666667, ans=0.0 2023-10-03 05:38:58,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:39:00,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 05:39:00,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 05:39:02,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:39:04,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:39:04,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:39:07,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:39:10,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 05:39:11,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:39:11,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:39:13,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 05:39:14,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 05:39:14,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1156920.0, ans=0.2 2023-10-03 05:39:14,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1156920.0, ans=0.0 2023-10-03 05:39:15,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.68 vs. limit=6.0 2023-10-03 05:39:16,001 INFO [train.py:1046] (3/4) Epoch 33, batch 3550, loss[loss=0.153, simple_loss=0.2318, pruned_loss=0.03708, over 24601.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2389, pruned_loss=0.04071, over 4732686.40 frames. ], batch size: 60, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:39:18,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:39:18,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1156920.0, ans=0.0 2023-10-03 05:39:19,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:39:19,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:19,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:23,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:39:24,904 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.443e+02 1.866e+02 2.004e+02 2.192e+02 2.799e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-03 05:39:25,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1156920.0, ans=0.0 2023-10-03 05:39:27,443 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.26 vs. limit=15.0 2023-10-03 05:39:31,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:32,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 05:39:36,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:39:36,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:39:38,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:39,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:39:39,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:39:42,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:39:42,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:39:43,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:43,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:39:43,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1156986.6666666667, ans=0.125 2023-10-03 05:39:44,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:39:48,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:39:48,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:39:49,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:39:49,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:39:49,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:39:49,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 05:39:51,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:52,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:39:52,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 05:39:52,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1157053.3333333333, ans=0.125 2023-10-03 05:39:58,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:39:59,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:40:01,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:02,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 05:40:02,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:40:04,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 05:40:04,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:40:07,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:40:07,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:40:10,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 05:40:10,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:40:16,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:40:16,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 05:40:16,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:40:20,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:40:22,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 05:40:26,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 05:40:27,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:40:27,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:40:28,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1157186.6666666667, ans=0.1 2023-10-03 05:40:30,815 INFO [train.py:1046] (3/4) Epoch 33, batch 3600, loss[loss=0.1739, simple_loss=0.2466, pruned_loss=0.05057, over 23597.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2391, pruned_loss=0.04116, over 4721810.77 frames. ], batch size: 256, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:40:30,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:40:30,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:40:32,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:40:36,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:40:38,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:39,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:40:41,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:40:41,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1157253.3333333333, ans=0.125 2023-10-03 05:40:42,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:42,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 05:40:45,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:40:45,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:50,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:40:52,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:40:52,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:40:54,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:40:54,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 05:40:55,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:40:57,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:40:57,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:41:00,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:03,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:41:03,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:41:04,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 05:41:08,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1157386.6666666667, ans=0.1 2023-10-03 05:41:10,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:41:13,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:41:13,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 05:41:17,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:41:22,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1157453.3333333333, ans=0.025 2023-10-03 05:41:23,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:26,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:28,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1157520.0, ans=0.125 2023-10-03 05:41:31,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1157520.0, ans=0.2 2023-10-03 05:41:33,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1157520.0, ans=0.0 2023-10-03 05:41:34,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:41:34,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:41:34,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 05:41:36,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 05:41:37,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 05:41:39,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:41:40,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:41:40,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1157520.0, ans=0.125 2023-10-03 05:41:41,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 05:41:42,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:41:42,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:41:42,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:41:44,537 INFO [train.py:1046] (3/4) Epoch 33, batch 3650, loss[loss=0.1592, simple_loss=0.2523, pruned_loss=0.0331, over 24349.00 frames. ], tot_loss[loss=0.161, simple_loss=0.24, pruned_loss=0.04104, over 4723754.24 frames. ], batch size: 74, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:41:44,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 05:41:45,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 05:41:48,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:41:49,680 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.63 vs. limit=6.0 2023-10-03 05:41:50,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 05:41:54,158 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.881e+02 2.096e+02 2.323e+02 3.707e+02, threshold=4.192e+02, percent-clipped=0.0 2023-10-03 05:41:55,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 05:41:55,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:41:59,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 05:42:01,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 05:42:06,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:42:06,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:42:06,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:42:10,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 05:42:11,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:42:11,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 05:42:12,255 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.75 vs. limit=12.0 2023-10-03 05:42:13,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:42:13,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:42:13,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 05:42:13,916 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:42:15,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:42:16,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:42:16,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:42:18,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:42:19,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 05:42:20,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 05:42:22,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:42:24,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 05:42:26,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:42:26,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:42:29,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:42:32,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:42:32,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:42:33,022 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.22 vs. limit=15.0 2023-10-03 05:42:35,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:42:35,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:42:35,791 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.36 vs. limit=15.0 2023-10-03 05:42:38,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:42:41,043 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-10-03 05:42:41,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:42:41,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:42:41,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:42:43,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:42:45,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:42:45,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:42:51,916 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 05:42:54,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:42:54,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:42:56,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:42:56,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:42:56,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1157853.3333333333, ans=0.2 2023-10-03 05:42:57,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:42:57,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1157920.0, ans=0.0 2023-10-03 05:42:58,798 INFO [train.py:1046] (3/4) Epoch 33, batch 3700, loss[loss=0.1607, simple_loss=0.2377, pruned_loss=0.04187, over 23210.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2406, pruned_loss=0.04125, over 4733514.12 frames. ], batch size: 105, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:42:58,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:43:01,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 05:43:01,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:43:03,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:43:06,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:43:06,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:43:09,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:43:09,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 05:43:09,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:43:10,610 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.48 vs. limit=22.5 2023-10-03 05:43:11,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 05:43:11,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 05:43:14,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 05:43:17,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:43:17,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:43:19,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:43:19,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:43:20,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:43:22,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:43:24,824 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 05:43:30,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:43:32,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:43:32,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:43:32,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 05:43:33,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:43:35,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:43:36,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 05:43:36,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:43:39,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:43:40,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1158053.3333333333, ans=0.125 2023-10-03 05:43:42,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:43:43,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:43:43,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1158120.0, ans=0.125 2023-10-03 05:43:44,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 05:43:49,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:43:49,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 05:43:50,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:43:50,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 05:43:53,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1158120.0, ans=0.0 2023-10-03 05:43:54,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:43:54,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:43:57,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:43:57,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 05:43:58,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:43:58,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:44:00,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:44:00,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:44:02,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1158186.6666666667, ans=0.2 2023-10-03 05:44:04,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:44:06,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 05:44:06,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 05:44:06,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:44:06,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:44:08,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:44:09,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:44:10,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1158186.6666666667, ans=0.125 2023-10-03 05:44:11,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:44:13,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:44:13,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:44:15,091 INFO [train.py:1046] (3/4) Epoch 33, batch 3750, loss[loss=0.1781, simple_loss=0.2464, pruned_loss=0.0549, over 23402.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2411, pruned_loss=0.04104, over 4734116.43 frames. ], batch size: 285, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:44:16,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 05:44:17,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 05:44:20,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:44:22,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 05:44:22,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:44:23,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:44:23,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:44:24,274 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.15 vs. limit=6.0 2023-10-03 05:44:24,743 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.972e+02 2.162e+02 2.375e+02 3.648e+02, threshold=4.325e+02, percent-clipped=0.0 2023-10-03 05:44:26,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:44:26,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1158253.3333333333, ans=0.0 2023-10-03 05:44:28,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:44:33,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 05:44:36,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:44:37,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:44:39,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1158320.0, ans=0.125 2023-10-03 05:44:41,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:44:41,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 05:44:42,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:44:44,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:44:44,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:44:48,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 05:44:49,781 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:44:52,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 05:44:53,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:44:53,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:44:56,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:44:59,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:01,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 05:45:02,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 05:45:05,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:09,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:45:11,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:45:14,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:45:14,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1158520.0, ans=0.0 2023-10-03 05:45:17,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 05:45:20,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 05:45:21,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1158520.0, ans=0.0 2023-10-03 05:45:23,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:45:23,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:45:25,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 05:45:29,740 INFO [train.py:1046] (3/4) Epoch 33, batch 3800, loss[loss=0.1864, simple_loss=0.2448, pruned_loss=0.064, over 19909.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2413, pruned_loss=0.04131, over 4732719.06 frames. ], batch size: 388, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:45:33,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:45:36,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:45:36,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 05:45:38,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 05:45:38,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:41,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:45:41,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:45:44,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 05:45:44,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:45:46,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:45:48,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:45:48,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:45:48,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:45:51,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 05:45:55,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 05:45:55,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:45:56,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:45:58,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1158720.0, ans=0.0 2023-10-03 05:45:59,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:45:59,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:46:02,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:46:02,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:46:03,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:04,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:46:09,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:46:09,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 05:46:10,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:46:18,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:46:22,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:46:25,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 05:46:26,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 05:46:27,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:46:29,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:46:29,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:31,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 05:46:35,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 05:46:35,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 05:46:35,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:36,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:46:42,949 INFO [train.py:1046] (3/4) Epoch 33, batch 3850, loss[loss=0.1418, simple_loss=0.2218, pruned_loss=0.03093, over 24617.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.24, pruned_loss=0.04138, over 4731038.46 frames. ], batch size: 60, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:46:43,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:46:44,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:46:49,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:46:50,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 05:46:51,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:46:51,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:46:54,125 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.876e+02 2.130e+02 2.387e+02 3.615e+02, threshold=4.261e+02, percent-clipped=0.0 2023-10-03 05:46:54,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 05:46:57,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:46:57,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1158986.6666666667, ans=0.0 2023-10-03 05:47:01,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 05:47:01,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 05:47:05,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1158986.6666666667, ans=0.2 2023-10-03 05:47:07,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:08,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:47:11,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:47:11,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:47:13,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:13,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:47:13,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1159053.3333333333, ans=0.125 2023-10-03 05:47:14,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:14,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:47:15,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:17,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:19,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:19,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:47:19,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 05:47:20,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 05:47:21,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1159053.3333333333, ans=0.125 2023-10-03 05:47:22,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:47:22,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:25,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:25,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:25,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 05:47:26,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 05:47:27,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:29,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 05:47:32,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 05:47:36,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:38,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:47:42,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:47:42,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 05:47:45,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 05:47:46,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:46,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:51,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:47:51,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 05:47:51,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:52,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:52,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:47:52,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 05:47:52,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:47:55,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 05:47:55,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:47:55,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:47:56,818 INFO [train.py:1046] (3/4) Epoch 33, batch 3900, loss[loss=0.1713, simple_loss=0.2586, pruned_loss=0.04198, over 24637.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2395, pruned_loss=0.04111, over 4724761.58 frames. ], batch size: 73, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:47:56,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:47:57,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1159253.3333333333, ans=0.2 2023-10-03 05:47:58,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:00,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:48:00,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:48:00,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:48:00,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:48:00,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 05:48:01,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:04,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:48:05,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:48:07,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:48:07,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:48:08,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:48:08,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:10,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:48:12,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 05:48:12,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:48:14,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 05:48:14,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:48:15,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1159320.0, ans=0.1 2023-10-03 05:48:16,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 05:48:16,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1159320.0, ans=0.1 2023-10-03 05:48:16,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1159320.0, ans=0.1 2023-10-03 05:48:19,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 05:48:24,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:48:25,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:48:25,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:48:25,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:48:30,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:48:31,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:48:35,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:48:35,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:48:35,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:48:41,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:48:42,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:48:42,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1159453.3333333333, ans=0.125 2023-10-03 05:48:48,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 05:48:50,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:49:00,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:49:02,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:49:02,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 05:49:04,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 05:49:04,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 05:49:04,730 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=10.17 vs. limit=15.0 2023-10-03 05:49:05,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 05:49:06,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:49:07,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 05:49:10,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1159586.6666666667, ans=0.125 2023-10-03 05:49:11,132 INFO [train.py:1046] (3/4) Epoch 33, batch 3950, loss[loss=0.1616, simple_loss=0.238, pruned_loss=0.04254, over 23282.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2393, pruned_loss=0.04088, over 4719637.22 frames. ], batch size: 119, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:49:14,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:49:15,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 05:49:17,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:49:17,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1159586.6666666667, ans=0.1 2023-10-03 05:49:19,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:49:19,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:49:21,203 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.827e+02 2.038e+02 2.266e+02 3.676e+02, threshold=4.077e+02, percent-clipped=0.0 2023-10-03 05:49:27,620 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 05:49:28,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:49:29,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 05:49:29,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1159653.3333333333, ans=0.125 2023-10-03 05:49:30,299 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 05:49:30,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:49:32,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.27 vs. limit=15.0 2023-10-03 05:49:33,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:49:33,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:49:33,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:49:35,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 05:49:35,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1159653.3333333333, ans=0.07 2023-10-03 05:49:36,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:49:38,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:49:38,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:49:39,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:49:39,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 05:49:39,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1159720.0, ans=0.125 2023-10-03 05:49:45,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1159720.0, ans=0.1 2023-10-03 05:49:47,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1159720.0, ans=0.0 2023-10-03 05:49:48,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:49:48,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:49:52,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 05:49:55,172 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.78 vs. limit=22.5 2023-10-03 05:49:59,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 05:49:59,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 05:50:00,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:50:00,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:50:00,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1159786.6666666667, ans=0.0 2023-10-03 05:50:08,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:50:08,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 05:50:08,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:50:08,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:50:08,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 05:50:12,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:50:14,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:50:18,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 05:50:18,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1159853.3333333333, ans=0.0 2023-10-03 05:50:25,605 INFO [train.py:1046] (3/4) Epoch 33, batch 4000, loss[loss=0.1806, simple_loss=0.2477, pruned_loss=0.05677, over 23783.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2401, pruned_loss=0.04107, over 4729497.00 frames. ], batch size: 179, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:50:28,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:50:35,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:50:42,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:50:42,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:50:42,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:50:44,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 05:50:44,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:50:46,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 05:50:46,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:50:46,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 05:50:47,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:50:48,154 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.75 vs. limit=6.0 2023-10-03 05:50:49,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1159986.6666666667, ans=0.125 2023-10-03 05:50:50,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:50:50,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:50:50,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:50:51,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:50:51,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 05:50:53,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:50:55,716 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 05:50:55,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:50:57,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:50:58,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1160053.3333333333, ans=0.125 2023-10-03 05:50:59,194 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 05:50:59,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:50:59,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:51:04,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 05:51:05,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:51:07,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:51:09,241 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 05:51:10,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:51:11,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 05:51:11,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:51:12,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:51:13,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:51:14,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:51:14,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:51:16,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:51:19,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 05:51:19,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:51:20,668 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 05:51:23,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 05:51:26,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 05:51:28,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:51:30,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:51:30,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:51:30,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1160186.6666666667, ans=0.125 2023-10-03 05:51:32,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:51:37,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:51:38,828 INFO [train.py:1046] (3/4) Epoch 33, batch 4050, loss[loss=0.1591, simple_loss=0.2523, pruned_loss=0.03293, over 24327.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2403, pruned_loss=0.04089, over 4738497.84 frames. ], batch size: 74, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:51:38,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 05:51:40,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 05:51:41,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:51:42,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:51:42,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 05:51:44,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:51:45,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:51:49,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:51:50,387 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.813e+02 1.958e+02 2.207e+02 3.335e+02, threshold=3.917e+02, percent-clipped=0.0 2023-10-03 05:51:53,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:51:53,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 05:51:54,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:51:54,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:51:58,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:52:00,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:52:05,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 05:52:05,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1160320.0, ans=0.0 2023-10-03 05:52:07,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 05:52:07,533 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 05:52:08,311 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.52 vs. limit=15.0 2023-10-03 05:52:10,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:52:14,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1160386.6666666667, ans=0.125 2023-10-03 05:52:17,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 05:52:18,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:52:22,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:52:25,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:52:25,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:52:25,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:52:29,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:52:32,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 05:52:33,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:52:35,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:52:36,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 05:52:40,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:52:46,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 05:52:46,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:52:46,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:52:46,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1160520.0, ans=0.125 2023-10-03 05:52:49,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 05:52:49,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 05:52:49,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:52:51,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:52:52,503 INFO [train.py:1046] (3/4) Epoch 33, batch 4100, loss[loss=0.168, simple_loss=0.2518, pruned_loss=0.04212, over 24023.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2411, pruned_loss=0.04098, over 4733465.20 frames. ], batch size: 80, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:52:52,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:52:52,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:52:58,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 05:52:59,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 05:53:02,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 05:53:02,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 05:53:02,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:53:04,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:05,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:05,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:53:07,071 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 05:53:09,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:53:11,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:53:11,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:53:12,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:53:14,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.54 vs. limit=15.0 2023-10-03 05:53:15,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 05:53:16,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:53:17,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:53:17,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 05:53:19,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:19,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:53:19,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:53:19,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:53:20,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 05:53:23,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:53:25,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 05:53:26,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:53:29,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:53:29,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 05:53:30,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:53:32,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:53:32,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 05:53:34,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 05:53:36,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 05:53:36,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 05:53:40,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 05:53:40,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:53:41,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:53:44,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:53:46,726 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.16 vs. limit=22.5 2023-10-03 05:53:48,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:53:50,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:53:51,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:53:57,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:53:57,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:54:02,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 05:54:04,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:54:06,926 INFO [train.py:1046] (3/4) Epoch 33, batch 4150, loss[loss=0.1636, simple_loss=0.2389, pruned_loss=0.04411, over 23810.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2416, pruned_loss=0.04107, over 4735353.41 frames. ], batch size: 195, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 05:54:07,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:54:09,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:54:11,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:54:11,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:54:14,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 05:54:14,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:54:15,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 05:54:16,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 05:54:16,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 05:54:17,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:54:17,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1160920.0, ans=0.0 2023-10-03 05:54:20,399 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.882e+02 2.112e+02 2.535e+02 4.235e+02, threshold=4.223e+02, percent-clipped=3.0 2023-10-03 05:54:21,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:54:21,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:54:25,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:54:25,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:54:26,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 05:54:29,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 05:54:29,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 05:54:29,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1160986.6666666667, ans=0.0 2023-10-03 05:54:30,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 05:54:34,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:54:38,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:54:40,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 05:54:41,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 05:54:41,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 05:54:43,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 05:54:43,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:54:43,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:54:43,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1161053.3333333333, ans=0.2 2023-10-03 05:54:43,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1161053.3333333333, ans=0.2 2023-10-03 05:54:46,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:54:47,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:54:50,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 05:54:53,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:54:54,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:54:56,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 05:54:57,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:54:57,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 05:55:00,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:55:00,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:55:01,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:55:03,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 05:55:03,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:03,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 05:55:04,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 05:55:04,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1161186.6666666667, ans=0.1 2023-10-03 05:55:09,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 05:55:09,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:55:09,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 05:55:11,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 05:55:11,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 05:55:11,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:55:11,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 05:55:12,122 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 05:55:13,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:55:14,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:55:14,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 05:55:14,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 05:55:19,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:55:21,536 INFO [train.py:1046] (3/4) Epoch 33, batch 4200, loss[loss=0.1619, simple_loss=0.2438, pruned_loss=0.04, over 23296.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2406, pruned_loss=0.0408, over 4738244.94 frames. ], batch size: 93, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:55:21,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 05:55:23,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:55:24,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:55:26,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 05:55:26,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:55:26,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:55:26,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1161253.3333333333, ans=0.0 2023-10-03 05:55:29,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 05:55:32,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 05:55:33,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:34,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:55:37,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:55:40,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 05:55:43,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:55:43,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:43,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 05:55:43,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 05:55:44,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:46,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:55:46,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 05:55:47,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 05:55:50,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 05:55:50,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:55:50,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1161386.6666666667, ans=0.0 2023-10-03 05:55:56,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 05:55:56,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:55:59,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:56:00,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:56:03,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:56:03,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 05:56:03,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:56:04,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:56:05,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1161453.3333333333, ans=0.125 2023-10-03 05:56:09,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 05:56:10,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:56:14,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1161453.3333333333, ans=0.125 2023-10-03 05:56:17,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:56:20,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 05:56:21,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:56:26,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 05:56:27,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:56:29,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 05:56:33,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 05:56:36,465 INFO [train.py:1046] (3/4) Epoch 33, batch 4250, loss[loss=0.1767, simple_loss=0.2536, pruned_loss=0.04992, over 23402.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2394, pruned_loss=0.04056, over 4731042.58 frames. ], batch size: 134, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:56:37,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 05:56:37,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 05:56:40,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:56:44,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 05:56:45,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 05:56:45,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:56:48,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:56:48,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1161586.6666666667, ans=0.125 2023-10-03 05:56:49,885 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.839e+02 2.000e+02 2.260e+02 3.065e+02, threshold=4.000e+02, percent-clipped=0.0 2023-10-03 05:56:51,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:56:56,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:56:56,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:56:57,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 05:56:57,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:56:58,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:57:00,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:57:01,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:57:04,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:57:04,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:57:06,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 05:57:08,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 05:57:10,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:57:12,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:57:12,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:57:13,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 05:57:13,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:57:13,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:57:17,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 05:57:17,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 05:57:23,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:57:24,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:57:25,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 05:57:25,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 05:57:26,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 05:57:28,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 05:57:29,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:57:31,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:57:31,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:57:33,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 05:57:34,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 05:57:35,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 05:57:38,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:57:38,958 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.39 vs. limit=22.5 2023-10-03 05:57:41,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:57:43,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 05:57:44,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:57:46,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:57:47,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:57:47,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:57:47,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 05:57:49,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:57:50,781 INFO [train.py:1046] (3/4) Epoch 33, batch 4300, loss[loss=0.1667, simple_loss=0.2588, pruned_loss=0.03732, over 24547.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2398, pruned_loss=0.04099, over 4730151.88 frames. ], batch size: 71, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:57:53,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1161920.0, ans=0.125 2023-10-03 05:57:54,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:57:54,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:57:55,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:58:04,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:58:04,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 05:58:05,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 05:58:06,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:58:06,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 05:58:06,995 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 05:58:10,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 05:58:11,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:58:14,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 05:58:14,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 05:58:14,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 05:58:17,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 05:58:18,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1161986.6666666667, ans=0.125 2023-10-03 05:58:19,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 05:58:22,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 05:58:22,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 05:58:23,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 05:58:25,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:58:26,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 05:58:26,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 05:58:28,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 05:58:30,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 05:58:33,243 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.71 vs. limit=15.0 2023-10-03 05:58:33,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:33,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 05:58:33,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:33,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 05:58:33,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 05:58:33,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 05:58:33,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 05:58:35,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:58:35,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 05:58:36,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 05:58:36,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1162120.0, ans=0.125 2023-10-03 05:58:36,965 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.44 vs. limit=12.0 2023-10-03 05:58:40,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:58:42,289 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 05:58:43,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 05:58:45,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:58:45,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:58:45,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1162120.0, ans=0.1 2023-10-03 05:58:47,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 05:58:48,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 05:58:48,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:50,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:58:50,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:58:50,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 05:58:53,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:58:56,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:58:56,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:58:56,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 05:59:02,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 05:59:02,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1162186.6666666667, ans=0.125 2023-10-03 05:59:03,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 05:59:04,719 INFO [train.py:1046] (3/4) Epoch 33, batch 4350, loss[loss=0.1635, simple_loss=0.2568, pruned_loss=0.03506, over 24594.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2399, pruned_loss=0.041, over 4717167.40 frames. ], batch size: 71, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 05:59:06,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:59:07,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:59:10,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 05:59:10,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 05:59:15,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 05:59:18,255 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.918e+02 2.252e+02 2.552e+02 4.017e+02, threshold=4.505e+02, percent-clipped=1.0 2023-10-03 05:59:19,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 05:59:19,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1162320.0, ans=0.125 2023-10-03 05:59:22,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1162320.0, ans=0.125 2023-10-03 05:59:23,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 05:59:23,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:59:25,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 05:59:27,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 05:59:28,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 05:59:33,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1162386.6666666667, ans=0.0 2023-10-03 05:59:34,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 05:59:34,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:59:35,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:59:38,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 05:59:42,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1162386.6666666667, ans=0.0 2023-10-03 05:59:43,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 05:59:45,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1162386.6666666667, ans=0.125 2023-10-03 05:59:46,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 05:59:47,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 05:59:48,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1162453.3333333333, ans=0.0 2023-10-03 05:59:51,995 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 05:59:52,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:59:52,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 05:59:54,033 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 05:59:54,091 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 05:59:54,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:59:55,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 05:59:55,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 05:59:56,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 05:59:58,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 05:59:58,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 05:59:59,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1162453.3333333333, ans=0.125 2023-10-03 06:00:02,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 06:00:02,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:02,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:00:02,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:02,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 06:00:04,986 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 06:00:04,990 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 06:00:05,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 06:00:07,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:00:07,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:00:07,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:09,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:00:10,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 06:00:13,356 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 06:00:13,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:13,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff2.min_abs, batch_count=1162520.0, ans=0.1 2023-10-03 06:00:17,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:00:17,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:18,442 INFO [train.py:1046] (3/4) Epoch 33, batch 4400, loss[loss=0.166, simple_loss=0.2518, pruned_loss=0.04005, over 23723.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2407, pruned_loss=0.04151, over 4722104.29 frames. ], batch size: 85, lr: 3.08e-03, grad_scale: 16.0 2023-10-03 06:00:19,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:00:21,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 06:00:23,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 06:00:23,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 06:00:23,217 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 06:00:24,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:00:24,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:00:27,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 06:00:28,732 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.68 vs. limit=6.0 2023-10-03 06:00:29,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:30,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:00:30,507 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 06:00:33,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1162653.3333333333, ans=0.0 2023-10-03 06:00:34,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:34,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 06:00:36,120 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 06:00:39,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 06:00:40,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 06:00:41,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 06:00:41,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:00:41,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:00:43,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:00:43,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:00:46,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 06:00:46,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 06:00:46,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:48,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:00:48,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:00:51,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:00:51,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:00:51,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 06:00:51,497 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 06:00:56,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:02,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:01:04,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 06:01:07,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:01:07,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1162786.6666666667, ans=0.2 2023-10-03 06:01:08,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:01:11,071 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.89 vs. limit=15.0 2023-10-03 06:01:11,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:01:11,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 06:01:11,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:01:11,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:01:11,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:01:13,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:01:18,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 06:01:19,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1162853.3333333333, ans=0.07 2023-10-03 06:01:20,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 06:01:21,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 06:01:21,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:01:21,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 06:01:21,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:01:21,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1162853.3333333333, ans=0.1 2023-10-03 06:01:26,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:01:28,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 06:01:30,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:01:33,222 INFO [train.py:1046] (3/4) Epoch 33, batch 4450, loss[loss=0.1774, simple_loss=0.2536, pruned_loss=0.05055, over 23483.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2417, pruned_loss=0.0418, over 4730636.05 frames. ], batch size: 285, lr: 3.08e-03, grad_scale: 8.0 2023-10-03 06:01:33,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:33,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:01:40,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:01:41,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:01:43,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:44,111 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.22 vs. limit=8.0 2023-10-03 06:01:45,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:01:47,589 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.835e+02 1.974e+02 2.226e+02 3.952e+02, threshold=3.948e+02, percent-clipped=0.0 2023-10-03 06:01:49,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:01:51,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:01:52,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 06:01:52,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:01:53,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:01:53,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:01:53,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:01:55,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:01:57,217 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.39 vs. limit=15.0 2023-10-03 06:01:59,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:01:59,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:02:01,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:02:01,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:02:02,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:02:02,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1163053.3333333333, ans=0.2 2023-10-03 06:02:07,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 06:02:08,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 06:02:08,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 06:02:08,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:02:11,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:02:12,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 06:02:13,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1163053.3333333333, ans=0.1 2023-10-03 06:02:16,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:02:22,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:02:22,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 06:02:22,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:02:22,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:02:22,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:02:22,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:02:24,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1163120.0, ans=0.0 2023-10-03 06:02:25,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:02:28,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:02:29,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 06:02:31,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:02:32,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:02:34,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1163186.6666666667, ans=0.125 2023-10-03 06:02:34,765 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.74 vs. limit=12.0 2023-10-03 06:02:35,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:02:35,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1163186.6666666667, ans=0.0 2023-10-03 06:02:36,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:02:36,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 06:02:40,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:02:43,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 06:02:44,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:02:47,078 INFO [train.py:1046] (3/4) Epoch 33, batch 4500, loss[loss=0.1734, simple_loss=0.2378, pruned_loss=0.05452, over 23750.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2424, pruned_loss=0.04179, over 4730007.58 frames. ], batch size: 232, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:02:48,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:02:50,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 06:02:50,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 06:02:50,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1163253.3333333333, ans=0.125 2023-10-03 06:02:53,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:02:57,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1163253.3333333333, ans=0.125 2023-10-03 06:02:58,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:02:58,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:03:00,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:03:00,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:03:00,979 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.87 vs. limit=22.5 2023-10-03 06:03:01,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:03:01,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:03:01,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1163320.0, ans=0.125 2023-10-03 06:03:03,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1163320.0, ans=0.125 2023-10-03 06:03:13,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:03:13,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:03:15,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:03:16,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:03:18,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:03:23,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:03:23,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1163386.6666666667, ans=0.0 2023-10-03 06:03:28,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:03:32,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:03:35,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1163453.3333333333, ans=0.125 2023-10-03 06:03:36,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:03:36,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 06:03:38,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:03:39,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:03:39,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:03:41,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:03:42,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:03:42,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 06:03:42,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:03:42,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:03:50,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:03:50,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:03:51,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:03:54,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:03:54,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:03:54,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 06:03:58,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 06:03:58,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 06:04:00,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 06:04:02,236 INFO [train.py:1046] (3/4) Epoch 33, batch 4550, loss[loss=0.1656, simple_loss=0.2339, pruned_loss=0.04867, over 23903.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2409, pruned_loss=0.04176, over 4725563.60 frames. ], batch size: 195, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:04:05,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 06:04:05,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:04:09,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:04:09,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:04:11,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:04:15,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:04:16,478 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.864e+02 2.150e+02 2.511e+02 3.546e+02, threshold=4.299e+02, percent-clipped=0.0 2023-10-03 06:04:16,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:04:18,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:04:18,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:04:18,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:20,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1163653.3333333333, ans=0.125 2023-10-03 06:04:21,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:04:21,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:04:23,944 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.07 vs. limit=22.5 2023-10-03 06:04:24,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:04:27,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 06:04:27,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 06:04:29,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:04:30,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 06:04:33,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 06:04:33,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:04:37,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 06:04:39,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:04:41,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:43,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:43,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:04:44,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 06:04:46,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1163786.6666666667, ans=0.05 2023-10-03 06:04:47,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:04:49,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:50,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:04:50,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:04:52,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 06:04:52,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 06:04:53,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:04:53,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 06:04:54,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1163786.6666666667, ans=0.0 2023-10-03 06:04:55,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 06:04:55,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:04:57,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:04:57,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:04:58,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:04:58,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:04:59,449 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.60 vs. limit=15.0 2023-10-03 06:05:00,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:05:00,538 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.74 vs. limit=6.0 2023-10-03 06:05:01,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 06:05:03,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:05:03,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 06:05:04,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 06:05:04,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:05:04,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 06:05:07,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:05:08,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:05:10,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:05:10,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:05:10,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:05:11,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:05:13,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:05:15,548 INFO [train.py:1046] (3/4) Epoch 33, batch 4600, loss[loss=0.1546, simple_loss=0.227, pruned_loss=0.04108, over 23321.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2398, pruned_loss=0.04143, over 4726718.70 frames. ], batch size: 285, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:05:16,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:17,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:05:19,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:05:19,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:05:20,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:05:21,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 06:05:23,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:05:29,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:05:29,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:05:32,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:38,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 06:05:40,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:42,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:45,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:05:45,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:05:51,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 06:05:51,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:05:51,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:05:57,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:05:57,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:05:59,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:06:03,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 06:06:04,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:06:07,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:09,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:06:10,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:10,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 06:06:12,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:06:12,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 06:06:12,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:13,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:06:15,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:15,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:06:16,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:06:16,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 06:06:17,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 06:06:18,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 06:06:18,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:06:20,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:06:21,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:06:23,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:06:30,852 INFO [train.py:1046] (3/4) Epoch 33, batch 4650, loss[loss=0.1768, simple_loss=0.2507, pruned_loss=0.05147, over 23808.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2394, pruned_loss=0.04107, over 4722356.28 frames. ], batch size: 164, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:06:34,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:06:36,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:06:36,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:06:37,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:06:38,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:06:38,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:06:39,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:06:42,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 06:06:45,057 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.854e+02 2.077e+02 2.382e+02 3.489e+02, threshold=4.154e+02, percent-clipped=0.0 2023-10-03 06:06:45,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:06:45,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1164320.0, ans=0.0 2023-10-03 06:06:46,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 06:06:47,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:06:48,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1164320.0, ans=0.125 2023-10-03 06:06:49,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 06:06:49,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:06:50,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 06:06:50,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 06:06:50,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:06:50,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:06:55,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:06:57,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:06:57,295 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 06:06:59,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:07:00,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 06:07:03,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:07:03,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:07:04,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 06:07:04,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:07:09,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:07:09,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1164386.6666666667, ans=0.125 2023-10-03 06:07:12,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:07:15,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:07:16,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:07:18,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:07:18,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:07:21,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 06:07:21,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 06:07:22,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 06:07:22,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 06:07:24,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:07:27,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1164453.3333333333, ans=0.125 2023-10-03 06:07:30,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:07:30,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:07:31,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 06:07:31,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:07:33,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:07:33,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:07:36,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:07:37,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:07:37,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:07:39,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:07:41,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1164520.0, ans=0.125 2023-10-03 06:07:42,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:07:42,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:07:43,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:07:43,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 06:07:44,800 INFO [train.py:1046] (3/4) Epoch 33, batch 4700, loss[loss=0.1641, simple_loss=0.253, pruned_loss=0.03762, over 24331.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2401, pruned_loss=0.04133, over 4727159.18 frames. ], batch size: 74, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:07:44,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:07:45,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1164586.6666666667, ans=0.2 2023-10-03 06:07:46,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 06:07:55,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:07:55,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:07:56,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:07:57,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:08:00,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:08:05,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 06:08:05,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 06:08:06,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1164653.3333333333, ans=0.1 2023-10-03 06:08:09,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:08:09,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:08:09,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:08:14,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:08:18,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:08:20,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 06:08:23,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:08:27,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1164786.6666666667, ans=0.125 2023-10-03 06:08:29,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 06:08:29,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1164786.6666666667, ans=0.0 2023-10-03 06:08:30,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:08:32,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:36,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 06:08:37,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:08:43,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:08:43,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 06:08:43,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:44,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:08:46,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:08:47,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:08:47,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 06:08:47,653 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 06:08:48,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:08:50,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:50,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:50,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 06:08:50,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1164853.3333333333, ans=0.2 2023-10-03 06:08:51,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:08:51,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1164853.3333333333, ans=0.125 2023-10-03 06:08:52,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.20 vs. limit=10.0 2023-10-03 06:08:56,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 06:08:58,780 INFO [train.py:1046] (3/4) Epoch 33, batch 4750, loss[loss=0.1559, simple_loss=0.2331, pruned_loss=0.03931, over 23431.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.241, pruned_loss=0.04163, over 4720993.65 frames. ], batch size: 119, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:08:58,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:09:00,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:02,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1164920.0, ans=0.125 2023-10-03 06:09:05,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:05,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:09:06,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 06:09:08,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:09:09,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 06:09:12,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:09:13,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:09:14,314 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.898e+02 2.045e+02 2.268e+02 3.285e+02, threshold=4.090e+02, percent-clipped=0.0 2023-10-03 06:09:14,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:09:19,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 06:09:25,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:09:26,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 06:09:28,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:09:31,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:09:31,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:09:32,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:34,291 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 06:09:34,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 06:09:38,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 06:09:40,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:09:42,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:09:45,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:09:45,004 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 06:09:45,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:09:45,511 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.75 vs. limit=6.0 2023-10-03 06:09:48,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:09:49,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1165120.0, ans=0.125 2023-10-03 06:09:51,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:09:52,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 06:09:52,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 06:09:52,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:09:52,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:09:54,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:09:55,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 06:09:55,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 06:09:58,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 06:10:00,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:02,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:10:02,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 06:10:02,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:10:04,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:07,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:10:07,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:07,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:10:11,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:10:11,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 06:10:13,188 INFO [train.py:1046] (3/4) Epoch 33, batch 4800, loss[loss=0.1745, simple_loss=0.2479, pruned_loss=0.05052, over 22649.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2416, pruned_loss=0.04187, over 4715032.11 frames. ], batch size: 322, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:10:13,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 06:10:14,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 06:10:17,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:10:17,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:10:19,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 06:10:24,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:24,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:29,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:10:29,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:10:29,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:31,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 06:10:31,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:10:32,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:10:33,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:10:36,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1165320.0, ans=0.0 2023-10-03 06:10:37,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:10:38,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:38,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:10:41,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:41,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 06:10:41,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:43,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:10:44,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:10:47,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:49,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:10:49,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:10:49,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 06:10:50,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:52,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 06:10:52,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 06:10:54,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:10:54,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:10:55,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:10:55,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:10:55,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:10:56,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:10:58,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:11:02,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:11:05,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:05,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:09,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 06:11:09,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:11:09,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:11,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:11:11,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:11:15,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:11:17,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:11:17,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:18,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:11:18,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:11:20,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:11:24,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:24,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:24,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:11:25,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 06:11:27,683 INFO [train.py:1046] (3/4) Epoch 33, batch 4850, loss[loss=0.1606, simple_loss=0.2376, pruned_loss=0.04181, over 23202.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2414, pruned_loss=0.04185, over 4716862.97 frames. ], batch size: 93, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:11:29,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 06:11:29,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:11:29,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:11:29,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:11:29,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:32,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:11:38,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 06:11:39,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:42,839 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.890e+02 2.112e+02 2.331e+02 3.669e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 06:11:44,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:11:45,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:11:45,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:11:50,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:11:50,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:11:51,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:11:51,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 06:11:54,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:11:57,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:11:59,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:12:00,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:12:00,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 06:12:01,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:12:01,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:06,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:06,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 06:12:06,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 06:12:07,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:12:15,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:12:15,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 06:12:17,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:12:17,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:12:18,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:12:18,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 06:12:18,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:20,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 06:12:20,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:12:21,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:12:24,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 06:12:30,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:12:35,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:12:35,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:12:35,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1165853.3333333333, ans=0.0 2023-10-03 06:12:35,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1165853.3333333333, ans=0.125 2023-10-03 06:12:40,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 06:12:40,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:12:42,282 INFO [train.py:1046] (3/4) Epoch 33, batch 4900, loss[loss=0.1593, simple_loss=0.2435, pruned_loss=0.03748, over 24541.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2403, pruned_loss=0.04186, over 4702435.50 frames. ], batch size: 66, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:12:47,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:12:48,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:12:48,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:12:48,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.38 vs. limit=15.0 2023-10-03 06:12:52,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 06:12:55,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1165986.6666666667, ans=0.125 2023-10-03 06:12:56,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 06:13:01,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 06:13:01,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 06:13:01,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:13:01,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:13:02,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:13:02,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:13:02,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:13:02,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 06:13:05,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 06:13:06,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:13:08,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:13:10,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:13:11,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:13:12,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:13:14,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:13:14,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 06:13:18,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:13:18,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:13:18,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 06:13:18,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 06:13:19,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1166053.3333333333, ans=0.125 2023-10-03 06:13:21,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 06:13:23,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:13:23,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:13:23,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:13:25,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:13:25,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 06:13:25,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:13:25,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 06:13:29,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:13:32,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:13:35,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:13:38,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 06:13:38,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:13:39,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 06:13:40,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 06:13:46,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:13:46,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:13:48,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 06:13:48,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:13:48,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1166186.6666666667, ans=0.125 2023-10-03 06:13:49,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:13:51,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:13:55,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:13:55,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:13:55,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:13:55,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 06:13:56,760 INFO [train.py:1046] (3/4) Epoch 33, batch 4950, loss[loss=0.1528, simple_loss=0.2215, pruned_loss=0.04201, over 23457.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2388, pruned_loss=0.04131, over 4710508.73 frames. ], batch size: 285, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:13:56,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:13:59,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:13:59,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:14:01,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 06:14:03,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 06:14:03,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:14:04,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 06:14:04,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:04,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:14:04,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:14:04,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:07,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:14:07,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:14:10,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:14:10,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:14:11,694 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.885e+02 2.044e+02 2.302e+02 2.905e+02, threshold=4.089e+02, percent-clipped=0.0 2023-10-03 06:14:11,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:13,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:14:15,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:14:19,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.87 vs. limit=15.0 2023-10-03 06:14:19,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:21,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:14:24,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:24,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:24,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1166320.0, ans=0.0 2023-10-03 06:14:25,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:14:27,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 06:14:27,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 06:14:29,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:31,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:14:31,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:14:32,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:14:32,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:14:34,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:14:35,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:14:37,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:14:37,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1166386.6666666667, ans=0.125 2023-10-03 06:14:39,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:14:40,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:14:40,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:41,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 06:14:41,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:14:43,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:14:49,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:14:51,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:14:51,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:14:51,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:14:51,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:14:52,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:14:53,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:14:54,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1166453.3333333333, ans=0.1 2023-10-03 06:14:55,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:14:55,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:14:56,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 06:14:56,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1166520.0, ans=0.1 2023-10-03 06:15:00,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:15:04,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 06:15:05,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 06:15:06,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.18 vs. limit=15.0 2023-10-03 06:15:10,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:15:11,571 INFO [train.py:1046] (3/4) Epoch 33, batch 5000, loss[loss=0.1642, simple_loss=0.2441, pruned_loss=0.04215, over 24453.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2385, pruned_loss=0.04143, over 4704916.97 frames. ], batch size: 63, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:15:11,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:15:13,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 06:15:14,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 06:15:16,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:15:17,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 06:15:17,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:15:17,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:15:17,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1166586.6666666667, ans=0.1 2023-10-03 06:15:19,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 06:15:19,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:15:19,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:15:20,098 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.86 vs. limit=22.5 2023-10-03 06:15:20,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 06:15:20,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:15:20,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:15:23,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 06:15:23,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 06:15:23,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:15:25,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 06:15:25,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:15:25,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:25,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:15:26,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 06:15:26,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 06:15:27,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 06:15:29,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:15:30,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:32,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 06:15:34,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:15:34,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1166653.3333333333, ans=0.1 2023-10-03 06:15:35,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:36,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:15:37,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 06:15:38,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 06:15:38,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:15:40,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:15:45,908 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 06:15:46,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1166720.0, ans=0.0 2023-10-03 06:15:49,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:15:50,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:15:50,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:15:55,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 06:15:55,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:15:56,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:15:56,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:15:58,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 06:15:58,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:16:01,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:16:01,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:16:07,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 06:16:11,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:21,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:16:23,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:23,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:16:23,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:16:23,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:16:25,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:16:25,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:26,417 INFO [train.py:1046] (3/4) Epoch 33, batch 5050, loss[loss=0.1481, simple_loss=0.2233, pruned_loss=0.03646, over 24290.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2395, pruned_loss=0.04137, over 4725032.88 frames. ], batch size: 56, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:16:27,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:16:27,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 06:16:29,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:16:30,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:16:32,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:16:32,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 06:16:35,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.21 vs. limit=22.5 2023-10-03 06:16:35,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:16:35,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:16:36,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:16:38,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:16:39,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:16:40,924 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.953e+02 2.133e+02 2.408e+02 3.808e+02, threshold=4.267e+02, percent-clipped=0.0 2023-10-03 06:16:47,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 06:16:48,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:16:48,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:16:49,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 06:16:49,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:16:51,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:16:51,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:16:52,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:16:52,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 06:16:54,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 06:16:55,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:16:58,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:16:58,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1167053.3333333333, ans=0.125 2023-10-03 06:17:01,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:17:01,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 06:17:02,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:17:05,001 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.93 vs. limit=22.5 2023-10-03 06:17:06,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 06:17:07,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:17:07,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:17:08,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:17:08,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:17:11,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:17:14,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:17:14,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:14,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:17:16,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:17:16,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 06:17:17,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:17:18,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:17:21,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:17:21,786 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 06:17:21,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:17:23,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:17:24,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:24,548 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 06:17:29,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:17:29,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 06:17:29,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:31,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:17:32,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1167186.6666666667, ans=0.125 2023-10-03 06:17:33,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:17:33,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 06:17:34,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 06:17:34,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1167186.6666666667, ans=0.125 2023-10-03 06:17:37,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:17:37,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:17:37,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:17:39,171 INFO [train.py:1046] (3/4) Epoch 33, batch 5100, loss[loss=0.171, simple_loss=0.2452, pruned_loss=0.04844, over 23765.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2406, pruned_loss=0.04195, over 4727810.19 frames. ], batch size: 164, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:17:40,646 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 06:17:42,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:17:45,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 06:17:45,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 06:17:45,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1167253.3333333333, ans=0.125 2023-10-03 06:17:45,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1167253.3333333333, ans=0.0 2023-10-03 06:17:46,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:17:47,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:17:49,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:17:49,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 06:17:50,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 06:17:53,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1167320.0, ans=0.125 2023-10-03 06:17:54,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:17:54,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:18:00,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:18:01,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 06:18:02,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:18:04,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:18:04,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 06:18:06,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:08,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:08,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 06:18:12,007 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 06:18:13,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:14,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 06:18:14,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 06:18:18,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:18:26,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:18:28,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 06:18:28,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 06:18:28,815 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 06:18:30,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1167453.3333333333, ans=0.125 2023-10-03 06:18:32,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 06:18:32,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:18:33,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 06:18:37,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 06:18:39,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 06:18:41,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:18:44,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 06:18:44,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:18:45,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 06:18:50,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:18:50,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:18:50,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:18:51,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:18:51,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:18:52,745 INFO [train.py:1046] (3/4) Epoch 33, batch 5150, loss[loss=0.1632, simple_loss=0.2477, pruned_loss=0.03936, over 23299.00 frames. ], tot_loss[loss=0.1632, simple_loss=0.2419, pruned_loss=0.04228, over 4719861.35 frames. ], batch size: 93, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:18:52,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:18:54,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 06:18:54,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 06:18:55,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 06:18:55,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:18:55,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 06:18:58,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:18:58,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 06:18:59,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:19:01,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:19:01,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1167586.6666666667, ans=0.0 2023-10-03 06:19:06,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:19:06,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 06:19:07,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:19:08,862 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.881e+02 2.011e+02 2.161e+02 3.119e+02, threshold=4.022e+02, percent-clipped=0.0 2023-10-03 06:19:08,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:19:09,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:19:09,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:19:09,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:19:10,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:19:10,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:19:10,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 06:19:13,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:19:13,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:19:13,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1167653.3333333333, ans=0.125 2023-10-03 06:19:15,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:19:16,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 06:19:17,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:19:23,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:19:24,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 06:19:26,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1167720.0, ans=0.125 2023-10-03 06:19:27,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:19:34,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1167720.0, ans=0.125 2023-10-03 06:19:35,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:19:36,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:19:39,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:19:41,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:19:43,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 06:19:43,710 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.25 vs. limit=15.0 2023-10-03 06:19:47,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:19:49,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:19:49,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:19:53,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:19:53,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:19:54,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 06:19:58,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:19:58,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:20:00,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:20:01,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:20:01,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:20:02,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1167853.3333333333, ans=0.1 2023-10-03 06:20:03,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:20:03,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:20:03,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:20:06,128 INFO [train.py:1046] (3/4) Epoch 33, batch 5200, loss[loss=0.2065, simple_loss=0.2741, pruned_loss=0.06948, over 19765.00 frames. ], tot_loss[loss=0.1641, simple_loss=0.2427, pruned_loss=0.04279, over 4705931.41 frames. ], batch size: 388, lr: 3.07e-03, grad_scale: 16.0 2023-10-03 06:20:06,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1167920.0, ans=0.125 2023-10-03 06:20:07,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:20:09,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:20:11,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1167920.0, ans=0.0 2023-10-03 06:20:12,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:16,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 06:20:16,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:20:17,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1167920.0, ans=0.0 2023-10-03 06:20:18,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:20:21,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:22,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:20:22,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:20:23,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 06:20:25,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:20:25,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:20:25,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1167986.6666666667, ans=0.0 2023-10-03 06:20:28,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 06:20:29,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:20:32,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:20:32,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 06:20:34,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 06:20:36,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 06:20:38,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:20:38,076 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 06:20:38,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:20:38,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:20:39,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:20:39,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 06:20:41,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:20:44,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:46,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 06:20:46,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 06:20:46,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 06:20:50,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 06:20:50,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1168120.0, ans=0.0 2023-10-03 06:20:52,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:20:56,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:20:57,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:20:58,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 06:20:58,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:20:58,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:20:58,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:20:59,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:21:01,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1168120.0, ans=0.07 2023-10-03 06:21:03,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:21:04,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:21:09,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:21:10,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:21:10,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:21:15,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:21:15,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 06:21:17,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:21:17,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:21:17,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1168186.6666666667, ans=0.04949747468305833 2023-10-03 06:21:18,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:21:19,727 INFO [train.py:1046] (3/4) Epoch 33, batch 5250, loss[loss=0.1564, simple_loss=0.2435, pruned_loss=0.03467, over 24523.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2416, pruned_loss=0.0423, over 4692292.46 frames. ], batch size: 71, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:21:19,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:21:19,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:21:22,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:21:25,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:21:26,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:21:28,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:21:32,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:21:33,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:21:35,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:21:36,477 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.883e+02 2.112e+02 2.383e+02 3.529e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 06:21:36,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:21:36,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1168320.0, ans=0.015 2023-10-03 06:21:40,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 06:21:40,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:21:41,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:21:44,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.98 vs. limit=22.5 2023-10-03 06:21:46,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1168320.0, ans=0.0 2023-10-03 06:21:47,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1168320.0, ans=0.2 2023-10-03 06:21:48,769 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.71 vs. limit=6.0 2023-10-03 06:21:49,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1168386.6666666667, ans=0.0 2023-10-03 06:22:01,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1168453.3333333333, ans=0.035 2023-10-03 06:22:12,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.whiten.whitening_limit, batch_count=1168453.3333333333, ans=15.0 2023-10-03 06:22:27,875 INFO [train.py:1046] (3/4) Epoch 33, batch 5300, loss[loss=0.1428, simple_loss=0.1925, pruned_loss=0.04657, over 19204.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2406, pruned_loss=0.0418, over 4700785.84 frames. ], batch size: 388, lr: 3.07e-03, grad_scale: 8.0 2023-10-03 06:22:41,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1168653.3333333333, ans=0.1 2023-10-03 06:22:42,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:22:42,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 06:22:42,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 06:22:42,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:22:43,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:43,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:43,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:43,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:22:43,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:22:43,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:22:43,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:22:43,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:22:43,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 06:22:43,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 06:22:43,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 06:22:43,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:22:43,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 06:22:44,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 06:22:44,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:44,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:22:44,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:22:44,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:22:44,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:22:45,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:22:45,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:22:45,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:45,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:22:45,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:22:45,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:22:45,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:45,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:22:46,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 06:22:46,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:22:46,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:22:46,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 06:22:46,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 06:22:46,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:22:46,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:22:46,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 06:22:46,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 06:22:46,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:22:47,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:22:47,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:22:47,872 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 06:22:47,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 06:22:47,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:22:48,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:22:48,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 06:22:48,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 06:22:48,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 06:22:48,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:22:54,403 INFO [train.py:1046] (3/4) Epoch 34, batch 0, loss[loss=0.164, simple_loss=0.2414, pruned_loss=0.04328, over 23762.00 frames. ], tot_loss[loss=0.164, simple_loss=0.2414, pruned_loss=0.04328, over 23762.00 frames. ], batch size: 179, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:22:54,403 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 06:23:07,172 INFO [train.py:1078] (3/4) Epoch 34, validation: loss=0.3345, simple_loss=0.2716, pruned_loss=0.1987, over 1125622.00 frames. 2023-10-03 06:23:07,173 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 06:23:11,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 06:23:11,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1168666.6666666667, ans=0.125 2023-10-03 06:23:13,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:23:14,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:23:19,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:23:19,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:23:19,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:20,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 06:23:22,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 06:23:24,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:24,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:27,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:23:27,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:23:27,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:23:27,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:23:29,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 06:23:30,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:23:37,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:23:37,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:23:41,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 06:23:45,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:23:45,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:23:48,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:23:52,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:23:55,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1168866.6666666667, ans=0.1 2023-10-03 06:23:57,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:01,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 06:24:04,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 06:24:06,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:24:06,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:24:06,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:24:06,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:24:07,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 06:24:10,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:24:11,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:24:14,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:24:16,926 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 06:24:18,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:24:21,497 INFO [train.py:1046] (3/4) Epoch 34, batch 50, loss[loss=0.1516, simple_loss=0.2274, pruned_loss=0.03793, over 23604.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2416, pruned_loss=0.04145, over 1064800.23 frames. ], batch size: 52, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:24:22,871 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.913e+02 2.173e+02 2.528e+02 6.265e+02, threshold=4.345e+02, percent-clipped=6.0 2023-10-03 06:24:22,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:24:24,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:24:24,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 06:24:25,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:24:25,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:24:27,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:24:28,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:24:31,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:24:32,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 06:24:32,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:38,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:24:40,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 06:24:41,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 06:24:43,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:24:44,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:24:44,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:46,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:24:46,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:24:47,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:24:47,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:24:55,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:24:58,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:24:58,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:24:59,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 06:25:01,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:25:01,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:25:01,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 06:25:01,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1169133.3333333333, ans=0.0 2023-10-03 06:25:02,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:25:04,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 06:25:09,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1169200.0, ans=0.1 2023-10-03 06:25:10,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:25:12,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:25:13,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:25:14,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:25:14,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:25:15,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1169200.0, ans=0.0 2023-10-03 06:25:17,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 06:25:17,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 06:25:19,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:25:19,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:25:20,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:25:22,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:25:22,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 06:25:23,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 06:25:25,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 06:25:25,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:26,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:25:27,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 06:25:27,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 06:25:29,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:30,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:25:30,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:25:31,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:25:33,330 INFO [train.py:1046] (3/4) Epoch 34, batch 100, loss[loss=0.1662, simple_loss=0.2435, pruned_loss=0.0444, over 23466.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2424, pruned_loss=0.04168, over 1861387.71 frames. ], batch size: 134, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:25:33,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:25:36,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:25:38,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:25:40,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 06:25:40,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:25:42,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1169333.3333333333, ans=0.0 2023-10-03 06:25:45,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:25:46,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:25:46,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:25:46,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:25:46,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:25:48,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 06:25:49,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:25:49,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1169400.0, ans=0.1 2023-10-03 06:25:51,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:51,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:25:51,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:25:55,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 06:25:57,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:25:58,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:25:58,749 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:25:59,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:26:00,520 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.17 vs. limit=15.0 2023-10-03 06:26:01,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:26:05,439 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 06:26:05,454 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 06:26:06,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:26:06,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:26:09,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:26:11,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:26:14,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:17,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:18,370 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 06:26:21,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 06:26:23,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:26:23,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:26:27,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:28,938 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.58 vs. limit=15.0 2023-10-03 06:26:29,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:32,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:26:34,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:26:35,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1169600.0, ans=0.0 2023-10-03 06:26:36,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:36,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:26:38,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:38,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:26:38,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:26:39,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 06:26:39,513 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 06:26:39,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:40,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:26:42,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:26:42,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:26:42,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 06:26:43,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:26:43,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:26:43,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:26:44,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:26:45,800 INFO [train.py:1046] (3/4) Epoch 34, batch 150, loss[loss=0.1673, simple_loss=0.251, pruned_loss=0.04177, over 24095.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2421, pruned_loss=0.04157, over 2510161.24 frames. ], batch size: 80, lr: 3.02e-03, grad_scale: 4.0 2023-10-03 06:26:45,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:26:45,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:26:45,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:26:48,605 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.899e+02 2.057e+02 2.467e+02 3.842e+02, threshold=4.114e+02, percent-clipped=0.0 2023-10-03 06:26:50,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:26:52,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:26:52,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:26:52,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:26:56,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:26:56,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:00,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:27:01,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:04,768 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.30 vs. limit=15.0 2023-10-03 06:27:05,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 06:27:05,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 06:27:05,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 06:27:08,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:27:08,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:27:09,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:27:09,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:27:09,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:27:11,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:11,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:12,516 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 06:27:15,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:27:20,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:27:23,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:27:24,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 06:27:26,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:27:26,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:27:26,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:27:27,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:27:29,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:27:30,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:27:32,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:27:32,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 06:27:37,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:27:39,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:27:39,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:27:39,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:27:39,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1169866.6666666667, ans=0.125 2023-10-03 06:27:42,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:27:45,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 06:27:47,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:27:47,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:27:47,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1169933.3333333333, ans=0.125 2023-10-03 06:27:49,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:27:51,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:27:51,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 06:27:51,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:27:51,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1169933.3333333333, ans=0.1 2023-10-03 06:27:52,535 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.40 vs. limit=10.0 2023-10-03 06:27:53,212 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 06:27:54,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:27:58,630 INFO [train.py:1046] (3/4) Epoch 34, batch 200, loss[loss=0.1766, simple_loss=0.252, pruned_loss=0.0506, over 22775.00 frames. ], tot_loss[loss=0.1631, simple_loss=0.2431, pruned_loss=0.04161, over 3010027.13 frames. ], batch size: 322, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:27:58,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:27:58,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:28:00,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 06:28:00,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1170000.0, ans=0.125 2023-10-03 06:28:02,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:28:03,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:04,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 06:28:05,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1170000.0, ans=0.025 2023-10-03 06:28:06,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:28:07,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:08,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:28:09,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1170000.0, ans=0.125 2023-10-03 06:28:12,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:28:13,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:28:13,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:13,838 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:28:21,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1170066.6666666667, ans=0.0 2023-10-03 06:28:21,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1170066.6666666667, ans=0.2 2023-10-03 06:28:28,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1170133.3333333333, ans=0.04949747468305833 2023-10-03 06:28:31,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:28:33,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:28:33,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:28:34,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:28:34,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 06:28:34,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:28:35,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:28:37,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:28:38,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:28:38,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:28:40,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 06:28:40,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 06:28:40,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:28:41,004 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.69 vs. limit=22.5 2023-10-03 06:28:44,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:28:46,279 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.87 vs. limit=15.0 2023-10-03 06:28:50,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:28:57,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:28:57,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:29:01,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1170266.6666666667, ans=0.2 2023-10-03 06:29:02,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:04,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 06:29:05,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:29:07,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:29:07,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:29:08,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:29:08,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 06:29:11,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:29:11,373 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 06:29:11,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1170333.3333333333, ans=0.125 2023-10-03 06:29:12,625 INFO [train.py:1046] (3/4) Epoch 34, batch 250, loss[loss=0.1649, simple_loss=0.2448, pruned_loss=0.04252, over 23465.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2425, pruned_loss=0.04135, over 3394089.61 frames. ], batch size: 93, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:29:12,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1170333.3333333333, ans=0.1 2023-10-03 06:29:14,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:15,952 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.885e+02 2.083e+02 2.421e+02 4.173e+02, threshold=4.166e+02, percent-clipped=1.0 2023-10-03 06:29:17,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:29:19,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:19,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:29:22,238 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.23 vs. limit=10.0 2023-10-03 06:29:22,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:29:22,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:29:23,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:29:25,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1170333.3333333333, ans=0.125 2023-10-03 06:29:26,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:29:32,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1170400.0, ans=0.0 2023-10-03 06:29:34,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1170400.0, ans=0.1 2023-10-03 06:29:36,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:29:38,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:29:38,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:29:38,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1170400.0, ans=0.0 2023-10-03 06:29:45,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:29:47,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:29:47,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:29:48,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:29:49,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:29:49,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:29:50,242 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.21 vs. limit=22.5 2023-10-03 06:29:50,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:29:52,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:29:55,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 06:29:56,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:29:59,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:29:59,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:29:59,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:29:59,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1170533.3333333333, ans=0.125 2023-10-03 06:30:00,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:30:00,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:30:00,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:30:02,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:30:04,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:30:05,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:30:06,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1170533.3333333333, ans=0.0 2023-10-03 06:30:09,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:30:11,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:30:15,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:30:21,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:30:23,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:30:28,050 INFO [train.py:1046] (3/4) Epoch 34, batch 300, loss[loss=0.1593, simple_loss=0.2269, pruned_loss=0.04589, over 23728.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2408, pruned_loss=0.04094, over 3698988.67 frames. ], batch size: 164, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:30:28,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 06:30:29,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:30:29,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:30:30,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 06:30:30,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:30:31,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:30:31,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 06:30:35,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:30:35,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:30:38,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:30:39,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 06:30:41,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:30:41,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:30:41,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 06:30:41,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:30:45,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:30:50,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:30:50,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 06:30:56,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 06:30:56,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:30:58,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:31:00,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:00,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 06:31:00,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:31:02,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:31:03,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:31:03,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:31:07,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 06:31:07,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 06:31:07,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:31:10,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:10,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 06:31:12,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:31:18,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:31:20,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:31:20,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 06:31:23,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:23,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:31:26,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:26,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:31:26,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 06:31:27,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:31:27,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:31:28,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1170933.3333333333, ans=0.125 2023-10-03 06:31:30,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 06:31:31,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:31:31,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:32,481 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.90 vs. limit=15.0 2023-10-03 06:31:33,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:31:34,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:31:34,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:35,203 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.08 vs. limit=6.0 2023-10-03 06:31:41,420 INFO [train.py:1046] (3/4) Epoch 34, batch 350, loss[loss=0.1525, simple_loss=0.2412, pruned_loss=0.03192, over 24643.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2383, pruned_loss=0.04082, over 3904120.38 frames. ], batch size: 68, lr: 3.02e-03, grad_scale: 8.0 2023-10-03 06:31:41,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:31:41,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 06:31:44,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:45,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1171000.0, ans=0.0 2023-10-03 06:31:46,060 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.903e+02 2.085e+02 2.376e+02 3.254e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-03 06:31:49,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:31:51,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:31:53,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:31:56,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 06:31:58,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:31:58,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 06:32:00,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:32:01,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 06:32:01,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:32:03,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 06:32:05,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:32:06,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1171066.6666666667, ans=0.125 2023-10-03 06:32:07,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:32:07,859 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.19 vs. limit=10.0 2023-10-03 06:32:08,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:32:10,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:11,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:11,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:32:11,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:32:12,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:32:14,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:32:14,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:32:16,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1171133.3333333333, ans=0.125 2023-10-03 06:32:22,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:32:22,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:32:23,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:32:23,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:32:24,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1171133.3333333333, ans=0.125 2023-10-03 06:32:27,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 06:32:27,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:32:32,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:32:32,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:32:32,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:32:33,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 06:32:36,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:36,785 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 06:32:39,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 06:32:39,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:43,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:32:43,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 06:32:46,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:48,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:32:48,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:32:49,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:49,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:32:51,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:32:54,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:32:56,018 INFO [train.py:1046] (3/4) Epoch 34, batch 400, loss[loss=0.1709, simple_loss=0.2536, pruned_loss=0.04414, over 23661.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2387, pruned_loss=0.04073, over 4090347.91 frames. ], batch size: 85, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:32:56,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:32:58,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 06:32:58,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:32:59,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:33:00,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:33:02,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:03,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1171333.3333333333, ans=0.2 2023-10-03 06:33:04,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:33:04,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:06,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 06:33:09,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 06:33:09,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:33:10,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 06:33:11,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:14,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:33:14,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:33:14,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 06:33:14,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:33:16,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:33:16,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:33:16,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:33:19,846 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 06:33:21,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 06:33:25,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:33:25,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:33:27,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 06:33:27,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1171466.6666666667, ans=0.125 2023-10-03 06:33:29,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 06:33:31,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.09 vs. limit=15.0 2023-10-03 06:33:31,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:33:33,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:33:37,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 06:33:40,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:33:41,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 06:33:42,044 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.76 vs. limit=15.0 2023-10-03 06:33:45,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:33:45,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:33:46,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 06:33:46,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1171533.3333333333, ans=0.125 2023-10-03 06:33:47,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1171533.3333333333, ans=0.125 2023-10-03 06:33:49,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:33:52,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:33:54,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:33:56,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:33:56,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 06:34:00,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:34:00,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 06:34:01,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:34:01,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:34:04,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 06:34:06,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1171600.0, ans=0.0 2023-10-03 06:34:07,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:34:08,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:34:08,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:34:09,942 INFO [train.py:1046] (3/4) Epoch 34, batch 450, loss[loss=0.1464, simple_loss=0.2247, pruned_loss=0.03402, over 24260.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2397, pruned_loss=0.04116, over 4224045.50 frames. ], batch size: 56, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:34:10,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 06:34:10,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:34:11,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:34:12,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:34:12,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 06:34:12,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:34:14,035 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.869e+02 1.964e+02 2.234e+02 2.686e+02, threshold=3.928e+02, percent-clipped=0.0 2023-10-03 06:34:14,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:34:15,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:34:26,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:34:26,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:34:27,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 06:34:29,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 06:34:32,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:34:34,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1171733.3333333333, ans=0.1 2023-10-03 06:34:35,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:34:35,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1171733.3333333333, ans=0.125 2023-10-03 06:34:36,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:34:39,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:34:40,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:34:43,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 06:34:43,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 06:34:46,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 06:34:46,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:34:47,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:34:47,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:34:49,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1171800.0, ans=0.015 2023-10-03 06:34:50,925 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 06:34:50,933 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 06:34:50,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:34:51,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1171800.0, ans=0.0 2023-10-03 06:34:52,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:34:53,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 06:34:56,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:34:58,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:34:58,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 06:34:58,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 06:35:01,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:35:03,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:35:03,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:35:07,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 06:35:11,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:35:12,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 06:35:12,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 06:35:14,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:35:18,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:35:19,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:35:22,465 INFO [train.py:1046] (3/4) Epoch 34, batch 500, loss[loss=0.1537, simple_loss=0.2419, pruned_loss=0.03269, over 24580.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2411, pruned_loss=0.04161, over 4343179.28 frames. ], batch size: 71, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:35:22,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:35:22,543 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 06:35:27,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:35:29,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:35:29,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:35:29,144 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 06:35:30,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 06:35:30,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:35:33,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:35:33,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1172000.0, ans=0.125 2023-10-03 06:35:36,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 06:35:38,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:35:39,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:35:39,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:35:40,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:35:41,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.17 vs. limit=15.0 2023-10-03 06:35:49,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:35:49,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:35:50,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:35:50,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:35:51,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 06:35:51,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 06:35:52,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=1172133.3333333333, ans=0.1 2023-10-03 06:35:55,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:35:56,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:35:56,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:35:56,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:35:58,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 06:36:02,439 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 06:36:03,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:36:05,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:06,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:06,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:08,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:36:09,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 06:36:12,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:36:14,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:18,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:36:20,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:36:27,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:36:28,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1172266.6666666667, ans=0.125 2023-10-03 06:36:31,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 06:36:31,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:31,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:36:34,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 06:36:35,497 INFO [train.py:1046] (3/4) Epoch 34, batch 550, loss[loss=0.1765, simple_loss=0.2474, pruned_loss=0.05286, over 22615.00 frames. ], tot_loss[loss=0.163, simple_loss=0.2419, pruned_loss=0.04206, over 4436281.29 frames. ], batch size: 322, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:36:35,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:36:37,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:39,847 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.879e+02 2.022e+02 2.267e+02 3.367e+02, threshold=4.045e+02, percent-clipped=0.0 2023-10-03 06:36:41,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 06:36:42,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 06:36:42,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:36:42,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 06:36:42,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:36:44,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:36:44,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:46,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:46,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:36:47,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:36:48,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:36:49,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 06:36:50,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:36:54,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:36:54,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:55,110 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.96 vs. limit=10.0 2023-10-03 06:36:57,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:36:57,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:36:59,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1172400.0, ans=0.0 2023-10-03 06:37:03,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 06:37:03,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 06:37:03,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:37:06,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1172466.6666666667, ans=0.125 2023-10-03 06:37:08,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:37:09,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:37:09,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1172466.6666666667, ans=0.1 2023-10-03 06:37:10,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:37:13,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:13,470 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 06:37:15,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:37:15,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 06:37:18,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:37:19,000 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.24 vs. limit=22.5 2023-10-03 06:37:19,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 06:37:19,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:37:19,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:22,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 06:37:23,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 06:37:24,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:37:24,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:37:24,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:37:24,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:37:24,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1172533.3333333333, ans=0.125 2023-10-03 06:37:27,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:37:27,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:37:27,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1172533.3333333333, ans=0.0 2023-10-03 06:37:32,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:37:32,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:33,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 06:37:33,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:37:36,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:37:38,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:37:38,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:37:38,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1172600.0, ans=0.0 2023-10-03 06:37:39,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:37:39,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 06:37:43,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1172600.0, ans=0.1 2023-10-03 06:37:45,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 06:37:45,613 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.06 vs. limit=15.0 2023-10-03 06:37:48,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 06:37:49,532 INFO [train.py:1046] (3/4) Epoch 34, batch 600, loss[loss=0.1559, simple_loss=0.2341, pruned_loss=0.03889, over 24293.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2425, pruned_loss=0.04244, over 4478962.18 frames. ], batch size: 56, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:37:49,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:37:50,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:37:50,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:37:57,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:38:00,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:38:02,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 06:38:06,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:38:06,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:38:08,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:38:10,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 06:38:10,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:38:17,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 06:38:20,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:38:21,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:38:21,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:38:27,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:38:28,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:38:28,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:38:34,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:38:38,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:38:38,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:38:38,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:38:40,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1172866.6666666667, ans=0.0 2023-10-03 06:38:45,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 06:38:49,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:38:49,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:38:53,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 06:38:55,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:38:58,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 06:38:58,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:38:58,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:39:02,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 06:39:04,476 INFO [train.py:1046] (3/4) Epoch 34, batch 650, loss[loss=0.1575, simple_loss=0.2092, pruned_loss=0.05295, over 19499.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.241, pruned_loss=0.04237, over 4518336.20 frames. ], batch size: 388, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:39:04,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 06:39:07,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:39:07,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:39:07,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1173000.0, ans=0.0 2023-10-03 06:39:08,689 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.849e+02 2.038e+02 2.277e+02 3.904e+02, threshold=4.076e+02, percent-clipped=0.0 2023-10-03 06:39:08,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:10,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1173000.0, ans=0.2 2023-10-03 06:39:10,862 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.11 vs. limit=22.5 2023-10-03 06:39:11,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 06:39:12,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:39:17,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:39:17,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:39:20,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:39:25,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 06:39:26,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:39:28,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:39:30,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:39:32,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 06:39:35,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:39:35,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:35,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:39:36,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:37,554 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.37 vs. limit=15.0 2023-10-03 06:39:39,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:39:42,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:39:42,185 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 06:39:42,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:39:42,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:39:44,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:46,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:39:46,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:39:46,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:39:47,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.31 vs. limit=22.5 2023-10-03 06:39:47,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 06:39:50,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1173200.0, ans=0.05 2023-10-03 06:39:51,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:39:51,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:39:52,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:39:53,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:39:53,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:39:55,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 06:39:57,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 06:39:57,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:39:57,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:39:57,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:39:57,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:39:58,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:40:02,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:02,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:40:04,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:40:06,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1173266.6666666667, ans=0.1 2023-10-03 06:40:07,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:40:07,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 06:40:07,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:40:15,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:40:15,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:40:19,756 INFO [train.py:1046] (3/4) Epoch 34, batch 700, loss[loss=0.147, simple_loss=0.2234, pruned_loss=0.03531, over 22878.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.239, pruned_loss=0.0416, over 4553142.98 frames. ], batch size: 50, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:40:19,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:40:19,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:40:26,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 06:40:27,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 06:40:27,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 06:40:28,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:30,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:40:31,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 06:40:35,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:40:37,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:40:40,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:40,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:40:41,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:40:43,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:40:46,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 06:40:46,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:40:47,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 06:40:48,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1173466.6666666667, ans=0.0 2023-10-03 06:40:50,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 06:40:54,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1173466.6666666667, ans=0.0 2023-10-03 06:40:55,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:40:55,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:40:57,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:40:59,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:41:01,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 06:41:05,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:06,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:41:06,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 06:41:09,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:41:11,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:14,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:41:18,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1173600.0, ans=0.125 2023-10-03 06:41:19,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:41:20,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 06:41:24,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 06:41:24,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 06:41:28,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:41:30,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.48 vs. limit=6.0 2023-10-03 06:41:31,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:41:32,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:41:33,741 INFO [train.py:1046] (3/4) Epoch 34, batch 750, loss[loss=0.1578, simple_loss=0.2498, pruned_loss=0.03288, over 24652.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2391, pruned_loss=0.04121, over 4599366.44 frames. ], batch size: 68, lr: 3.02e-03, grad_scale: 16.0 2023-10-03 06:41:33,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:41:33,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 06:41:36,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 06:41:38,361 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.818e+02 1.954e+02 2.110e+02 2.895e+02, threshold=3.908e+02, percent-clipped=0.0 2023-10-03 06:41:38,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 06:41:38,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 06:41:39,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 06:41:39,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 06:41:39,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:41:41,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 06:41:42,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:41:42,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:41:42,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1173666.6666666667, ans=0.05 2023-10-03 06:41:45,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:41:45,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1173666.6666666667, ans=0.1 2023-10-03 06:41:46,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:46,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:41:46,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:41:49,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:41:52,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:41:54,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:41:55,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:41:56,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:41:56,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 06:41:58,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:41:59,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:42:02,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:42:02,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1173800.0, ans=0.0 2023-10-03 06:42:02,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1173800.0, ans=0.0 2023-10-03 06:42:03,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:42:04,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 06:42:04,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:42:06,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 06:42:06,506 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 06:42:08,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 06:42:08,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:42:09,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 06:42:10,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:42:16,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:42:16,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:16,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:42:18,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1173866.6666666667, ans=0.125 2023-10-03 06:42:19,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:42:20,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:42:21,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 06:42:21,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:42:23,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 06:42:24,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:42:28,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:42:29,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 06:42:29,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:35,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:42:35,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:42:35,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1173933.3333333333, ans=0.0 2023-10-03 06:42:37,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:42:37,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1173933.3333333333, ans=0.2 2023-10-03 06:42:39,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:42:41,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1173933.3333333333, ans=0.0 2023-10-03 06:42:43,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 06:42:43,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:42:43,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:42:44,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:42:45,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:42:47,367 INFO [train.py:1046] (3/4) Epoch 34, batch 800, loss[loss=0.1658, simple_loss=0.2568, pruned_loss=0.03743, over 24016.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2394, pruned_loss=0.04126, over 4624993.99 frames. ], batch size: 80, lr: 3.01e-03, grad_scale: 32.0 2023-10-03 06:42:48,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:48,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:42:54,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:42:54,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:42:56,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:42:56,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:42:57,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:42:59,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:01,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:43:05,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:43:05,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:43:08,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 06:43:08,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:08,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1174066.6666666667, ans=0.95 2023-10-03 06:43:11,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:43:11,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:43:11,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:43:11,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 06:43:13,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:43:13,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 06:43:16,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:43:17,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:43:18,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:43:18,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:43:21,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:21,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:25,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:43:25,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:43:25,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 06:43:29,179 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 06:43:29,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 06:43:29,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:43:29,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:43:31,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:43:31,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:43:36,662 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 06:43:36,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 06:43:38,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:43:39,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1174200.0, ans=0.125 2023-10-03 06:43:41,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 06:43:42,136 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.15 vs. limit=22.5 2023-10-03 06:43:44,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:43:48,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:43:49,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 06:43:49,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:43:52,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 06:43:59,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:44:01,097 INFO [train.py:1046] (3/4) Epoch 34, batch 850, loss[loss=0.1754, simple_loss=0.246, pruned_loss=0.05239, over 22788.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.24, pruned_loss=0.04146, over 4641667.71 frames. ], batch size: 322, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 06:44:01,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:44:02,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 06:44:02,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:44:02,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:44:04,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 06:44:04,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:44:07,095 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.856e+02 2.060e+02 2.258e+02 3.332e+02, threshold=4.119e+02, percent-clipped=0.0 2023-10-03 06:44:07,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:44:08,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:09,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:44:09,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:44:11,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 06:44:12,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 06:44:12,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 06:44:14,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:44:14,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:44:16,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:17,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:44:17,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:44:22,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:44:22,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:44:24,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 06:44:25,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 06:44:29,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:44:29,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 06:44:33,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 06:44:35,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 06:44:37,178 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 06:44:37,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:44:37,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:44:37,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1174466.6666666667, ans=0.0 2023-10-03 06:44:38,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 06:44:41,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:43,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:44:43,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 06:44:45,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:44:47,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:44:47,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:44:47,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:44:48,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:44:50,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 06:44:50,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 06:44:54,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:44:54,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:44:54,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:44:54,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:44:55,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:44:59,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:45:02,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 06:45:03,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:45:03,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:45:05,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:45:05,618 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.13 vs. limit=15.0 2023-10-03 06:45:14,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 06:45:14,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:45:15,809 INFO [train.py:1046] (3/4) Epoch 34, batch 900, loss[loss=0.1628, simple_loss=0.2386, pruned_loss=0.04353, over 23728.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2408, pruned_loss=0.04176, over 4650730.20 frames. ], batch size: 232, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:45:15,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 06:45:15,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:45:15,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:45:18,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 06:45:22,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:45:25,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:45:26,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 06:45:29,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:45:29,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 06:45:30,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1174733.3333333333, ans=0.125 2023-10-03 06:45:31,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 06:45:31,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:45:31,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:45:33,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 06:45:34,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:45:42,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:45:42,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:45:43,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:45:46,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:45:48,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1174800.0, ans=0.125 2023-10-03 06:45:49,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 06:45:52,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:45:54,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1174800.0, ans=0.0 2023-10-03 06:45:55,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:45:55,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 06:45:56,860 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 06:45:56,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 06:46:03,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 06:46:03,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:46:05,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:46:09,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:09,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:46:10,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1174866.6666666667, ans=0.125 2023-10-03 06:46:11,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 06:46:11,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:46:14,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 06:46:15,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:46:15,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:17,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:46:17,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:46:21,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 06:46:22,664 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 06:46:22,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 06:46:22,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 06:46:26,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:29,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1175000.0, ans=0.1 2023-10-03 06:46:30,146 INFO [train.py:1046] (3/4) Epoch 34, batch 950, loss[loss=0.156, simple_loss=0.2452, pruned_loss=0.03347, over 24418.00 frames. ], tot_loss[loss=0.1628, simple_loss=0.2413, pruned_loss=0.04215, over 4665199.50 frames. ], batch size: 69, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:46:30,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 06:46:30,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1175000.0, ans=0.125 2023-10-03 06:46:35,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:46:38,420 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.873e+02 2.075e+02 2.442e+02 3.584e+02, threshold=4.150e+02, percent-clipped=0.0 2023-10-03 06:46:38,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:46:38,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:46:39,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:46:40,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1175000.0, ans=0.125 2023-10-03 06:46:41,277 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 06:46:45,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:46:47,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:46:48,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:46:48,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:46:48,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 06:46:49,508 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.81 vs. limit=12.0 2023-10-03 06:46:50,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:46:51,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:46:52,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 06:46:52,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:46:53,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1175066.6666666667, ans=0.2 2023-10-03 06:46:56,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:46:56,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:46:57,621 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.84 vs. limit=15.0 2023-10-03 06:46:58,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:46:59,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 06:47:01,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 06:47:03,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:47:04,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:47:10,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:47:10,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:47:14,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 06:47:16,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 06:47:16,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 06:47:17,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:47:18,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:47:18,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:47:23,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 06:47:23,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:47:25,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:47:26,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:47:26,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 06:47:26,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:47:26,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:47:26,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 06:47:30,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:47:31,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1175266.6666666667, ans=0.0 2023-10-03 06:47:34,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:47:37,708 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:47:38,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:47:38,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 06:47:38,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 06:47:42,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:47:44,857 INFO [train.py:1046] (3/4) Epoch 34, batch 1000, loss[loss=0.1353, simple_loss=0.215, pruned_loss=0.02778, over 24267.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2397, pruned_loss=0.0419, over 4674444.05 frames. ], batch size: 56, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:47:47,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 06:47:47,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:47:52,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:47:52,864 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 06:47:55,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 06:47:55,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 06:47:59,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:47:59,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:47:59,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1175400.0, ans=0.125 2023-10-03 06:47:59,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1175400.0, ans=0.0 2023-10-03 06:48:01,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1175400.0, ans=0.1 2023-10-03 06:48:01,562 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.24 vs. limit=22.5 2023-10-03 06:48:02,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:48:05,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 06:48:06,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 06:48:09,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 06:48:09,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:48:11,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 06:48:12,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 06:48:12,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 06:48:13,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:48:14,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:19,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1175466.6666666667, ans=0.05 2023-10-03 06:48:21,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:48:21,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1175466.6666666667, ans=0.2 2023-10-03 06:48:22,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:48:22,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:23,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:48:23,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 06:48:24,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:48:24,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:48:25,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:48:26,918 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 06:48:28,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 06:48:29,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 06:48:31,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 06:48:34,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:48:34,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1175533.3333333333, ans=0.1 2023-10-03 06:48:40,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:40,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:48:41,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:48:42,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:48:44,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 06:48:45,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:48:45,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 06:48:47,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 06:48:48,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:48:48,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:48:52,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:48:54,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:48:55,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:48:57,060 INFO [train.py:1046] (3/4) Epoch 34, batch 1050, loss[loss=0.1526, simple_loss=0.2427, pruned_loss=0.03119, over 24595.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2383, pruned_loss=0.0413, over 4670441.20 frames. ], batch size: 71, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:48:58,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:48:58,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:48:59,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.59 vs. limit=22.5 2023-10-03 06:49:00,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:49:02,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:49:04,580 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.921e+02 2.098e+02 2.393e+02 3.925e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 06:49:04,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:49:07,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 06:49:08,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 06:49:08,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1175666.6666666667, ans=0.125 2023-10-03 06:49:10,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:49:11,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:49:11,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:49:11,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:49:13,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 06:49:14,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:49:14,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 06:49:17,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:49:17,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 06:49:17,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 06:49:26,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:49:27,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:49:27,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:49:29,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 06:49:29,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 06:49:29,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:49:34,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 06:49:37,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 06:49:37,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:49:41,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 06:49:43,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 06:49:43,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:49:43,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:49:46,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:49:51,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 06:49:53,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 06:49:54,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 06:49:54,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:49:54,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:49:54,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1175933.3333333333, ans=0.1 2023-10-03 06:49:56,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 06:49:58,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:50:02,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 06:50:02,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:50:03,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:50:03,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:06,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:06,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 06:50:07,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 06:50:07,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 06:50:08,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 06:50:08,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:50:10,774 INFO [train.py:1046] (3/4) Epoch 34, batch 1100, loss[loss=0.1615, simple_loss=0.238, pruned_loss=0.04254, over 23801.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.238, pruned_loss=0.04089, over 4684764.27 frames. ], batch size: 179, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:50:12,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:50:16,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:50:21,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1176000.0, ans=0.09899494936611666 2023-10-03 06:50:23,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 06:50:25,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 06:50:25,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:50:25,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 06:50:28,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:50:28,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 06:50:31,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:50:34,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 06:50:34,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 06:50:36,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 06:50:38,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:50:38,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:50:40,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:50:42,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 06:50:43,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1176133.3333333333, ans=0.1 2023-10-03 06:50:46,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:50:50,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 06:50:51,484 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 06:50:51,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:54,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:50:54,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:50:56,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:50:57,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 06:50:58,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:50:58,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 06:50:58,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:50:59,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1176200.0, ans=0.1 2023-10-03 06:51:00,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:00,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 06:51:01,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1176200.0, ans=0.04949747468305833 2023-10-03 06:51:02,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.58 vs. limit=15.0 2023-10-03 06:51:03,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1176200.0, ans=0.125 2023-10-03 06:51:04,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 06:51:04,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 06:51:06,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:51:06,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1176200.0, ans=0.125 2023-10-03 06:51:11,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:51:15,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 06:51:15,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 06:51:16,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:19,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:51:19,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:51:19,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 06:51:20,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:51:22,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:51:23,808 INFO [train.py:1046] (3/4) Epoch 34, batch 1150, loss[loss=0.1818, simple_loss=0.2526, pruned_loss=0.05555, over 23865.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2384, pruned_loss=0.04051, over 4705169.66 frames. ], batch size: 179, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:51:23,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 06:51:23,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:51:23,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 06:51:25,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:51:25,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:51:25,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:51:28,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1176333.3333333333, ans=0.125 2023-10-03 06:51:31,203 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.910e+02 2.113e+02 2.416e+02 3.603e+02, threshold=4.226e+02, percent-clipped=0.0 2023-10-03 06:51:31,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:51:32,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:51:34,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:51:35,346 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.68 vs. limit=10.0 2023-10-03 06:51:35,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:51:35,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 06:51:36,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:51:37,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 06:51:40,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:51:40,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:51:45,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1176400.0, ans=0.0 2023-10-03 06:51:46,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 06:51:48,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:50,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:51:52,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:51:52,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 06:51:52,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:51:53,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:51:56,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 06:51:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:51:59,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:52:05,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:52:06,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1176466.6666666667, ans=0.0 2023-10-03 06:52:12,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:52:13,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 06:52:13,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:13,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:19,233 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 06:52:22,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:27,799 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 06:52:31,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:52:33,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:52:33,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 06:52:34,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:52:37,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:52:38,899 INFO [train.py:1046] (3/4) Epoch 34, batch 1200, loss[loss=0.1611, simple_loss=0.2393, pruned_loss=0.04149, over 23577.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2387, pruned_loss=0.04086, over 4704890.00 frames. ], batch size: 232, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 06:52:41,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 06:52:41,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:52:43,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:52:43,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:52:44,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:52:46,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:52:47,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 06:52:50,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:52:50,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:52:50,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1176666.6666666667, ans=0.0 2023-10-03 06:52:53,358 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 06:52:56,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 06:52:58,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:53:00,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:53:03,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:53:05,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:53:05,347 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 06:53:05,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:53:14,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 06:53:14,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:53:14,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 06:53:15,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:53:18,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 06:53:22,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 06:53:22,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:53:23,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:53:24,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1176866.6666666667, ans=0.0 2023-10-03 06:53:25,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:53:27,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 06:53:27,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1176866.6666666667, ans=0.0 2023-10-03 06:53:28,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:53:28,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:53:30,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:53:31,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 06:53:31,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:53:33,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:53:33,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 06:53:34,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:53:34,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:53:39,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:53:41,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 06:53:44,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 06:53:48,287 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 06:53:49,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:53:52,410 INFO [train.py:1046] (3/4) Epoch 34, batch 1250, loss[loss=0.1714, simple_loss=0.2517, pruned_loss=0.04557, over 24063.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2396, pruned_loss=0.04074, over 4709937.76 frames. ], batch size: 86, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 06:53:52,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:53:53,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:53:55,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:53:58,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 06:53:59,821 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.873e+02 2.047e+02 2.320e+02 3.578e+02, threshold=4.093e+02, percent-clipped=0.0 2023-10-03 06:54:01,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:54:03,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:54:03,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 06:54:04,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:54:04,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1177000.0, ans=0.2 2023-10-03 06:54:05,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:54:10,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 06:54:10,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:54:12,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 06:54:12,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:54:15,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 06:54:19,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 06:54:19,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:54:19,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:54:21,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:54:21,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:24,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:54:24,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1177133.3333333333, ans=0.125 2023-10-03 06:54:24,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1177133.3333333333, ans=0.0 2023-10-03 06:54:26,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 06:54:29,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 06:54:30,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 06:54:32,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:54:34,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 06:54:34,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:54:34,277 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 06:54:34,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:34,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:40,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:54:41,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:54:41,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:54:43,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 06:54:43,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 06:54:43,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 06:54:46,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:54:47,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 06:54:47,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:54:51,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 06:54:51,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:54:53,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 06:54:54,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 06:54:54,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 06:54:54,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 06:54:54,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:54:57,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 06:55:00,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:55:00,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:55:02,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 06:55:02,639 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.66 vs. limit=10.0 2023-10-03 06:55:03,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1177266.6666666667, ans=0.125 2023-10-03 06:55:07,040 INFO [train.py:1046] (3/4) Epoch 34, batch 1300, loss[loss=0.1678, simple_loss=0.2522, pruned_loss=0.04165, over 24003.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2398, pruned_loss=0.04104, over 4717205.27 frames. ], batch size: 80, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:55:07,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 06:55:09,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:55:09,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 06:55:12,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:55:15,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 06:55:16,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:55:17,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:55:18,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 06:55:20,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 06:55:24,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:55:25,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 06:55:27,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 06:55:31,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 06:55:34,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:55:34,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1177466.6666666667, ans=0.015 2023-10-03 06:55:36,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:55:38,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:55:38,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1177466.6666666667, ans=0.0 2023-10-03 06:55:39,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:55:40,212 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.12 vs. limit=15.0 2023-10-03 06:55:40,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 06:55:40,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 06:55:40,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 06:55:42,930 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.89 vs. limit=15.0 2023-10-03 06:55:47,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:55:47,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 06:55:50,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 06:55:50,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 06:55:51,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:55:53,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:55:53,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 06:55:54,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:55:54,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 06:55:56,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:55:56,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1177533.3333333333, ans=0.125 2023-10-03 06:55:59,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:55:59,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:56:02,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1177533.3333333333, ans=0.125 2023-10-03 06:56:03,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 06:56:05,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 06:56:07,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 06:56:10,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:56:13,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 06:56:14,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:56:20,035 INFO [train.py:1046] (3/4) Epoch 34, batch 1350, loss[loss=0.1648, simple_loss=0.2462, pruned_loss=0.04168, over 23870.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2403, pruned_loss=0.04097, over 4719150.50 frames. ], batch size: 86, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:56:20,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 06:56:23,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:56:25,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:56:25,527 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.37 vs. limit=15.0 2023-10-03 06:56:27,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:56:27,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:56:29,625 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.836e+02 2.071e+02 2.360e+02 3.214e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-03 06:56:31,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:56:31,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:56:35,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 06:56:38,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 06:56:40,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:56:40,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 06:56:43,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 06:56:43,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:56:44,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:56:44,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 06:56:44,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1177733.3333333333, ans=0.05 2023-10-03 06:56:45,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 06:56:47,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 06:56:48,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:56:48,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 06:56:50,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1177800.0, ans=0.0 2023-10-03 06:57:00,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:57:05,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1177866.6666666667, ans=0.0 2023-10-03 06:57:08,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:57:09,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:57:09,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 06:57:12,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:57:13,268 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.18 vs. limit=15.0 2023-10-03 06:57:13,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 06:57:13,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 06:57:15,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:57:15,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=1177866.6666666667, ans=0.5 2023-10-03 06:57:16,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:57:17,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 06:57:20,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 06:57:22,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1177933.3333333333, ans=0.0 2023-10-03 06:57:26,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 06:57:28,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 06:57:28,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1177933.3333333333, ans=0.125 2023-10-03 06:57:31,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1177933.3333333333, ans=0.1 2023-10-03 06:57:34,246 INFO [train.py:1046] (3/4) Epoch 34, batch 1400, loss[loss=0.1706, simple_loss=0.2508, pruned_loss=0.04517, over 23662.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2384, pruned_loss=0.0406, over 4709085.55 frames. ], batch size: 85, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:57:34,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 06:57:36,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:57:38,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:57:39,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:57:39,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1178000.0, ans=0.0 2023-10-03 06:57:43,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 06:57:44,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 06:57:54,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 06:57:57,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:57:59,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:57:59,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 06:58:04,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 06:58:04,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 06:58:13,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:14,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:14,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1178133.3333333333, ans=0.0 2023-10-03 06:58:17,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 06:58:17,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1178200.0, ans=0.125 2023-10-03 06:58:18,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 06:58:20,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 06:58:20,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 06:58:21,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:58:23,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 06:58:23,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:58:24,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 06:58:24,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 06:58:24,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 06:58:30,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:31,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 06:58:40,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 06:58:41,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 06:58:43,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 06:58:44,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 06:58:44,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:58:47,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 06:58:48,830 INFO [train.py:1046] (3/4) Epoch 34, batch 1450, loss[loss=0.1652, simple_loss=0.2261, pruned_loss=0.05212, over 23338.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2378, pruned_loss=0.04044, over 4715002.95 frames. ], batch size: 285, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 06:58:50,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 06:58:53,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 06:58:53,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:58:53,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 06:58:57,697 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.834e+02 2.006e+02 2.217e+02 3.059e+02, threshold=4.013e+02, percent-clipped=0.0 2023-10-03 06:58:59,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:58:59,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 06:59:00,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 06:59:00,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 06:59:01,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 06:59:01,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 06:59:03,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:03,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:03,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 06:59:04,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:59:04,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 06:59:04,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 06:59:04,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:07,243 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.19 vs. limit=15.0 2023-10-03 06:59:08,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 06:59:09,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:12,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:15,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 06:59:15,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 06:59:16,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 06:59:16,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:21,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 06:59:21,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 06:59:21,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 06:59:22,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:59:25,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 06:59:28,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 06:59:30,019 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 06:59:32,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:59:33,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 06:59:34,481 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.50 vs. limit=15.0 2023-10-03 06:59:35,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:59:37,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 06:59:42,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:59:43,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 06:59:45,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 06:59:45,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 06:59:49,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 06:59:49,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 06:59:52,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 06:59:53,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 06:59:53,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 06:59:53,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1178600.0, ans=0.125 2023-10-03 06:59:54,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 06:59:56,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:00:02,345 INFO [train.py:1046] (3/4) Epoch 34, batch 1500, loss[loss=0.1801, simple_loss=0.2664, pruned_loss=0.04693, over 23673.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2394, pruned_loss=0.04081, over 4727769.01 frames. ], batch size: 85, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:00:05,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 07:00:06,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:00:06,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:00:08,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:00:08,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1178666.6666666667, ans=0.125 2023-10-03 07:00:09,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:00:09,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:00:09,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1178666.6666666667, ans=0.125 2023-10-03 07:00:11,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 07:00:13,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:00:14,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:00:14,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:00:16,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:00:17,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:00:18,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:00:23,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:00:24,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 07:00:24,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:00:25,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:00:26,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:00:28,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1178733.3333333333, ans=0.125 2023-10-03 07:00:28,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1178733.3333333333, ans=0.0 2023-10-03 07:00:30,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 07:00:31,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1178800.0, ans=0.0 2023-10-03 07:00:33,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 07:00:33,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:00:34,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 07:00:37,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:00:39,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:00:39,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1178800.0, ans=0.0 2023-10-03 07:00:40,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:00:40,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:00:42,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 07:00:43,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:00:43,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:00:45,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 07:00:45,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:00:49,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:00:49,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 07:00:55,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:00:56,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:00:58,509 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 07:00:59,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:00:59,870 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 07:01:00,609 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.23 vs. limit=15.0 2023-10-03 07:01:02,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:01:03,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:01:03,912 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 07:01:05,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:01:08,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 07:01:09,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:01:10,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1178933.3333333333, ans=0.1 2023-10-03 07:01:14,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:01:14,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:01:15,349 INFO [train.py:1046] (3/4) Epoch 34, batch 1550, loss[loss=0.1625, simple_loss=0.2419, pruned_loss=0.04156, over 23789.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.24, pruned_loss=0.04078, over 4740272.03 frames. ], batch size: 212, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:01:15,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:01:15,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:01:17,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:01:17,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 07:01:18,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 07:01:18,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:01:20,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 07:01:20,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 07:01:22,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:01:23,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:01:24,227 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.899e+02 2.097e+02 2.316e+02 2.849e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 07:01:24,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:01:24,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:01:24,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:01:25,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:01:29,310 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 07:01:29,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:01:29,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:01:30,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:01:31,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.39 vs. limit=12.0 2023-10-03 07:01:32,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:01:32,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 07:01:34,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:01:34,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 07:01:36,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 07:01:36,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 07:01:38,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:01:38,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:01:38,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1179066.6666666667, ans=0.125 2023-10-03 07:01:42,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:01:45,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 07:01:45,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 07:01:50,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1179133.3333333333, ans=0.125 2023-10-03 07:01:52,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:01:55,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:01:56,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:01:56,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:01:56,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 07:02:02,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1179200.0, ans=0.125 2023-10-03 07:02:04,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:02:05,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:02:08,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:02:10,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:02:10,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:02:10,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 07:02:10,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:02:14,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:02:14,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:02:14,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 07:02:14,456 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 07:02:14,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1179266.6666666667, ans=0.1 2023-10-03 07:02:17,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:02:22,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 07:02:28,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:02:29,300 INFO [train.py:1046] (3/4) Epoch 34, batch 1600, loss[loss=0.148, simple_loss=0.2358, pruned_loss=0.03013, over 24645.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2409, pruned_loss=0.04094, over 4731914.45 frames. ], batch size: 68, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 07:02:29,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:02:29,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1179333.3333333333, ans=0.125 2023-10-03 07:02:30,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 07:02:32,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:02:33,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:02:33,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:02:33,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:02:34,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:02:38,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:02:38,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 07:02:38,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 07:02:41,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 07:02:43,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:02:43,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1179400.0, ans=0.0 2023-10-03 07:02:44,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 07:02:44,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:02:47,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:02:51,332 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=12.85 vs. limit=15.0 2023-10-03 07:02:52,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:02:55,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 07:02:58,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:02:58,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.99 vs. limit=12.0 2023-10-03 07:02:59,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 07:02:59,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:02:59,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 07:03:03,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 07:03:10,121 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.69 vs. limit=15.0 2023-10-03 07:03:12,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:03:14,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 07:03:14,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:03:14,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:03:14,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:03:17,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 07:03:22,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:03:25,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:03:25,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:26,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:26,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:03:28,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:03:29,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:03:30,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:03:36,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:37,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:03:39,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 07:03:39,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:03:39,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 07:03:44,172 INFO [train.py:1046] (3/4) Epoch 34, batch 1650, loss[loss=0.1772, simple_loss=0.2633, pruned_loss=0.0456, over 23973.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.242, pruned_loss=0.04144, over 4730784.35 frames. ], batch size: 80, lr: 3.01e-03, grad_scale: 16.0 2023-10-03 07:03:45,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:03:45,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:03:46,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:03:46,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 07:03:46,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 07:03:46,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 07:03:47,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 07:03:47,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1179666.6666666667, ans=0.125 2023-10-03 07:03:51,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:03:52,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:03:52,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:03:54,532 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.876e+02 2.077e+02 2.336e+02 3.555e+02, threshold=4.154e+02, percent-clipped=0.0 2023-10-03 07:03:54,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:03:57,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:03:58,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 07:04:00,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:04:00,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:04:00,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:04:00,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:04:01,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 07:04:01,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 07:04:07,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:04:09,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:04:18,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 07:04:19,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:20,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 07:04:22,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1179800.0, ans=0.1 2023-10-03 07:04:23,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:04:25,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:04:27,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:04:27,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:04:28,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:04:28,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:31,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:04:32,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:32,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:04:32,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:04:34,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:04:35,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:04:35,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1179866.6666666667, ans=0.0 2023-10-03 07:04:39,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:04:41,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 07:04:43,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:04:43,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 07:04:44,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 07:04:44,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 07:04:44,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:04:46,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:04:46,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:04:46,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:04:46,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 07:04:46,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1179933.3333333333, ans=0.125 2023-10-03 07:04:50,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:04:52,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:04:52,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:04:55,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 07:04:58,424 INFO [train.py:1046] (3/4) Epoch 34, batch 1700, loss[loss=0.159, simple_loss=0.2374, pruned_loss=0.04028, over 23234.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2418, pruned_loss=0.04121, over 4736013.87 frames. ], batch size: 105, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:04:59,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:04:59,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:04:59,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 07:05:01,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:05:01,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:05:01,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:05:04,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:05:04,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:05:04,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 07:05:06,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:05:08,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1180000.0, ans=0.05 2023-10-03 07:05:12,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:05:17,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:05:22,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:05:22,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:05:23,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:05:23,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:05:23,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1180066.6666666667, ans=0.2 2023-10-03 07:05:24,034 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.61 vs. limit=22.5 2023-10-03 07:05:26,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 07:05:26,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1180133.3333333333, ans=0.125 2023-10-03 07:05:28,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:05:28,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:29,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:05:29,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:05:32,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 07:05:32,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 07:05:32,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:32,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1180133.3333333333, ans=0.125 2023-10-03 07:05:35,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 07:05:36,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:05:43,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1180200.0, ans=0.125 2023-10-03 07:05:46,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:05:46,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:05:47,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:05:49,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:05:49,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 07:05:49,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:05:49,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1180200.0, ans=0.0 2023-10-03 07:05:52,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:52,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 07:05:52,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:05:52,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:05:52,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:05:52,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:05:53,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1180200.0, ans=0.1 2023-10-03 07:05:55,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:05:55,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:05:55,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:05:57,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:05:57,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:01,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:06:02,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 07:06:04,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:06,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:06:08,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 07:06:12,499 INFO [train.py:1046] (3/4) Epoch 34, batch 1750, loss[loss=0.1505, simple_loss=0.2295, pruned_loss=0.03577, over 24300.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2397, pruned_loss=0.04062, over 4719171.82 frames. ], batch size: 61, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:06:15,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:06:17,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:06:17,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:06:17,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 07:06:19,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:06:20,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:06:20,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:06:23,234 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.852e+02 2.086e+02 2.272e+02 3.301e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-03 07:06:24,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 07:06:26,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:06:28,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1180400.0, ans=0.0 2023-10-03 07:06:29,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 07:06:29,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:06:30,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:06:34,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 07:06:36,703 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.52 vs. limit=15.0 2023-10-03 07:06:37,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 07:06:38,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:06:39,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 07:06:48,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:06:51,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:06:51,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:55,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:06:55,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:06:57,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:06:58,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:07:00,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1180533.3333333333, ans=0.125 2023-10-03 07:07:01,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:07:01,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:07:03,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 07:07:04,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:07:06,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 07:07:07,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:07:10,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:07:10,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:07:14,097 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.29 vs. limit=15.0 2023-10-03 07:07:14,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:07:16,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 07:07:16,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:07:18,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:07:20,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:07:23,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:07:24,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:07:25,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 07:07:25,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:07:26,718 INFO [train.py:1046] (3/4) Epoch 34, batch 1800, loss[loss=0.1667, simple_loss=0.2628, pruned_loss=0.03528, over 24456.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2389, pruned_loss=0.04047, over 4719929.01 frames. ], batch size: 69, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:07:26,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:07:26,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:07:26,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:07:26,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:07:28,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:07:31,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:07:32,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:07:34,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:07:37,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:07:38,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:07:39,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:07:40,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1180733.3333333333, ans=0.125 2023-10-03 07:07:43,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:07:45,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:07:45,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:07:47,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:07:47,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:07:47,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 07:07:49,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:07:49,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1180733.3333333333, ans=0.0 2023-10-03 07:07:53,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:07:56,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 07:07:58,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1180800.0, ans=0.125 2023-10-03 07:07:59,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 07:07:59,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 07:08:01,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:08:01,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:08:01,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:08:02,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:08:08,293 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 07:08:09,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:08:11,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:08:12,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 07:08:12,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 07:08:12,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:08:14,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:08:15,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:08:21,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 07:08:25,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:08:25,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 07:08:26,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:08:26,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:08:26,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:08:28,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 07:08:31,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:08:31,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:08:35,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 07:08:35,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:08:36,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:08:36,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:08:36,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:08:39,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:08:39,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:08:40,840 INFO [train.py:1046] (3/4) Epoch 34, batch 1850, loss[loss=0.1556, simple_loss=0.2431, pruned_loss=0.03402, over 24473.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2391, pruned_loss=0.04058, over 4720090.25 frames. ], batch size: 66, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:08:42,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:08:42,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:08:44,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:08:45,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:08:51,418 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.930e+02 2.238e+02 2.527e+02 4.034e+02, threshold=4.475e+02, percent-clipped=0.0 2023-10-03 07:08:54,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:08:54,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 07:08:57,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 07:09:00,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 07:09:03,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:09:03,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 07:09:03,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 07:09:04,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.33 vs. limit=15.0 2023-10-03 07:09:13,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:09:14,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1181133.3333333333, ans=0.1 2023-10-03 07:09:15,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 07:09:16,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:09:18,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:09:21,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 07:09:22,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:22,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:09:24,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:09:26,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:09:28,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=1181200.0, ans=15.0 2023-10-03 07:09:29,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:09:31,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:09:32,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:32,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 07:09:32,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:09:33,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:09:36,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:09:36,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1181200.0, ans=0.125 2023-10-03 07:09:38,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 07:09:39,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:09:39,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1181266.6666666667, ans=0.1 2023-10-03 07:09:41,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1181266.6666666667, ans=0.0 2023-10-03 07:09:42,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:09:42,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1181266.6666666667, ans=0.0 2023-10-03 07:09:43,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:09:43,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 07:09:43,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 07:09:46,755 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 07:09:48,583 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 07:09:48,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:09:48,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:09:48,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:09:50,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:50,127 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 07:09:50,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:09:51,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:09:52,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:09:54,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:09:56,154 INFO [train.py:1046] (3/4) Epoch 34, batch 1900, loss[loss=0.1557, simple_loss=0.2458, pruned_loss=0.03282, over 24581.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2401, pruned_loss=0.04058, over 4731946.31 frames. ], batch size: 71, lr: 3.01e-03, grad_scale: 8.0 2023-10-03 07:09:57,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:09:57,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 07:10:00,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:10:00,398 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 07:10:00,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:10:01,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:10:05,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:10:07,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:10:09,249 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 07:10:09,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1181400.0, ans=0.125 2023-10-03 07:10:10,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 07:10:10,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:10:12,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:10:12,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 07:10:12,070 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 07:10:16,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 07:10:16,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:10:20,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 07:10:21,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 07:10:25,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1181466.6666666667, ans=0.125 2023-10-03 07:10:31,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 07:10:34,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 07:10:34,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:10:35,653 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 07:10:35,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 07:10:35,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 07:10:35,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 07:10:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:10:35,956 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:10:37,848 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.74 vs. limit=15.0 2023-10-03 07:10:40,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 07:10:43,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:10:47,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:10:47,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 07:10:49,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:10:53,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 07:10:53,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:10:58,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:10:58,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:11:00,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:11:00,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:11:01,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:11:01,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:11:02,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:11:06,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:11:06,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:11:07,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1181600.0, ans=0.125 2023-10-03 07:11:09,531 INFO [train.py:1046] (3/4) Epoch 34, batch 1950, loss[loss=0.1547, simple_loss=0.2479, pruned_loss=0.03071, over 24344.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2403, pruned_loss=0.04076, over 4730884.85 frames. ], batch size: 74, lr: 3.00e-03, grad_scale: 8.0 2023-10-03 07:11:09,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:11:09,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:11:09,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:11:11,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:11:14,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:11:16,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:11:16,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:16,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:11:19,957 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.878e+02 2.003e+02 2.217e+02 3.435e+02, threshold=4.007e+02, percent-clipped=0.0 2023-10-03 07:11:20,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 07:11:20,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 07:11:20,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:22,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:24,227 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:11:25,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:11:25,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:11:25,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:28,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:11:31,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:11:31,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:11:32,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:11:32,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:35,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:39,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:11:39,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:11:39,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:11:39,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 07:11:40,414 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.60 vs. limit=6.0 2023-10-03 07:11:41,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:11:41,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:11:42,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:11:47,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:11:48,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:11:52,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:11:54,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:11:55,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1181866.6666666667, ans=0.5 2023-10-03 07:11:56,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:11:56,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 07:11:56,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:11:59,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:12:00,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:12:02,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:12:08,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:08,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:10,087 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.91 vs. limit=10.0 2023-10-03 07:12:10,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:12,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:12:13,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:12:15,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:12:15,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 07:12:15,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:12:17,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:12:17,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 07:12:19,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:12:23,803 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.47 vs. limit=10.0 2023-10-03 07:12:24,235 INFO [train.py:1046] (3/4) Epoch 34, batch 2000, loss[loss=0.1805, simple_loss=0.249, pruned_loss=0.05598, over 23831.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2408, pruned_loss=0.04128, over 4729430.19 frames. ], batch size: 195, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:12:24,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:12:25,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:12:27,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:12:29,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:12:29,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:12:33,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 07:12:34,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:12:36,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:12:37,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 07:12:37,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:12:37,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:12:41,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:12:41,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 07:12:42,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:44,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:44,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:46,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 07:12:46,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:12:48,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 07:12:48,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:12:53,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:12:53,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 07:12:53,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:12:55,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:12:55,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:12:56,032 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.65 vs. limit=15.0 2023-10-03 07:12:56,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 07:12:58,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1182133.3333333333, ans=0.0 2023-10-03 07:13:00,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 07:13:00,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:13:00,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:05,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:06,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:13:06,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:13:06,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:13:09,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:13:10,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:12,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:13:12,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:13,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:16,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:13:16,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 07:13:19,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:13:21,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:26,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:26,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:13:29,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:31,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:13:31,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:33,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:13:33,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:13:37,624 INFO [train.py:1046] (3/4) Epoch 34, batch 2050, loss[loss=0.1529, simple_loss=0.2371, pruned_loss=0.03438, over 24383.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2394, pruned_loss=0.04053, over 4734141.69 frames. ], batch size: 77, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:13:37,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:13:37,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:40,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:13:42,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:42,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1182333.3333333333, ans=15.0 2023-10-03 07:13:44,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:13:46,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:13:47,557 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.844e+02 2.070e+02 2.334e+02 4.430e+02, threshold=4.140e+02, percent-clipped=1.0 2023-10-03 07:13:47,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:13:47,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:13:50,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 07:13:50,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:13:52,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:13:52,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:13:59,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1182400.0, ans=0.125 2023-10-03 07:14:02,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:14:02,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1182400.0, ans=0.125 2023-10-03 07:14:03,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:14:04,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1182400.0, ans=0.125 2023-10-03 07:14:05,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 07:14:07,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:14:09,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 07:14:10,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:14:13,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:14:14,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:14:15,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1182466.6666666667, ans=0.125 2023-10-03 07:14:16,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:14:16,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:14:17,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:14:17,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:14:19,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:14:22,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:14:25,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:14:27,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:14:28,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:14:34,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:14:37,123 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.83 vs. limit=15.0 2023-10-03 07:14:37,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:14:39,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 07:14:39,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1182600.0, ans=0.09899494936611666 2023-10-03 07:14:44,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:14:44,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:14:47,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:14:49,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 07:14:50,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1182666.6666666667, ans=0.125 2023-10-03 07:14:52,201 INFO [train.py:1046] (3/4) Epoch 34, batch 2100, loss[loss=0.132, simple_loss=0.2119, pruned_loss=0.02605, over 21936.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2387, pruned_loss=0.04033, over 4731735.17 frames. ], batch size: 48, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:14:52,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1182666.6666666667, ans=0.05 2023-10-03 07:14:54,091 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 07:14:54,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:14:54,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:14:54,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:14:54,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:14:54,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 07:14:56,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 07:14:57,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:14:57,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1182666.6666666667, ans=0.0 2023-10-03 07:14:57,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1182666.6666666667, ans=0.0 2023-10-03 07:15:00,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:15:02,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:15:04,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:04,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:15:05,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 07:15:06,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:15:06,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 07:15:06,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 07:15:08,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:09,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:15:09,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 07:15:09,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 07:15:12,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1182733.3333333333, ans=0.5 2023-10-03 07:15:14,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 07:15:14,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:15:15,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1182733.3333333333, ans=0.0 2023-10-03 07:15:17,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:15:17,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1182733.3333333333, ans=0.0 2023-10-03 07:15:19,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:15:20,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1182800.0, ans=0.125 2023-10-03 07:15:23,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:15:23,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 07:15:23,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:15:23,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 07:15:25,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 07:15:25,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:25,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 07:15:25,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 07:15:27,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 07:15:28,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:15:30,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:15:33,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:15:34,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:15:35,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:36,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:15:36,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 07:15:36,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:38,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:15:38,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:38,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 07:15:39,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 07:15:40,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 07:15:41,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1182866.6666666667, ans=0.0 2023-10-03 07:15:43,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:15:47,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:15:47,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 07:15:52,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:55,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:15:55,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:15:55,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:15:55,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 07:15:55,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:15:57,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:15:57,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:15:59,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:15:59,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:15:59,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1182933.3333333333, ans=10.0 2023-10-03 07:16:00,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 07:16:02,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 07:16:02,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:16:05,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:16:05,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:16:06,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:16:06,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:16:08,082 INFO [train.py:1046] (3/4) Epoch 34, batch 2150, loss[loss=0.1395, simple_loss=0.2198, pruned_loss=0.02957, over 24581.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2375, pruned_loss=0.0402, over 4728452.60 frames. ], batch size: 60, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:16:13,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 07:16:14,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:16:16,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:17,650 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.865e+02 1.979e+02 2.220e+02 3.230e+02, threshold=3.959e+02, percent-clipped=0.0 2023-10-03 07:16:17,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:16:17,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:19,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:16:21,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:21,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:16:21,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:16:26,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:26,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 07:16:30,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:16:31,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:16:32,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:33,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:16:33,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:35,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:16:35,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:16:36,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:16:37,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:16:37,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 07:16:40,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:16:41,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:42,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1183133.3333333333, ans=0.125 2023-10-03 07:16:43,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:16:43,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:16:44,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:16:47,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:16:47,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:16:49,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:16:49,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 07:16:49,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:16:52,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:16:52,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:53,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:16:55,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:16:56,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:16:56,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:16:56,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 07:16:59,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 07:16:59,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:17:00,663 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 07:17:00,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:00,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:17:02,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 07:17:02,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:17:02,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 07:17:02,711 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 07:17:02,712 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 07:17:04,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 07:17:06,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:06,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:17:06,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:17:07,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:07,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:17:09,796 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.68 vs. limit=15.0 2023-10-03 07:17:10,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:10,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:17,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:17:18,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 07:17:21,360 INFO [train.py:1046] (3/4) Epoch 34, batch 2200, loss[loss=0.1648, simple_loss=0.2499, pruned_loss=0.03984, over 23741.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2376, pruned_loss=0.03966, over 4734596.35 frames. ], batch size: 85, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:17:22,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:17:28,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:28,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:17:30,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:17:30,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:17:31,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:17:31,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:17:32,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 07:17:38,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 07:17:39,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:17:45,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 07:17:46,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:46,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:17:47,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1183400.0, ans=0.0 2023-10-03 07:17:48,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:17:51,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:17:51,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 07:17:55,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:17:55,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:17:57,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 07:18:00,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:18:02,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:18:02,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1183466.6666666667, ans=0.125 2023-10-03 07:18:03,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:18:05,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:08,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 07:18:08,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:18:11,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 07:18:13,503 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.71 vs. limit=10.0 2023-10-03 07:18:13,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:13,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:18:13,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:16,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:18:16,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:18:16,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:18:18,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:18:18,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1183533.3333333333, ans=0.0 2023-10-03 07:18:19,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:18:19,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:18:22,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:18:23,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 07:18:24,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:18:26,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:18:28,242 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 07:18:29,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:18:29,761 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 07:18:30,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1183600.0, ans=0.1 2023-10-03 07:18:31,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:18:31,182 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 07:18:34,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:18:35,756 INFO [train.py:1046] (3/4) Epoch 34, batch 2250, loss[loss=0.1711, simple_loss=0.2439, pruned_loss=0.04909, over 23722.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2385, pruned_loss=0.04003, over 4741272.64 frames. ], batch size: 232, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:18:35,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:18:35,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:18:39,648 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 07:18:39,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1183666.6666666667, ans=0.0 2023-10-03 07:18:41,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:18:42,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:18:45,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:18:46,636 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.861e+02 2.051e+02 2.381e+02 3.141e+02, threshold=4.103e+02, percent-clipped=0.0 2023-10-03 07:18:47,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:18:50,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:18:51,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:18:53,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:18:54,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 07:18:54,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:18:54,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:18:57,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 07:18:57,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:18:57,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:18:59,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:19:07,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:19:09,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:19:09,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:19:09,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1183800.0, ans=0.0 2023-10-03 07:19:11,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 07:19:12,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:19:13,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:19:18,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:19:19,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:19:20,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:19:20,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:19:22,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:19:25,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:19:29,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:19:30,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:19:32,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1183866.6666666667, ans=0.04949747468305833 2023-10-03 07:19:35,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:19:37,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:19:37,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:19:42,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:19:43,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:19:43,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 07:19:45,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:19:45,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:19:47,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 07:19:51,021 INFO [train.py:1046] (3/4) Epoch 34, batch 2300, loss[loss=0.1524, simple_loss=0.2378, pruned_loss=0.03356, over 24531.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2398, pruned_loss=0.04058, over 4741868.90 frames. ], batch size: 66, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:19:51,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:19:51,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:19:52,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1184000.0, ans=0.1 2023-10-03 07:19:56,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:19:58,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:19:59,524 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 07:20:00,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.41 vs. limit=15.0 2023-10-03 07:20:00,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:20:07,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1184066.6666666667, ans=0.125 2023-10-03 07:20:07,721 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:20:09,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:20:09,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 07:20:09,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:20:10,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:20:10,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 07:20:10,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:20:14,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:20:14,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:20:14,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1184066.6666666667, ans=0.0 2023-10-03 07:20:17,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:20:18,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:20:22,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:20:25,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:20:26,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:20:26,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1184133.3333333333, ans=0.125 2023-10-03 07:20:28,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1184133.3333333333, ans=0.95 2023-10-03 07:20:29,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:20:32,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:20:35,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:20:37,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:20:37,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:20:37,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 07:20:38,138 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.87 vs. limit=15.0 2023-10-03 07:20:41,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:20:41,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:20:42,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:20:43,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:20:43,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:20:43,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 07:20:43,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:20:45,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 07:20:45,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:20:45,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:20:45,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 07:20:49,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:20:52,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:20:56,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:20:57,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:20:57,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:21:00,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:21:00,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:21:00,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:21:01,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 07:21:05,072 INFO [train.py:1046] (3/4) Epoch 34, batch 2350, loss[loss=0.1482, simple_loss=0.2376, pruned_loss=0.0294, over 24562.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2408, pruned_loss=0.041, over 4732004.73 frames. ], batch size: 71, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:21:08,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:21:08,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 07:21:13,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 07:21:16,333 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.838e+02 1.985e+02 2.277e+02 3.152e+02, threshold=3.970e+02, percent-clipped=0.0 2023-10-03 07:21:16,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1184333.3333333333, ans=0.0 2023-10-03 07:21:17,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:21:20,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:21:20,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:21:21,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:21:21,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:21:23,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 07:21:23,572 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:21:26,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:21:30,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 07:21:31,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:21:34,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:21:34,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:21:36,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:21:39,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 07:21:39,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:21:42,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:21:42,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:21:42,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:21:46,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:21:46,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 07:21:47,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:21:47,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1184466.6666666667, ans=0.0 2023-10-03 07:21:50,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:21:50,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:21:51,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 07:21:53,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:21:54,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1184533.3333333333, ans=0.0 2023-10-03 07:21:55,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 07:21:55,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:22:00,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 07:22:04,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 07:22:04,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:22:05,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 07:22:06,939 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 07:22:06,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 07:22:08,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 07:22:09,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.17 vs. limit=15.0 2023-10-03 07:22:11,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:22:16,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:22:19,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:22:20,820 INFO [train.py:1046] (3/4) Epoch 34, batch 2400, loss[loss=0.1511, simple_loss=0.232, pruned_loss=0.03513, over 24376.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2406, pruned_loss=0.04092, over 4734960.87 frames. ], batch size: 61, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:22:22,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:22:23,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 07:22:23,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 07:22:23,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1184666.6666666667, ans=0.1 2023-10-03 07:22:29,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:22:29,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:22:31,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 07:22:31,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:22:33,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:22:33,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 07:22:40,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:22:41,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 07:22:47,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:22:52,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 07:22:54,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:22:55,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:22:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:22:59,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 07:22:59,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1184800.0, ans=0.125 2023-10-03 07:23:00,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:23:09,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:23:11,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:23:11,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff3.min_abs, batch_count=1184866.6666666667, ans=0.2 2023-10-03 07:23:14,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:14,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:23:15,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 07:23:15,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:23:15,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:23:17,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:23:17,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:23:19,819 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.01 vs. limit=22.5 2023-10-03 07:23:21,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:23:21,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:23:21,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 07:23:23,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 07:23:26,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:23:26,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:23:27,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 07:23:27,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 07:23:28,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 07:23:28,732 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 07:23:28,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 07:23:30,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:23:30,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1184933.3333333333, ans=0.0 2023-10-03 07:23:31,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:23:31,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:23:31,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1184933.3333333333, ans=0.0 2023-10-03 07:23:32,986 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 07:23:34,290 INFO [train.py:1046] (3/4) Epoch 34, batch 2450, loss[loss=0.1677, simple_loss=0.2358, pruned_loss=0.04977, over 23774.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2391, pruned_loss=0.04072, over 4737780.56 frames. ], batch size: 195, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:23:34,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:23:34,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 07:23:37,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:23:37,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:23:40,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1185000.0, ans=0.0 2023-10-03 07:23:41,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:41,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:23:43,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 07:23:47,260 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.892e+02 2.133e+02 2.562e+02 4.061e+02, threshold=4.265e+02, percent-clipped=1.0 2023-10-03 07:23:47,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:23:47,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:50,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:23:50,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:23:50,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:23:50,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 07:23:52,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1185066.6666666667, ans=0.125 2023-10-03 07:23:54,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:23:56,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:23:57,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:24:01,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:24:01,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:24:03,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:24:03,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:24:04,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 07:24:04,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:24:11,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1185133.3333333333, ans=0.125 2023-10-03 07:24:12,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:24:12,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:24:14,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:24:14,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:24:14,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:24:15,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:24:17,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 07:24:21,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:24:22,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:24:25,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:24:25,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:24:29,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:24:29,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 07:24:31,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:24:32,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:24:32,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 07:24:32,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:24:34,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:24:37,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:24:40,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:24:41,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:24:44,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 07:24:47,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:24:49,890 INFO [train.py:1046] (3/4) Epoch 34, batch 2500, loss[loss=0.1618, simple_loss=0.2446, pruned_loss=0.03952, over 23531.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2378, pruned_loss=0.04033, over 4735458.51 frames. ], batch size: 106, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:24:51,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:24:55,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.94 vs. limit=15.0 2023-10-03 07:24:58,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1185333.3333333333, ans=0.0 2023-10-03 07:25:01,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:25:01,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:25:01,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:25:01,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 07:25:08,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:25:09,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:25:11,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 07:25:11,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:25:12,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 07:25:14,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:14,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:25:15,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 07:25:15,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:17,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 07:25:17,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:25:23,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:25:25,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:25:28,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:25:28,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 07:25:28,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:25:29,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:32,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:25:36,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:25:39,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:25:43,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:25:46,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 07:25:46,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:25:46,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:25:48,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:25:48,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:25:50,317 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 07:25:50,318 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 07:25:50,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 07:25:52,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.15 vs. limit=15.0 2023-10-03 07:25:54,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:25:56,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 07:25:57,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 07:25:57,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:25:58,270 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.00 vs. limit=22.5 2023-10-03 07:25:59,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 07:26:03,118 INFO [train.py:1046] (3/4) Epoch 34, batch 2550, loss[loss=0.1559, simple_loss=0.2382, pruned_loss=0.03676, over 24566.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2384, pruned_loss=0.04031, over 4735283.86 frames. ], batch size: 60, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:26:03,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 07:26:03,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1185666.6666666667, ans=0.125 2023-10-03 07:26:04,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:26:04,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:26:06,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:26:07,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:26:09,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 07:26:09,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:26:13,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 07:26:13,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:26:14,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1185666.6666666667, ans=0.125 2023-10-03 07:26:15,135 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.964e+02 2.224e+02 2.724e+02 4.382e+02, threshold=4.447e+02, percent-clipped=1.0 2023-10-03 07:26:15,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:26:18,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:26:18,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 07:26:18,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:26:18,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:26:20,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:26:21,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:26:21,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 07:26:21,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:26:21,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:26:23,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 07:26:23,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1185733.3333333333, ans=0.0 2023-10-03 07:26:26,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1185733.3333333333, ans=0.125 2023-10-03 07:26:34,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1185800.0, ans=0.125 2023-10-03 07:26:36,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:26:40,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:26:40,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:26:42,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:26:43,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:26:49,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:26:52,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:26:54,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:26:54,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:26:54,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:26:54,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:26:57,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:26:57,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:27:02,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:27:02,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 07:27:02,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:27:03,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:27:04,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:27:04,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:27:05,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:11,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:27:11,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1185933.3333333333, ans=0.0 2023-10-03 07:27:11,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1185933.3333333333, ans=0.0 2023-10-03 07:27:11,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1185933.3333333333, ans=0.0 2023-10-03 07:27:13,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:14,932 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 07:27:16,959 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 07:27:16,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:27:18,266 INFO [train.py:1046] (3/4) Epoch 34, batch 2600, loss[loss=0.1681, simple_loss=0.2424, pruned_loss=0.0469, over 23768.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.239, pruned_loss=0.04024, over 4733520.38 frames. ], batch size: 164, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:27:18,341 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 07:27:19,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 07:27:19,682 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 07:27:21,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1186000.0, ans=0.2 2023-10-03 07:27:22,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:27:22,898 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 07:27:24,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 07:27:26,156 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 07:27:27,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:27:29,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 07:27:30,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 07:27:31,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:27:31,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 07:27:33,307 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 07:27:33,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 07:27:35,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1186066.6666666667, ans=0.125 2023-10-03 07:27:38,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:27:38,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:38,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:27:38,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 07:27:41,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:27:41,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1186066.6666666667, ans=0.0 2023-10-03 07:27:44,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1186066.6666666667, ans=15.0 2023-10-03 07:27:47,831 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 07:27:51,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1186133.3333333333, ans=0.125 2023-10-03 07:27:52,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1186133.3333333333, ans=0.1 2023-10-03 07:27:54,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:27:55,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:27:57,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 07:27:57,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:27:57,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:27:58,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 07:27:59,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:28:00,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:28:01,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:28:04,448 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 07:28:04,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1186200.0, ans=0.0 2023-10-03 07:28:05,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:28:05,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:28:07,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1186200.0, ans=0.0 2023-10-03 07:28:08,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:28:09,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:28:09,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 07:28:13,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:28:15,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:28:16,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:28:21,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 07:28:22,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:28:23,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:28:28,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 07:28:28,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:28:28,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:28:29,665 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 07:28:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:28:32,265 INFO [train.py:1046] (3/4) Epoch 34, batch 2650, loss[loss=0.1609, simple_loss=0.2333, pruned_loss=0.04421, over 23409.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2403, pruned_loss=0.04107, over 4721571.58 frames. ], batch size: 120, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:28:32,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:28:35,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:28:36,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:28:39,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:28:40,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 07:28:40,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:28:41,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:28:43,950 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.874e+02 2.143e+02 2.497e+02 3.678e+02, threshold=4.285e+02, percent-clipped=0.0 2023-10-03 07:28:44,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 07:28:47,255 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 07:28:49,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:28:52,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 07:28:52,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:28:53,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 07:28:57,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:28:57,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:28:57,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:28:57,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:02,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1186466.6666666667, ans=0.09899494936611666 2023-10-03 07:29:03,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 07:29:03,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 07:29:06,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:29:09,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 07:29:09,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:29:09,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:09,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:29:10,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:29:10,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:29:11,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:29:15,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:29:16,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:29:16,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:29:19,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:29:20,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:21,540 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.19 vs. limit=22.5 2023-10-03 07:29:22,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:29:22,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:25,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:29:25,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 07:29:29,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:30,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:29:30,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:30,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 07:29:37,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:29:37,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:38,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:29:38,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:39,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:29:39,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:42,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:29:42,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 07:29:45,423 INFO [train.py:1046] (3/4) Epoch 34, batch 2700, loss[loss=0.1564, simple_loss=0.2291, pruned_loss=0.0419, over 23332.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.241, pruned_loss=0.04132, over 4728514.71 frames. ], batch size: 285, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:29:45,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:29:48,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 07:29:51,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:29:51,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:51,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:29:53,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:29:53,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:29:53,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:29:53,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 07:29:53,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 07:29:55,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:29:56,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:29:57,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:29:59,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:30:02,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:30:03,017 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.42 vs. limit=12.0 2023-10-03 07:30:03,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 07:30:03,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:30:08,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:30:08,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:30:09,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1186733.3333333333, ans=0.125 2023-10-03 07:30:13,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:30:13,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:30:13,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:30:13,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:30:17,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:30:19,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:30:19,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:30:19,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:30:21,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1186800.0, ans=0.0 2023-10-03 07:30:21,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1186800.0, ans=0.125 2023-10-03 07:30:27,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:30:27,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:30:30,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1186866.6666666667, ans=0.125 2023-10-03 07:30:32,450 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.45 vs. limit=22.5 2023-10-03 07:30:33,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:30:33,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:30:36,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1186866.6666666667, ans=0.1 2023-10-03 07:30:37,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:30:37,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:30:40,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:30:42,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:30:43,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:30:44,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:30:46,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:30:46,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:30:47,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:30:49,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:30:49,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:30:52,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 07:30:53,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:30:55,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:30:55,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 07:30:56,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 07:30:56,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:30:59,384 INFO [train.py:1046] (3/4) Epoch 34, batch 2750, loss[loss=0.1578, simple_loss=0.2367, pruned_loss=0.03947, over 23858.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2399, pruned_loss=0.04135, over 4723777.11 frames. ], batch size: 195, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:30:59,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:30:59,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:31:02,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:02,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:31:03,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:06,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:31:06,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:31:07,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:31:07,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:07,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 07:31:07,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:31:07,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:31:10,303 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.859e+02 2.069e+02 2.359e+02 3.471e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 07:31:17,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 07:31:19,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:31:19,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:19,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:31:20,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:31:21,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:31:22,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:31:23,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:31:23,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:31:26,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1187066.6666666667, ans=0.125 2023-10-03 07:31:28,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:31:28,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:31:29,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:31:30,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:31,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:31:38,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:31:39,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:31:40,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:31:43,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:31:43,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:31:43,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:31:49,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:31:49,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:31:49,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 07:31:53,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:31:56,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 07:32:00,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 07:32:03,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:32:03,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 07:32:04,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:32:06,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:32:06,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 07:32:06,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:32:07,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 07:32:07,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1187266.6666666667, ans=0.125 2023-10-03 07:32:09,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:32:09,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:32:09,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 07:32:09,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:32:11,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:32:11,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1187333.3333333333, ans=0.125 2023-10-03 07:32:12,369 INFO [train.py:1046] (3/4) Epoch 34, batch 2800, loss[loss=0.1671, simple_loss=0.2313, pruned_loss=0.05143, over 23912.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2389, pruned_loss=0.04113, over 4717766.39 frames. ], batch size: 212, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:32:12,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:32:13,829 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 07:32:13,830 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 07:32:14,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1187333.3333333333, ans=0.04949747468305833 2023-10-03 07:32:16,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:32:20,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:32:20,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:32:22,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1187333.3333333333, ans=0.0 2023-10-03 07:32:23,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:32:24,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1187333.3333333333, ans=0.125 2023-10-03 07:32:25,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 07:32:26,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 07:32:28,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 07:32:28,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:32:29,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:32:29,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:32:33,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:32:33,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:32:33,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:32:35,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:32:41,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:32:43,500 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.18 vs. limit=12.0 2023-10-03 07:32:44,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:32:46,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:32:46,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:32:48,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:32:52,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:32:53,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 07:32:54,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:32:54,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:32:54,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:32:57,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1187533.3333333333, ans=0.1 2023-10-03 07:32:58,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:32:59,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:33:01,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:33:01,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1187533.3333333333, ans=0.125 2023-10-03 07:33:04,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:33:04,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:33:04,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:33:05,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:33:05,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:33:06,469 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.12 vs. limit=10.0 2023-10-03 07:33:07,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:33:07,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 07:33:07,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:08,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:33:08,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:10,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 07:33:12,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:33:12,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:33:13,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:33:14,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 07:33:22,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:33:22,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:33:22,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1187600.0, ans=0.0 2023-10-03 07:33:23,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:33:25,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:33:27,108 INFO [train.py:1046] (3/4) Epoch 34, batch 2850, loss[loss=0.1747, simple_loss=0.2589, pruned_loss=0.04523, over 24370.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2387, pruned_loss=0.04089, over 4719047.72 frames. ], batch size: 77, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:33:29,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:33:29,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:33:29,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:33:32,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:33:34,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:33:35,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:33:35,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 07:33:38,279 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.832e+02 1.949e+02 2.119e+02 3.126e+02, threshold=3.897e+02, percent-clipped=0.0 2023-10-03 07:33:40,704 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.28 vs. limit=15.0 2023-10-03 07:33:41,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 07:33:41,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:33:42,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 07:33:44,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:33:47,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 07:33:47,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 07:33:49,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:01,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:34:03,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:34:03,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:34:03,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 07:34:03,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:34:03,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:34:05,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.94 vs. limit=15.0 2023-10-03 07:34:06,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:34:06,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 07:34:07,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:34:08,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:34:08,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:34:08,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:11,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1187866.6666666667, ans=0.125 2023-10-03 07:34:12,266 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.53 vs. limit=15.0 2023-10-03 07:34:12,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:34:12,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:34:14,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:34:14,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:34:16,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:34:17,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:17,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:34:20,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:34:25,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:34:27,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 07:34:27,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 07:34:28,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:34:28,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:34:29,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 07:34:31,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:34:31,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:34:31,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:34:32,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:34:32,635 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 07:34:32,684 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 07:34:32,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:34:32,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:34:36,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:34:36,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:34:38,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:34:38,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 07:34:38,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1187933.3333333333, ans=0.07 2023-10-03 07:34:40,968 INFO [train.py:1046] (3/4) Epoch 34, batch 2900, loss[loss=0.1712, simple_loss=0.2565, pruned_loss=0.04297, over 24147.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2387, pruned_loss=0.04071, over 4711956.61 frames. ], batch size: 80, lr: 3.00e-03, grad_scale: 32.0 2023-10-03 07:34:41,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:41,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 07:34:42,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 07:34:45,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:34:45,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:34:45,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1188000.0, ans=0.0 2023-10-03 07:34:48,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:34:48,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:34:53,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:34:53,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:34:56,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:34:57,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 07:34:57,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:34:59,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:35:02,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 07:35:02,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 07:35:05,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:35:05,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 07:35:05,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:35:08,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:35:08,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 07:35:11,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:35:12,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1188133.3333333333, ans=0.0 2023-10-03 07:35:13,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:35:15,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:35:18,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:35:21,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 07:35:21,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 07:35:21,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:35:26,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:35:27,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 07:35:28,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:35:34,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:35:42,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1188266.6666666667, ans=0.0 2023-10-03 07:35:43,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:35:43,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:35:43,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 07:35:46,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:35:46,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 07:35:46,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1188266.6666666667, ans=0.125 2023-10-03 07:35:48,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:35:48,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:35:52,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:35:54,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 07:35:55,845 INFO [train.py:1046] (3/4) Epoch 34, batch 2950, loss[loss=0.1546, simple_loss=0.2374, pruned_loss=0.03588, over 24656.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2399, pruned_loss=0.04068, over 4723553.95 frames. ], batch size: 65, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:35:55,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:35:55,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:35:57,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:35:59,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:36:00,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 07:36:00,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 07:36:00,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:36:00,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:36:05,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:36:06,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1188333.3333333333, ans=0.125 2023-10-03 07:36:07,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:36:08,912 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.911e+02 2.066e+02 2.316e+02 3.128e+02, threshold=4.131e+02, percent-clipped=0.0 2023-10-03 07:36:10,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:36:10,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:36:12,426 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:36:13,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:36:13,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:36:14,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:36:15,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1188400.0, ans=0.125 2023-10-03 07:36:16,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:36:16,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:36:19,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 07:36:25,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 07:36:26,316 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 07:36:26,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:36:29,633 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 07:36:31,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 07:36:31,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:36:31,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:36:31,240 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 07:36:31,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:36:33,146 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.98 vs. limit=12.0 2023-10-03 07:36:33,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 07:36:35,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:36:35,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:36:38,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:36:39,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:36:39,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:36:39,831 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 07:36:39,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:36:41,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 07:36:41,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1188533.3333333333, ans=0.0 2023-10-03 07:36:46,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:36:48,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:36:48,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 07:36:49,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:36:49,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1188533.3333333333, ans=0.125 2023-10-03 07:36:51,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 07:36:52,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:36:54,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:36:55,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:36:57,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:36:57,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:36:58,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:36:58,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:36:58,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:37:00,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:37:00,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:37:02,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:37:03,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:37:03,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 07:37:04,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:37:06,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:37:06,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:37:09,469 INFO [train.py:1046] (3/4) Epoch 34, batch 3000, loss[loss=0.2173, simple_loss=0.2823, pruned_loss=0.07621, over 19341.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2416, pruned_loss=0.04154, over 4710480.39 frames. ], batch size: 388, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:37:09,470 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 07:37:21,179 INFO [train.py:1078] (3/4) Epoch 34, validation: loss=0.3506, simple_loss=0.2704, pruned_loss=0.2154, over 1125622.00 frames. 2023-10-03 07:37:21,179 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 07:37:21,302 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 07:37:22,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 07:37:24,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:37:24,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:37:25,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 07:37:25,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:37:26,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1188666.6666666667, ans=0.125 2023-10-03 07:37:28,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1188666.6666666667, ans=0.125 2023-10-03 07:37:32,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:37:40,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:37:46,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 07:37:48,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:37:49,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:37:49,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:37:51,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:37:53,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:37:53,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 07:37:54,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 07:37:55,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:37:57,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:37:59,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:37:59,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:37:59,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:37:59,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:38:01,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1188800.0, ans=0.1 2023-10-03 07:38:03,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:38:03,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:38:03,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:38:05,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:38:07,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 07:38:09,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:38:10,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:10,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:38:10,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1188866.6666666667, ans=0.2 2023-10-03 07:38:14,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:38:14,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:38:15,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1188866.6666666667, ans=0.125 2023-10-03 07:38:16,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 07:38:16,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 07:38:16,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:38:18,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 07:38:18,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:38:18,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1188866.6666666667, ans=0.125 2023-10-03 07:38:19,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 07:38:21,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:38:24,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:38:24,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 07:38:25,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 07:38:25,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:38:25,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:38:26,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:38:26,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:38:26,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:28,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:38:31,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1188933.3333333333, ans=0.125 2023-10-03 07:38:32,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 07:38:34,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:38:35,345 INFO [train.py:1046] (3/4) Epoch 34, batch 3050, loss[loss=0.1566, simple_loss=0.245, pruned_loss=0.03411, over 24452.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.242, pruned_loss=0.04192, over 4696586.94 frames. ], batch size: 69, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:38:36,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:38:36,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:38:40,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:43,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 07:38:47,568 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.885e+02 2.088e+02 2.309e+02 3.994e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 07:38:50,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 07:38:50,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 07:38:52,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:38:55,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1189066.6666666667, ans=0.1 2023-10-03 07:38:56,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:38:58,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:38:59,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:38:59,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:39:00,343 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.77 vs. limit=6.0 2023-10-03 07:39:02,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:39:02,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:39:02,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:39:04,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:39:04,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:39:05,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:39:07,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:11,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:39:11,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 07:39:11,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:39:11,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:39:15,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:39:15,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:39:15,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1189133.3333333333, ans=0.1 2023-10-03 07:39:16,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:39:16,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:22,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:39:22,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:28,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:28,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:39:28,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:39:31,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:39:31,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:39:31,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:39:33,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 07:39:34,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:39:34,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:35,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 07:39:37,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:43,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:39:44,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:39:47,527 INFO [train.py:1046] (3/4) Epoch 34, batch 3100, loss[loss=0.1633, simple_loss=0.2499, pruned_loss=0.03839, over 24355.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2413, pruned_loss=0.04167, over 4699456.63 frames. ], batch size: 77, lr: 3.00e-03, grad_scale: 16.0 2023-10-03 07:39:47,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 07:39:49,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 07:39:51,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 07:39:51,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 07:39:54,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:39:58,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:39:58,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:39:59,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 07:40:02,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:40:03,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1189400.0, ans=0.04949747468305833 2023-10-03 07:40:05,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1189400.0, ans=0.125 2023-10-03 07:40:08,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 07:40:13,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 07:40:13,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:13,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:40:14,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:40:15,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 07:40:17,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:40:17,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 07:40:17,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:40:18,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:40:18,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 07:40:20,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:40:21,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:40:22,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 07:40:24,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 07:40:26,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:27,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:40:29,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:40:29,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:29,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:40:30,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:40:30,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:40:34,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:40:34,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:40:34,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:34,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 07:40:36,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1189533.3333333333, ans=0.125 2023-10-03 07:40:38,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:40:40,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 07:40:41,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:40:41,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 07:40:41,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:40:41,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:40:43,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 07:40:44,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1189533.3333333333, ans=0.2 2023-10-03 07:40:53,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 07:40:56,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:40:58,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:41:01,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:41:01,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:41:01,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 07:41:02,590 INFO [train.py:1046] (3/4) Epoch 34, batch 3150, loss[loss=0.1509, simple_loss=0.2109, pruned_loss=0.04542, over 22677.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2392, pruned_loss=0.04129, over 4675252.77 frames. ], batch size: 322, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:41:02,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:41:02,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 07:41:04,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 07:41:05,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:41:06,074 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:41:08,627 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 07:41:08,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1189666.6666666667, ans=0.0 2023-10-03 07:41:10,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 07:41:10,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:41:10,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1189666.6666666667, ans=0.125 2023-10-03 07:41:11,417 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 07:41:11,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 07:41:14,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 07:41:14,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 07:41:14,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 07:41:14,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:41:14,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:41:15,391 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.879e+02 2.066e+02 2.438e+02 3.109e+02, threshold=4.131e+02, percent-clipped=0.0 2023-10-03 07:41:16,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:41:18,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 07:41:18,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:41:18,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:41:19,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:41:19,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 07:41:20,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1189733.3333333333, ans=0.2 2023-10-03 07:41:21,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1189733.3333333333, ans=0.5 2023-10-03 07:41:25,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.68 vs. limit=10.0 2023-10-03 07:41:25,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 07:41:25,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:41:30,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:41:30,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:41:30,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 07:41:34,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 07:41:34,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:41:35,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 07:41:35,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 07:41:36,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:41:36,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:41:37,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:41:37,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 07:41:38,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 07:41:39,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:41:39,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:41:39,844 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1189800.0, ans=0.125 2023-10-03 07:41:42,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:41:42,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:41:42,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 07:41:43,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:41:45,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 07:41:46,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:41:46,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 07:41:47,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 07:41:50,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:41:50,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:41:50,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1189866.6666666667, ans=0.0 2023-10-03 07:41:50,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1189866.6666666667, ans=0.2 2023-10-03 07:41:52,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 07:41:52,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 07:41:52,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1189866.6666666667, ans=0.0 2023-10-03 07:41:54,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:41:55,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:41:56,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:41:57,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1189866.6666666667, ans=0.1 2023-10-03 07:41:58,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:42:03,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:42:03,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:42:05,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 07:42:11,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:42:11,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 07:42:12,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1189933.3333333333, ans=0.05 2023-10-03 07:42:13,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:42:15,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:42:15,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 07:42:16,677 INFO [train.py:1046] (3/4) Epoch 34, batch 3200, loss[loss=0.1622, simple_loss=0.2456, pruned_loss=0.03935, over 24652.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2379, pruned_loss=0.04102, over 4680583.38 frames. ], batch size: 68, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 07:42:16,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:42:20,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:42:24,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:42:27,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1190000.0, ans=0.95 2023-10-03 07:42:34,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:42:41,156 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:42:44,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 07:42:46,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:42:50,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 07:42:50,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:42:53,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:42:53,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:42:54,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:42:58,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 07:42:59,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 07:43:02,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 07:43:04,759 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.17 vs. limit=22.5 2023-10-03 07:43:05,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 07:43:08,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:43:12,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:43:12,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 07:43:14,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:43:14,314 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 07:43:14,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:43:18,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:43:19,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 07:43:19,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 07:43:21,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 07:43:21,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 07:43:22,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:43:24,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:43:24,113 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 07:43:25,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:43:25,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:27,252 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 07:43:30,667 INFO [train.py:1046] (3/4) Epoch 34, batch 3250, loss[loss=0.17, simple_loss=0.2566, pruned_loss=0.04173, over 24336.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2383, pruned_loss=0.04114, over 4686300.84 frames. ], batch size: 77, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:43:30,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:43:32,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1190333.3333333333, ans=0.125 2023-10-03 07:43:33,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:43:44,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:43:44,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 07:43:44,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1190400.0, ans=0.2 2023-10-03 07:43:45,398 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.836e+02 2.055e+02 2.276e+02 3.582e+02, threshold=4.110e+02, percent-clipped=0.0 2023-10-03 07:43:45,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:43:46,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:43:46,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:43:48,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:43:49,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:43:52,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:52,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:43:52,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:43:52,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:52,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:53,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:43:55,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:43:56,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:43:58,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1190466.6666666667, ans=0.035 2023-10-03 07:43:59,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:43:59,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:43:59,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:43:59,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:43:59,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:44:03,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1190466.6666666667, ans=0.125 2023-10-03 07:44:04,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=1190466.6666666667, ans=15.0 2023-10-03 07:44:04,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 07:44:06,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:44:06,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:44:08,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:44:09,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:44:09,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1190466.6666666667, ans=0.125 2023-10-03 07:44:15,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:44:22,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:44:22,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:44:22,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 07:44:22,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:44:22,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:44:22,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:44:25,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 07:44:26,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 07:44:26,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:44:29,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:44:29,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:44:29,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 07:44:30,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:44:34,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:44:34,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:44:35,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1190600.0, ans=0.1 2023-10-03 07:44:36,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 07:44:36,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:44:40,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:44:40,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 07:44:42,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1190600.0, ans=0.0 2023-10-03 07:44:43,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:44:43,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 07:44:45,006 INFO [train.py:1046] (3/4) Epoch 34, batch 3300, loss[loss=0.1519, simple_loss=0.2372, pruned_loss=0.03328, over 23635.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.239, pruned_loss=0.04128, over 4693236.93 frames. ], batch size: 94, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:44:45,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 07:44:45,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 07:44:47,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:44:47,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1190666.6666666667, ans=0.0 2023-10-03 07:44:50,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:44:51,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:44:51,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:44:54,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 07:44:54,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:44:56,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:44:58,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:45:02,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 07:45:02,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:45:03,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:45:05,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:06,691 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 07:45:06,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:45:08,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:45:08,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 07:45:08,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:45:08,975 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 07:45:13,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:45:13,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:45:15,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:15,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 07:45:16,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 07:45:16,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:18,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:45:20,959 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 07:45:21,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 07:45:22,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:45:23,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 07:45:27,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:45:29,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 07:45:30,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:45:32,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:45:33,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:45:33,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:45:33,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:45:33,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1190866.6666666667, ans=0.125 2023-10-03 07:45:34,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1190866.6666666667, ans=0.0 2023-10-03 07:45:36,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:45:36,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:36,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:45:37,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1190866.6666666667, ans=0.07 2023-10-03 07:45:38,934 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 07:45:40,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 07:45:41,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:45:41,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:45:41,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:45:44,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:45:44,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:45:46,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:45:46,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:45:46,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:45:47,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:45:49,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:45:50,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 07:45:52,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:45:53,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:45:54,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 07:45:54,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:45:56,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:45:57,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:45:57,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:45:59,064 INFO [train.py:1046] (3/4) Epoch 34, batch 3350, loss[loss=0.1704, simple_loss=0.2451, pruned_loss=0.04783, over 23434.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2398, pruned_loss=0.04156, over 4695516.63 frames. ], batch size: 134, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:45:59,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:46:00,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:00,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:46:03,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:07,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:46:08,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:46:09,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:46:09,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 07:46:11,430 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 07:46:11,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:46:14,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 07:46:14,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 07:46:16,358 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.949e+02 2.135e+02 2.589e+02 3.898e+02, threshold=4.270e+02, percent-clipped=0.0 2023-10-03 07:46:16,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:46:16,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:46:17,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:17,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 07:46:17,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:17,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:46:19,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:19,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1191066.6666666667, ans=0.125 2023-10-03 07:46:20,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:20,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:22,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:46:24,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:46:27,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:28,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:46:30,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:46:31,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:46:34,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:34,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:37,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:39,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 07:46:39,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:46:39,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 07:46:40,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:46:41,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 07:46:43,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:46:44,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:46:50,144 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.85 vs. limit=15.0 2023-10-03 07:46:51,116 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.68 vs. limit=6.0 2023-10-03 07:46:52,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:46:53,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 07:46:53,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1191200.0, ans=0.0 2023-10-03 07:46:54,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:46:56,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:46:57,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:47:00,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1191266.6666666667, ans=0.125 2023-10-03 07:47:01,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:47:03,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 07:47:05,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:47:05,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:47:06,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:47:08,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 07:47:08,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:47:08,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 07:47:11,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:47:12,408 INFO [train.py:1046] (3/4) Epoch 34, batch 3400, loss[loss=0.1708, simple_loss=0.2586, pruned_loss=0.04151, over 24412.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2405, pruned_loss=0.04169, over 4695075.77 frames. ], batch size: 77, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:47:12,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:47:12,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1191333.3333333333, ans=0.125 2023-10-03 07:47:13,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 07:47:13,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:47:15,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 07:47:18,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 07:47:18,614 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 07:47:18,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:47:18,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1191333.3333333333, ans=0.05 2023-10-03 07:47:23,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:47:23,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 07:47:24,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:47:24,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:47:28,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:47:30,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 07:47:35,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:47:39,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:47:39,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:47:40,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 07:47:44,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:47:49,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 07:47:51,604 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.19 vs. limit=22.5 2023-10-03 07:47:51,904 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.90 vs. limit=22.5 2023-10-03 07:47:55,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1191533.3333333333, ans=0.1 2023-10-03 07:47:56,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:47:58,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:47:58,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 07:47:58,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:47:58,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:47:59,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:48:00,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:48:02,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:48:05,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:48:05,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:48:12,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:48:14,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 07:48:14,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1191600.0, ans=0.2 2023-10-03 07:48:15,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1191600.0, ans=0.025 2023-10-03 07:48:18,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:48:22,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 07:48:26,335 INFO [train.py:1046] (3/4) Epoch 34, batch 3450, loss[loss=0.1614, simple_loss=0.2478, pruned_loss=0.03749, over 24301.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.241, pruned_loss=0.04174, over 4706239.24 frames. ], batch size: 61, lr: 2.99e-03, grad_scale: 4.0 2023-10-03 07:48:26,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 07:48:27,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:48:29,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:48:29,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 07:48:29,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:48:29,970 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.95 vs. limit=15.0 2023-10-03 07:48:32,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1191666.6666666667, ans=0.125 2023-10-03 07:48:32,742 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.47 vs. limit=22.5 2023-10-03 07:48:33,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:48:37,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:48:39,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:48:39,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:48:39,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:48:42,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:48:44,089 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.945e+02 2.174e+02 2.507e+02 5.518e+02, threshold=4.348e+02, percent-clipped=2.0 2023-10-03 07:48:44,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.70 vs. limit=6.0 2023-10-03 07:48:45,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1191733.3333333333, ans=0.125 2023-10-03 07:48:48,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 07:48:48,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1191733.3333333333, ans=0.125 2023-10-03 07:48:54,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 07:48:55,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 07:48:55,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:48:57,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:49:01,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 07:49:01,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1191800.0, ans=0.025 2023-10-03 07:49:02,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:49:05,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1191800.0, ans=0.0 2023-10-03 07:49:06,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:49:06,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:49:08,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 07:49:10,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:49:11,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 07:49:11,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:49:11,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:49:12,307 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.91 vs. limit=15.0 2023-10-03 07:49:13,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:49:15,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 07:49:18,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1191866.6666666667, ans=0.125 2023-10-03 07:49:20,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:49:22,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1191866.6666666667, ans=0.0 2023-10-03 07:49:23,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:49:24,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:49:27,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:49:33,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:49:33,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:49:34,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:49:34,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1191933.3333333333, ans=0.0 2023-10-03 07:49:35,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:49:39,505 INFO [train.py:1046] (3/4) Epoch 34, batch 3500, loss[loss=0.1527, simple_loss=0.2318, pruned_loss=0.03674, over 24440.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2401, pruned_loss=0.04124, over 4706178.77 frames. ], batch size: 58, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:49:40,385 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.85 vs. limit=10.0 2023-10-03 07:49:41,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:49:42,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:49:43,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1192000.0, ans=0.125 2023-10-03 07:49:44,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 07:49:45,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 07:49:48,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 07:49:50,392 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.63 vs. limit=22.5 2023-10-03 07:49:51,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:49:51,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 07:49:54,034 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.00 vs. limit=12.0 2023-10-03 07:49:56,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:49:58,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:49:58,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:49:58,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:49:59,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:49:59,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:00,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:50:00,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 07:50:02,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:03,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:50:04,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:50:10,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:12,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 07:50:12,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:50:14,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:50:15,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:50:16,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:18,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:50:19,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:50:20,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 07:50:22,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 07:50:24,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 07:50:24,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:50:25,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:27,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:50:27,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 07:50:29,553 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.54 vs. limit=6.0 2023-10-03 07:50:30,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:50:30,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1192200.0, ans=0.1 2023-10-03 07:50:31,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:50:35,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:50:38,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 07:50:38,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 07:50:38,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:50:39,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:50:39,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:50:41,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:42,351 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.36 vs. limit=15.0 2023-10-03 07:50:45,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 07:50:46,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:50:48,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:50:49,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 07:50:50,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 07:50:52,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:50:53,617 INFO [train.py:1046] (3/4) Epoch 34, batch 3550, loss[loss=0.1428, simple_loss=0.2233, pruned_loss=0.03117, over 24643.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2394, pruned_loss=0.04093, over 4721868.73 frames. ], batch size: 60, lr: 2.99e-03, grad_scale: 8.0 2023-10-03 07:50:53,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:50:53,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:50:55,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:50:57,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 07:51:00,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1192333.3333333333, ans=0.09899494936611666 2023-10-03 07:51:05,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:51:07,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 07:51:11,509 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.871e+02 2.039e+02 2.227e+02 3.484e+02, threshold=4.078e+02, percent-clipped=0.0 2023-10-03 07:51:11,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:51:11,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 07:51:11,944 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:51:12,578 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.25 vs. limit=15.0 2023-10-03 07:51:13,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:13,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:51:13,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:51:17,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:51:17,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:51:17,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:51:19,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:51:20,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:51:25,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 07:51:25,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 07:51:27,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:51:27,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:51:27,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1192466.6666666667, ans=0.125 2023-10-03 07:51:28,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:51:28,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 07:51:28,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:28,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1192466.6666666667, ans=0.0 2023-10-03 07:51:30,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:30,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1192466.6666666667, ans=0.07 2023-10-03 07:51:31,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 07:51:35,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:51:35,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:51:37,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:51:38,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 07:51:40,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:51:40,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 07:51:40,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 07:51:42,605 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.16 vs. limit=15.0 2023-10-03 07:51:43,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 07:51:43,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:51:46,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 07:51:47,227 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.50 vs. limit=15.0 2023-10-03 07:51:47,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:51:52,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:51:53,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 07:51:54,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:51:57,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:51:58,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 07:52:00,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1192600.0, ans=0.1 2023-10-03 07:52:04,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 07:52:05,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:52:06,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:52:08,112 INFO [train.py:1046] (3/4) Epoch 34, batch 3600, loss[loss=0.1508, simple_loss=0.2304, pruned_loss=0.03565, over 24672.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2388, pruned_loss=0.04052, over 4731397.76 frames. ], batch size: 65, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:52:08,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:52:08,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:52:11,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:52:14,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:52:17,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:18,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:52:18,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 07:52:19,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:19,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 07:52:23,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:52:25,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:28,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:52:29,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:52:31,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 07:52:32,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:52:32,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 07:52:32,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 07:52:36,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:52:38,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 07:52:39,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:52:40,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:52:42,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:52:42,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 07:52:50,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:52:51,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:52:51,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 07:52:58,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:52:58,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1192866.6666666667, ans=0.2 2023-10-03 07:53:02,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:53:05,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:53:09,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 07:53:09,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:53:09,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 07:53:11,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 07:53:13,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 07:53:15,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:53:15,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:53:16,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 07:53:17,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:53:17,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:53:17,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:53:18,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 07:53:20,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 07:53:21,543 INFO [train.py:1046] (3/4) Epoch 34, batch 3650, loss[loss=0.1641, simple_loss=0.2396, pruned_loss=0.04425, over 23798.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2399, pruned_loss=0.04088, over 4729627.25 frames. ], batch size: 212, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:53:22,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:53:23,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1193000.0, ans=0.125 2023-10-03 07:53:24,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 07:53:29,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 07:53:30,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:53:35,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 07:53:36,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 07:53:36,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1193066.6666666667, ans=0.0 2023-10-03 07:53:36,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1193066.6666666667, ans=0.125 2023-10-03 07:53:39,308 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.432e+02 1.892e+02 2.047e+02 2.269e+02 3.053e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-03 07:53:39,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:53:39,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 07:53:39,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 07:53:42,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 07:53:42,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:53:44,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 07:53:44,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 07:53:44,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:53:45,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 07:53:46,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:53:47,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:53:47,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:53:51,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:53:53,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 07:53:54,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 07:53:55,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:53:55,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1193133.3333333333, ans=0.125 2023-10-03 07:53:58,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 07:53:59,468 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.25 vs. limit=12.0 2023-10-03 07:53:59,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:53:59,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:54:06,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:54:06,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:54:06,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 07:54:09,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 07:54:10,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:54:10,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1193200.0, ans=0.125 2023-10-03 07:54:12,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:54:14,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:54:14,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:14,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:54:18,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 07:54:19,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:54:19,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:54:24,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1193266.6666666667, ans=0.125 2023-10-03 07:54:25,735 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 07:54:27,651 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.90 vs. limit=15.0 2023-10-03 07:54:29,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:54:29,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:54:31,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 07:54:32,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:54:34,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 07:54:35,822 INFO [train.py:1046] (3/4) Epoch 34, batch 3700, loss[loss=0.1478, simple_loss=0.2211, pruned_loss=0.03723, over 24459.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2405, pruned_loss=0.04081, over 4741605.31 frames. ], batch size: 58, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:54:35,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:37,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 07:54:37,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:54:39,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:54:40,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:54:42,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:54:43,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:43,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 07:54:44,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:54:44,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 07:54:46,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 07:54:49,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 07:54:52,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:54:52,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:54:54,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 07:54:55,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:54:55,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:54:56,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:54:58,281 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 07:55:01,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1193400.0, ans=0.1 2023-10-03 07:55:05,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 07:55:07,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 07:55:08,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 07:55:10,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 07:55:10,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:55:13,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:14,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 07:55:15,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:16,638 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.43 vs. limit=15.0 2023-10-03 07:55:17,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:55:18,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:18,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 07:55:20,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1193533.3333333333, ans=0.1 2023-10-03 07:55:21,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 07:55:26,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:55:26,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 07:55:27,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:55:27,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 07:55:32,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:55:32,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:55:35,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:55:36,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 07:55:38,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:55:38,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 07:55:39,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:55:39,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:55:43,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:55:43,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 07:55:45,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 07:55:45,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 07:55:45,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:55:47,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 07:55:48,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:55:50,547 INFO [train.py:1046] (3/4) Epoch 34, batch 3750, loss[loss=0.1708, simple_loss=0.2535, pruned_loss=0.04406, over 23279.00 frames. ], tot_loss[loss=0.1624, simple_loss=0.2419, pruned_loss=0.04145, over 4734390.79 frames. ], batch size: 105, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:55:50,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:55:52,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:55:52,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1193666.6666666667, ans=0.0 2023-10-03 07:55:52,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1193666.6666666667, ans=0.125 2023-10-03 07:55:53,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:55:53,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1193666.6666666667, ans=0.125 2023-10-03 07:55:55,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 07:55:56,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 07:55:58,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 07:55:58,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 07:55:58,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:55:59,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:56:00,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:56:02,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:56:03,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:56:05,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 07:56:07,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:56:08,415 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.948e+02 2.234e+02 2.708e+02 3.464e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-03 07:56:10,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:56:12,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:56:13,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 07:56:14,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:56:16,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:56:16,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:56:19,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 07:56:23,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 07:56:25,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:56:25,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:56:26,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:56:29,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:56:30,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 07:56:32,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1193800.0, ans=0.0 2023-10-03 07:56:33,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 07:56:37,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:56:40,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 07:56:41,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:56:44,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 07:56:49,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 07:56:50,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 07:56:52,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 07:56:53,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 07:56:54,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 07:57:03,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 07:57:04,525 INFO [train.py:1046] (3/4) Epoch 34, batch 3800, loss[loss=0.1468, simple_loss=0.2253, pruned_loss=0.03417, over 23260.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2417, pruned_loss=0.04174, over 4721433.54 frames. ], batch size: 105, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:57:07,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:57:08,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 07:57:08,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1194000.0, ans=0.0 2023-10-03 07:57:09,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 07:57:10,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:57:12,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:57:14,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:57:17,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 07:57:17,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:57:19,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 07:57:19,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1194066.6666666667, ans=0.125 2023-10-03 07:57:20,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:57:20,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 07:57:21,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:57:23,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 07:57:27,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 07:57:27,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:57:28,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=1194066.6666666667, ans=0.05 2023-10-03 07:57:30,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:57:32,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 07:57:34,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 07:57:35,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 07:57:35,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:57:38,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:57:39,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:57:44,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 07:57:44,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 07:57:46,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:57:50,114 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:57:54,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:57:55,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.35 vs. limit=15.0 2023-10-03 07:57:57,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1194200.0, ans=0.0 2023-10-03 07:57:58,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1194200.0, ans=0.125 2023-10-03 07:57:59,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:58:02,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 07:58:02,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1194266.6666666667, ans=0.2 2023-10-03 07:58:03,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 07:58:05,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:58:06,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:58:06,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:08,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 07:58:11,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1194266.6666666667, ans=0.125 2023-10-03 07:58:12,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 07:58:12,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 07:58:12,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:13,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 07:58:17,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:58:19,197 INFO [train.py:1046] (3/4) Epoch 34, batch 3850, loss[loss=0.1726, simple_loss=0.2369, pruned_loss=0.05417, over 23874.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2404, pruned_loss=0.04113, over 4716054.42 frames. ], batch size: 179, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:58:19,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 07:58:19,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1194333.3333333333, ans=0.125 2023-10-03 07:58:24,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 07:58:25,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 07:58:27,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 07:58:27,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:31,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 07:58:32,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:58:33,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1194400.0, ans=10.0 2023-10-03 07:58:35,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 07:58:36,896 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.877e+02 2.078e+02 2.275e+02 4.210e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-03 07:58:36,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 07:58:43,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:58:45,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:58:48,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:58:48,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 07:58:48,840 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 07:58:50,904 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.67 vs. limit=15.0 2023-10-03 07:58:51,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:58:51,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 07:58:53,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:58:53,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 07:58:55,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:58:57,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:58:58,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:58:58,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 07:58:59,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 07:58:59,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 07:59:01,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:59:01,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:59:04,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:04,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:59:04,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 07:59:06,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 07:59:08,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:08,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1194533.3333333333, ans=0.125 2023-10-03 07:59:10,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.51 vs. limit=12.0 2023-10-03 07:59:10,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 07:59:11,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1194533.3333333333, ans=0.125 2023-10-03 07:59:12,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 07:59:15,721 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.70 vs. limit=15.0 2023-10-03 07:59:16,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:18,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 07:59:18,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1194600.0, ans=0.1 2023-10-03 07:59:21,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:21,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 07:59:21,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1194600.0, ans=0.125 2023-10-03 07:59:26,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 07:59:27,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:28,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:29,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 07:59:29,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 07:59:31,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:31,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:31,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 07:59:32,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 07:59:32,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 07:59:34,202 INFO [train.py:1046] (3/4) Epoch 34, batch 3900, loss[loss=0.1572, simple_loss=0.2258, pruned_loss=0.0443, over 23762.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2399, pruned_loss=0.04092, over 4719717.50 frames. ], batch size: 232, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 07:59:34,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 07:59:35,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:35,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:36,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.36 vs. limit=12.0 2023-10-03 07:59:37,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 07:59:37,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:37,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1194666.6666666667, ans=0.0 2023-10-03 07:59:38,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1194666.6666666667, ans=0.0 2023-10-03 07:59:39,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 07:59:39,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 07:59:39,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 07:59:41,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 07:59:41,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 07:59:41,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:43,183 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.15 vs. limit=15.0 2023-10-03 07:59:45,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:59:45,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:59:46,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 07:59:48,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 07:59:50,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 07:59:50,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1194733.3333333333, ans=0.0 2023-10-03 07:59:51,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:53,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 07:59:54,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 07:59:54,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 07:59:56,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 07:59:56,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 07:59:57,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 07:59:59,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 08:00:03,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:00:03,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:00:03,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:00:03,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:00:03,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1194800.0, ans=0.1 2023-10-03 08:00:09,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:00:10,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:00:13,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:00:13,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:00:13,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:00:18,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:00:19,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:00:27,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:00:28,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:00:37,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:00:38,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1194933.3333333333, ans=0.0 2023-10-03 08:00:39,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:00:40,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 08:00:40,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 08:00:40,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:00:42,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 08:00:42,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1194933.3333333333, ans=0.0 2023-10-03 08:00:43,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:00:43,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 08:00:47,991 INFO [train.py:1046] (3/4) Epoch 34, batch 3950, loss[loss=0.1506, simple_loss=0.2299, pruned_loss=0.0357, over 13671.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2392, pruned_loss=0.04029, over 4718231.72 frames. ], batch size: 29, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:00:50,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:00:52,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 08:00:52,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:00:54,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:00:56,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1195000.0, ans=0.2 2023-10-03 08:00:57,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:01:02,361 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 08:01:03,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:01:03,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 08:01:04,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten.whitening_limit, batch_count=1195066.6666666667, ans=15.0 2023-10-03 08:01:05,069 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 08:01:05,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:01:06,448 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.876e+02 2.006e+02 2.250e+02 3.004e+02, threshold=4.013e+02, percent-clipped=0.0 2023-10-03 08:01:06,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:01:06,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:01:06,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:01:09,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 08:01:10,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:01:10,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:01:10,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:01:12,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:01:12,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:01:24,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:01:24,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:01:30,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 08:01:36,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 08:01:36,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 08:01:36,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:01:37,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:01:45,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:01:45,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:01:46,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:01:47,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:01:47,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 08:01:52,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:01:53,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:01:57,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 08:01:58,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.50 vs. limit=10.0 2023-10-03 08:02:02,688 INFO [train.py:1046] (3/4) Epoch 34, batch 4000, loss[loss=0.1616, simple_loss=0.2572, pruned_loss=0.03304, over 24649.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.24, pruned_loss=0.04029, over 4719366.49 frames. ], batch size: 73, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 08:02:03,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1195333.3333333333, ans=0.0 2023-10-03 08:02:06,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:02:12,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:02:17,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:02:18,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:02:18,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:02:19,199 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.62 vs. limit=12.0 2023-10-03 08:02:20,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 08:02:20,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:02:20,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 08:02:20,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:02:20,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 08:02:23,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:02:26,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:02:26,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:02:26,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:02:26,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:02:26,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:02:28,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:02:29,473 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 08:02:29,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:02:29,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:02:31,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.02 vs. limit=22.5 2023-10-03 08:02:32,789 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 08:02:34,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:02:34,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:02:40,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 08:02:40,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:02:42,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:02:43,488 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 08:02:43,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:02:44,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 08:02:44,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:02:45,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:02:46,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:02:48,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:02:48,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:02:49,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:02:51,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 08:02:51,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:02:52,545 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 08:02:57,972 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:02:59,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:03:02,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 08:03:03,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:03:03,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:03:05,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:03:06,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:03:12,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:03:13,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 08:03:14,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 08:03:15,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1195666.6666666667, ans=0.125 2023-10-03 08:03:16,220 INFO [train.py:1046] (3/4) Epoch 34, batch 4050, loss[loss=0.163, simple_loss=0.2386, pruned_loss=0.04371, over 23449.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2409, pruned_loss=0.04122, over 4714956.48 frames. ], batch size: 93, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 08:03:16,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:03:16,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:03:16,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1195666.6666666667, ans=0.125 2023-10-03 08:03:18,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:03:19,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:03:20,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:03:25,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:03:29,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:03:29,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 08:03:31,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:03:33,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:03:34,567 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.804e+02 1.973e+02 2.142e+02 3.125e+02, threshold=3.945e+02, percent-clipped=0.0 2023-10-03 08:03:37,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:03:40,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:03:41,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 08:03:44,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 08:03:44,306 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 08:03:45,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:03:48,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1195800.0, ans=0.1 2023-10-03 08:03:51,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 08:03:51,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:03:56,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:03:58,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:03:58,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:03:58,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:04:03,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:04:08,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 08:04:08,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:04:10,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:04:12,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 08:04:15,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:04:21,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 08:04:23,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:04:23,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:04:24,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 08:04:24,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 08:04:24,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:04:27,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:04:27,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:27,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:04:30,031 INFO [train.py:1046] (3/4) Epoch 34, batch 4100, loss[loss=0.2029, simple_loss=0.2686, pruned_loss=0.06861, over 19754.00 frames. ], tot_loss[loss=0.1627, simple_loss=0.2418, pruned_loss=0.04179, over 4697121.26 frames. ], batch size: 388, lr: 2.99e-03, grad_scale: 32.0 2023-10-03 08:04:35,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 08:04:37,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 08:04:39,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 08:04:39,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 08:04:39,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:04:40,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:40,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:40,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:04:41,018 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 08:04:45,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:04:46,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:04:46,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:04:47,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:04:52,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:04:52,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:04:52,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:04:52,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 08:04:54,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:04:54,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:04:54,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:04:54,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:04:55,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 08:04:58,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:04:59,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 08:05:02,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:05:05,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:05:05,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 08:05:05,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1196133.3333333333, ans=0.2 2023-10-03 08:05:06,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:05:06,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:05:07,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:05:10,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 08:05:10,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:05:11,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:05:13,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 08:05:14,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:05:14,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:05:17,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:05:23,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:05:23,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1196200.0, ans=0.125 2023-10-03 08:05:25,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:05:26,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:05:31,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:05:31,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:05:31,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1196266.6666666667, ans=0.07 2023-10-03 08:05:36,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:05:39,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:05:41,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1196266.6666666667, ans=0.125 2023-10-03 08:05:43,638 INFO [train.py:1046] (3/4) Epoch 34, batch 4150, loss[loss=0.144, simple_loss=0.2308, pruned_loss=0.02857, over 24471.00 frames. ], tot_loss[loss=0.1626, simple_loss=0.2416, pruned_loss=0.04182, over 4695348.96 frames. ], batch size: 63, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:05:43,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:05:43,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:05:45,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:05:45,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:05:47,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 08:05:49,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:05:49,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 08:05:49,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 08:05:49,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 08:05:52,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:05:56,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:05:56,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:06:01,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:06:02,270 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.887e+02 2.039e+02 2.346e+02 3.122e+02, threshold=4.079e+02, percent-clipped=0.0 2023-10-03 08:06:02,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:06:02,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:06:05,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:06:05,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:06:05,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:06:10,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:06:13,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:06:13,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 08:06:16,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 08:06:16,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:06:16,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 08:06:16,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:06:17,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:06:20,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:22,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:06:24,069 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.04 vs. limit=15.0 2023-10-03 08:06:26,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 08:06:28,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:06:29,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:06:30,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 08:06:30,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:06:31,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 08:06:33,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:06:36,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:06:36,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:38,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 08:06:38,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:06:38,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 08:06:40,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:06:44,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 08:06:44,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:44,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:06:44,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 08:06:45,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 08:06:45,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:06:47,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 08:06:47,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:06:48,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:06:49,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 08:06:49,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:06:53,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1196600.0, ans=0.125 2023-10-03 08:06:54,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:06:56,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 08:06:57,319 INFO [train.py:1046] (3/4) Epoch 34, batch 4200, loss[loss=0.1362, simple_loss=0.1942, pruned_loss=0.03908, over 19399.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2404, pruned_loss=0.04149, over 4688282.69 frames. ], batch size: 389, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:06:58,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:07:00,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:07:01,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:07:01,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:07:01,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:07:03,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 08:07:06,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 08:07:06,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:07,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1196666.6666666667, ans=0.125 2023-10-03 08:07:08,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:07:12,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:07:16,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 08:07:16,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:07:16,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1196733.3333333333, ans=0.125 2023-10-03 08:07:17,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:17,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 08:07:17,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:07:19,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:19,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:07:20,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:07:22,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:07:25,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 08:07:25,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:07:28,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 08:07:29,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:07:30,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:07:32,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:07:36,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:07:37,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 08:07:37,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:07:38,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:07:41,777 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=15.0 2023-10-03 08:07:45,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:07:47,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:07:51,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:07:56,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 08:07:58,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:08:03,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:08:03,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:03,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 08:08:09,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:08:11,121 INFO [train.py:1046] (3/4) Epoch 34, batch 4250, loss[loss=0.1476, simple_loss=0.2342, pruned_loss=0.03045, over 24651.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2392, pruned_loss=0.04108, over 4690918.43 frames. ], batch size: 68, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:08:13,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:08:13,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:08:15,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:21,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1197000.0, ans=0.2 2023-10-03 08:08:21,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1197000.0, ans=0.125 2023-10-03 08:08:22,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:08:22,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 08:08:22,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:08:24,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1197066.6666666667, ans=0.0 2023-10-03 08:08:27,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:28,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:08:30,138 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.924e+02 2.069e+02 2.506e+02 3.818e+02, threshold=4.139e+02, percent-clipped=0.0 2023-10-03 08:08:31,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1197066.6666666667, ans=0.0 2023-10-03 08:08:33,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:33,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:36,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:08:36,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:08:37,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:38,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:40,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:42,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:08:43,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:08:45,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 08:08:47,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 08:08:47,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:48,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:08:49,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:08:51,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:08:51,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:08:51,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:08:53,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:08:55,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:08:58,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:09:01,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:09:01,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 08:09:01,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:09:01,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 08:09:02,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:09:02,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1197200.0, ans=0.125 2023-10-03 08:09:04,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:09:05,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:09:06,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:09:09,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 08:09:10,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:09:10,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:09:13,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:09:15,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1197266.6666666667, ans=0.0 2023-10-03 08:09:16,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:09:18,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:09:19,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:09:20,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:09:22,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:09:23,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:09:23,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 08:09:23,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:09:25,712 INFO [train.py:1046] (3/4) Epoch 34, batch 4300, loss[loss=0.1764, simple_loss=0.2608, pruned_loss=0.04603, over 24045.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2392, pruned_loss=0.04112, over 4699743.22 frames. ], batch size: 80, lr: 2.99e-03, grad_scale: 16.0 2023-10-03 08:09:29,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:09:29,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:09:34,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:09:35,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1197333.3333333333, ans=0.125 2023-10-03 08:09:38,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1197400.0, ans=0.0 2023-10-03 08:09:42,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:09:42,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 08:09:44,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:09:48,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:09:48,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:09:48,040 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 08:09:49,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:09:50,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:09:52,574 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:09:54,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.53 vs. limit=15.0 2023-10-03 08:09:54,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 08:09:55,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:09:55,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 08:09:55,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1197466.6666666667, ans=0.125 2023-10-03 08:09:58,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 08:09:58,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1197466.6666666667, ans=0.1 2023-10-03 08:09:59,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:10:01,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:10:01,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:10:03,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:10:04,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:10:05,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:10:05,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 08:10:06,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 08:10:08,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:10:11,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:11,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 08:10:11,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:12,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:10:12,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 08:10:12,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 08:10:12,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 08:10:14,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:10:14,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 08:10:14,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 08:10:14,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1197533.3333333333, ans=0.125 2023-10-03 08:10:18,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:10:20,160 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 08:10:21,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:10:24,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:10:24,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:10:25,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 08:10:28,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:10:28,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:28,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:10:29,524 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.75 vs. limit=12.0 2023-10-03 08:10:30,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:10:31,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:10:32,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:10:34,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:10:35,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:10:37,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:10:38,356 INFO [train.py:1046] (3/4) Epoch 34, batch 4350, loss[loss=0.1621, simple_loss=0.2416, pruned_loss=0.04127, over 23333.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2396, pruned_loss=0.04068, over 4713056.05 frames. ], batch size: 119, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:10:43,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 08:10:43,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:10:43,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1197666.6666666667, ans=0.04949747468305833 2023-10-03 08:10:47,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:10:49,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:10:51,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:10:51,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:10:56,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:10:57,704 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.863e+02 2.001e+02 2.275e+02 3.129e+02, threshold=4.001e+02, percent-clipped=0.0 2023-10-03 08:11:00,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:11:03,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:11:03,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:11:03,864 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:11:06,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:11:06,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1197800.0, ans=0.125 2023-10-03 08:11:09,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:11:10,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:11:15,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 08:11:16,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:11:16,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:24,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:25,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 08:11:28,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:11:28,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:11:30,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1197866.6666666667, ans=0.125 2023-10-03 08:11:33,577 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 08:11:35,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:11:35,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:11:36,434 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 08:11:37,797 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 08:11:37,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:11:37,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:11:37,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:11:39,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:11:39,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:11:39,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:11:40,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.24 vs. limit=6.0 2023-10-03 08:11:40,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=1197933.3333333333, ans=6.0 2023-10-03 08:11:43,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 08:11:43,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:43,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:11:43,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:11:45,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 08:11:45,443 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 08:11:45,447 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 08:11:45,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 08:11:48,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:11:50,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:11:50,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:11:50,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1197933.3333333333, ans=0.125 2023-10-03 08:11:51,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:11:52,796 INFO [train.py:1046] (3/4) Epoch 34, batch 4400, loss[loss=0.1635, simple_loss=0.2322, pruned_loss=0.04741, over 23841.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2399, pruned_loss=0.04075, over 4714125.30 frames. ], batch size: 195, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:11:52,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 08:11:56,161 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 08:11:56,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:00,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:12:02,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:04,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.12 vs. limit=22.5 2023-10-03 08:12:04,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:12:04,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 08:12:04,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 08:12:06,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 08:12:06,405 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 08:12:07,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:12:07,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:12:10,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 08:12:12,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:13,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:13,278 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 08:12:14,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:12:14,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 08:12:14,910 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 08:12:18,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 08:12:19,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 08:12:19,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 08:12:21,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:21,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:12:21,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:12:22,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:12:24,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 08:12:24,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 08:12:25,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:12:26,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:12:26,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:12:28,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:28,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:12:28,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 08:12:30,245 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 08:12:33,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:12:33,914 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.56 vs. limit=15.0 2023-10-03 08:12:40,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:12:41,172 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.61 vs. limit=15.0 2023-10-03 08:12:41,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 08:12:46,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:12:48,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:12:51,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:12:51,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 08:12:51,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:12:52,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1198266.6666666667, ans=0.04949747468305833 2023-10-03 08:12:53,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:12:53,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:12:53,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:12:55,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1198266.6666666667, ans=0.2 2023-10-03 08:12:56,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 08:13:00,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 08:13:01,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 08:13:01,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:13:02,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 08:13:04,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:13:07,171 INFO [train.py:1046] (3/4) Epoch 34, batch 4450, loss[loss=0.1455, simple_loss=0.2261, pruned_loss=0.03248, over 24624.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2407, pruned_loss=0.04083, over 4721629.59 frames. ], batch size: 60, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:13:07,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:13:08,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 08:13:12,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:13:14,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:13:15,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:13:22,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:13:22,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:13:23,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1198400.0, ans=0.0 2023-10-03 08:13:23,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1198400.0, ans=0.1 2023-10-03 08:13:25,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1198400.0, ans=0.125 2023-10-03 08:13:26,260 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.851e+02 2.013e+02 2.365e+02 4.076e+02, threshold=4.026e+02, percent-clipped=1.0 2023-10-03 08:13:26,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:13:27,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.95 vs. limit=15.0 2023-10-03 08:13:27,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:13:30,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:13:30,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:13:32,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 08:13:32,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:13:34,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:13:34,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:13:34,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:13:37,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 08:13:41,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:13:41,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:13:44,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:13:44,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:13:45,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:13:50,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 08:13:51,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 08:13:51,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 08:13:51,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:13:52,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1198533.3333333333, ans=0.95 2023-10-03 08:13:56,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:13:56,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 08:13:59,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:14:02,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:14:03,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 08:14:03,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:14:03,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:14:03,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:14:03,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:14:05,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:14:09,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:14:09,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 08:14:11,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:14:12,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:14:15,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:14:15,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:14:16,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 08:14:18,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:14:20,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 08:14:21,768 INFO [train.py:1046] (3/4) Epoch 34, batch 4500, loss[loss=0.1433, simple_loss=0.2211, pruned_loss=0.03275, over 24320.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2403, pruned_loss=0.04113, over 4708865.58 frames. ], batch size: 56, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:14:21,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:14:22,207 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:14:22,666 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.90 vs. limit=15.0 2023-10-03 08:14:25,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:14:27,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 08:14:27,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 08:14:27,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:14:35,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:14:35,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:14:37,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:14:37,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:14:37,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:14:37,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:14:48,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:14:48,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:14:49,658 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.41 vs. limit=22.5 2023-10-03 08:14:50,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:14:51,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:14:53,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:14:56,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1198800.0, ans=0.2 2023-10-03 08:14:57,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1198800.0, ans=0.0 2023-10-03 08:14:57,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1198800.0, ans=0.95 2023-10-03 08:15:00,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:15:05,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:15:08,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:15:09,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:15:10,680 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.41 vs. limit=15.0 2023-10-03 08:15:11,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 08:15:11,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:11,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:15:12,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:15:14,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:15:17,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:15:17,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 08:15:17,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:15:17,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:20,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:15:21,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:15:24,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:25,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1198933.3333333333, ans=0.09899494936611666 2023-10-03 08:15:26,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:15:26,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:15:28,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 08:15:30,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 08:15:30,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 08:15:30,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1198933.3333333333, ans=0.0 2023-10-03 08:15:33,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 08:15:34,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1198933.3333333333, ans=0.125 2023-10-03 08:15:37,044 INFO [train.py:1046] (3/4) Epoch 34, batch 4550, loss[loss=0.1595, simple_loss=0.2477, pruned_loss=0.03566, over 24607.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2401, pruned_loss=0.04123, over 4702264.99 frames. ], batch size: 68, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:15:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 08:15:37,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1199000.0, ans=0.125 2023-10-03 08:15:37,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.15 vs. limit=15.0 2023-10-03 08:15:38,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:15:41,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:15:41,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:15:43,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.43 vs. limit=22.5 2023-10-03 08:15:44,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:15:48,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:15:50,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:15:51,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:15:51,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:15:51,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:15:54,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:15:54,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:15:57,244 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.900e+02 2.073e+02 2.296e+02 3.311e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 08:15:58,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:16:00,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 08:16:00,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 08:16:01,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:16:03,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 08:16:06,493 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.73 vs. limit=22.5 2023-10-03 08:16:06,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 08:16:08,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:16:11,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 08:16:12,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:16:14,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1199133.3333333333, ans=0.125 2023-10-03 08:16:15,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:15,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:15,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:16:18,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 08:16:19,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:16:24,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:24,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:16:25,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:16:25,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 08:16:27,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 08:16:27,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:16:27,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 08:16:30,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 08:16:30,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:16:31,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:16:31,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:16:32,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:34,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:16:36,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:16:36,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 08:16:37,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:16:37,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 08:16:38,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 08:16:38,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:16:38,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 08:16:42,462 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.60 vs. limit=6.0 2023-10-03 08:16:42,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:16:42,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:16:46,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:16:46,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:16:46,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:16:46,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1199266.6666666667, ans=0.2 2023-10-03 08:16:47,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:16:50,174 INFO [train.py:1046] (3/4) Epoch 34, batch 4600, loss[loss=0.145, simple_loss=0.2276, pruned_loss=0.03115, over 24443.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2382, pruned_loss=0.04062, over 4702733.49 frames. ], batch size: 63, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:16:50,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:16:50,553 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:16:52,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:16:54,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:16:57,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:16:57,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:16:59,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:00,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 08:17:01,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:17:04,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:17:06,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:08,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:12,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 08:17:14,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:17,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:19,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:17:19,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:24,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 08:17:24,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:17:24,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1199466.6666666667, ans=0.0 2023-10-03 08:17:25,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:17:31,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:31,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:17:32,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:17:36,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 08:17:38,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:17:39,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1199533.3333333333, ans=0.2 2023-10-03 08:17:42,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:44,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:17:44,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1199533.3333333333, ans=0.09899494936611666 2023-10-03 08:17:45,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:45,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 08:17:46,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:17:47,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 08:17:47,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:48,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:17:48,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1199600.0, ans=0.0 2023-10-03 08:17:49,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:17:51,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:17:51,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:17:52,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 08:17:52,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 08:17:53,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 08:17:53,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:17:53,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:17:55,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:17:56,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:18:03,969 INFO [train.py:1046] (3/4) Epoch 34, batch 4650, loss[loss=0.1624, simple_loss=0.2373, pruned_loss=0.04375, over 23368.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2386, pruned_loss=0.04029, over 4719554.86 frames. ], batch size: 285, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:18:06,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:18:10,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:18:10,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:18:12,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:18:12,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:18:12,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:18:13,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:18:17,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 08:18:20,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:18:21,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 08:18:21,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:18:23,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 08:18:23,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:18:23,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 08:18:24,433 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.454e+02 1.812e+02 2.012e+02 2.227e+02 3.293e+02, threshold=4.023e+02, percent-clipped=0.0 2023-10-03 08:18:24,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 08:18:24,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:24,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:18:28,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:18:30,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:18:30,118 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 08:18:33,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:18:33,623 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:18:34,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 08:18:36,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:36,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:18:37,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 08:18:39,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:18:42,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:18:45,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:18:51,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:54,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:18:54,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:18:55,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:18:57,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 08:18:58,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 08:18:58,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 08:18:58,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 08:18:59,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:19:06,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:19:06,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:19:06,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 08:19:07,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:07,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:19:08,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:19:08,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:19:11,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:19:11,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:19:13,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:19:17,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:19:20,905 INFO [train.py:1046] (3/4) Epoch 34, batch 4700, loss[loss=0.1697, simple_loss=0.2575, pruned_loss=0.04099, over 24584.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2393, pruned_loss=0.04085, over 4710783.36 frames. ], batch size: 71, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:19:20,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:19:20,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:19:22,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 08:19:22,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:19:23,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 08:19:30,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:30,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:19:32,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:19:32,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:19:33,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:19:39,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 08:19:39,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 08:19:40,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:44,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:19:44,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:19:47,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:19:54,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:19:54,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 08:19:57,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:20:03,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1200200.0, ans=0.0 2023-10-03 08:20:04,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 08:20:06,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:20:07,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:10,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 08:20:11,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:20:16,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:20:17,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 08:20:19,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:19,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:20:22,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:20:22,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:20:22,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 08:20:24,019 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 08:20:24,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1200266.6666666667, ans=0.125 2023-10-03 08:20:25,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:20:28,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:28,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:28,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 08:20:28,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:20:32,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 08:20:33,485 INFO [train.py:1046] (3/4) Epoch 34, batch 4750, loss[loss=0.1661, simple_loss=0.2575, pruned_loss=0.0374, over 24538.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2404, pruned_loss=0.04107, over 4709575.49 frames. ], batch size: 71, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:20:34,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:20:35,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:20:39,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:20:41,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:20:42,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 08:20:42,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:20:44,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 08:20:46,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1200333.3333333333, ans=0.2 2023-10-03 08:20:47,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:20:47,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:20:48,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:20:55,002 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.401e+02 1.941e+02 2.055e+02 2.370e+02 3.747e+02, threshold=4.109e+02, percent-clipped=0.0 2023-10-03 08:20:55,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 08:20:59,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:21:00,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 08:21:01,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1200400.0, ans=0.2 2023-10-03 08:21:02,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:21:03,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1200466.6666666667, ans=0.2 2023-10-03 08:21:04,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:21:04,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:21:04,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:21:05,060 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 08:21:05,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 08:21:07,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1200466.6666666667, ans=0.125 2023-10-03 08:21:09,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 08:21:09,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1200466.6666666667, ans=0.0 2023-10-03 08:21:14,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:21:16,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:21:19,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:21:19,238 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 08:21:19,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:21:22,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:21:26,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1200533.3333333333, ans=0.0 2023-10-03 08:21:27,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:21:27,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 08:21:28,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 08:21:30,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:21:30,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:21:30,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:21:31,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 08:21:32,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 08:21:34,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 08:21:35,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:21:38,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:21:38,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 08:21:39,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:21:40,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:21:43,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:21:43,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:21:43,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:21:46,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:21:47,623 INFO [train.py:1046] (3/4) Epoch 34, batch 4800, loss[loss=0.1395, simple_loss=0.2213, pruned_loss=0.02887, over 21332.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2409, pruned_loss=0.04124, over 4707099.59 frames. ], batch size: 46, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:21:47,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 08:21:47,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 08:21:49,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 08:21:50,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:21:52,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:21:53,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 08:21:59,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:21:59,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:05,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:22:05,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1200733.3333333333, ans=0.0 2023-10-03 08:22:06,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:06,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:07,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 08:22:07,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:22:09,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:22:11,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:22:14,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1200733.3333333333, ans=0.125 2023-10-03 08:22:15,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:16,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:22:16,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:22:18,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:22:18,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 08:22:18,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:19,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:20,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:22:22,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:22,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1200800.0, ans=0.0 2023-10-03 08:22:26,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:22:26,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:22:27,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 08:22:28,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:30,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 08:22:30,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 08:22:31,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:31,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:22:31,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1200866.6666666667, ans=0.1 2023-10-03 08:22:33,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:22:33,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:22:33,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:22:35,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:22:35,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:22:37,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:22:40,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:41,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:22:46,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 08:22:46,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:22:47,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:47,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:22:49,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:22:52,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:22:53,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:22:53,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:53,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:22:53,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:22:55,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:22:56,055 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:22:58,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:22:58,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:22:58,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:23:01,329 INFO [train.py:1046] (3/4) Epoch 34, batch 4850, loss[loss=0.16, simple_loss=0.248, pruned_loss=0.03598, over 24326.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2414, pruned_loss=0.04091, over 4714285.39 frames. ], batch size: 74, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:23:01,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 08:23:02,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 08:23:02,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:23:02,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:23:04,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:23:04,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:23:05,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1201000.0, ans=0.0 2023-10-03 08:23:06,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:23:13,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 08:23:15,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:23:19,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:23:19,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:23:20,843 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.893e+02 2.140e+02 2.446e+02 3.787e+02, threshold=4.279e+02, percent-clipped=0.0 2023-10-03 08:23:20,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:23:24,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:23:26,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:23:28,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:23:28,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 08:23:30,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1201133.3333333333, ans=15.0 2023-10-03 08:23:30,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:23:32,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:23:32,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:23:33,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:23:33,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 08:23:35,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:23:35,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:23:38,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:23:38,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 08:23:38,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1201133.3333333333, ans=0.125 2023-10-03 08:23:39,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 08:23:40,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:23:45,546 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:23:48,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:23:49,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 08:23:50,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:23:50,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:23:54,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:23:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 08:23:56,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:23:56,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 08:23:57,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:23:57,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:23:59,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 08:24:01,078 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.61 vs. limit=15.0 2023-10-03 08:24:03,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1201266.6666666667, ans=0.2 2023-10-03 08:24:06,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:24:10,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:24:11,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:24:12,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1201266.6666666667, ans=0.1 2023-10-03 08:24:15,140 INFO [train.py:1046] (3/4) Epoch 34, batch 4900, loss[loss=0.1674, simple_loss=0.2536, pruned_loss=0.04062, over 23302.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2405, pruned_loss=0.04045, over 4712561.99 frames. ], batch size: 93, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:24:17,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 08:24:17,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:24:19,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1201333.3333333333, ans=0.125 2023-10-03 08:24:22,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:24:24,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:24:24,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:24:27,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 08:24:31,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 08:24:35,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 08:24:37,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 08:24:37,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:24:37,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:24:38,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:24:38,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:24:38,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:24:38,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 08:24:41,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 08:24:42,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:24:44,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:24:44,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:24:47,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:24:48,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:24:50,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:24:50,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 08:24:51,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:24:53,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:24:53,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 08:24:53,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 08:24:57,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 08:24:58,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:24:58,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:25:00,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:25:00,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:01,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 08:25:01,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:25:01,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 08:25:03,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:25:05,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:25:07,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:25:09,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 08:25:11,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:25:12,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 08:25:12,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 08:25:19,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:25:19,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:25:21,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 08:25:21,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 08:25:21,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:25:23,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:25:27,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:25:27,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:25:27,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:25:29,184 INFO [train.py:1046] (3/4) Epoch 34, batch 4950, loss[loss=0.1669, simple_loss=0.2552, pruned_loss=0.03931, over 24392.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.239, pruned_loss=0.04029, over 4711802.14 frames. ], batch size: 77, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:25:29,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 08:25:29,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:25:32,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:25:33,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 08:25:36,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 08:25:37,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 08:25:37,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:25:39,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 08:25:39,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:39,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:25:39,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:25:39,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:25:41,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:25:43,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:25:43,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:25:44,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:25:47,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:47,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:25:50,570 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.889e+02 2.059e+02 2.301e+02 3.763e+02, threshold=4.118e+02, percent-clipped=0.0 2023-10-03 08:25:50,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:25:56,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:25:57,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:26:00,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:26:00,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:01,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:26:03,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 08:26:04,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 08:26:05,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:08,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:26:08,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:26:09,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:26:09,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:26:09,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:26:11,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:26:13,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1201866.6666666667, ans=0.125 2023-10-03 08:26:14,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:26:16,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:26:16,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1201866.6666666667, ans=0.125 2023-10-03 08:26:17,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:26:17,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:17,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1201866.6666666667, ans=0.1 2023-10-03 08:26:18,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 08:26:18,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:26:20,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:26:24,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:26:26,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:26:26,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:26:28,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:28,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:26:29,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:26:31,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:26:31,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:26:32,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:26:34,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 08:26:34,815 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.02 vs. limit=10.0 2023-10-03 08:26:38,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:26:42,294 INFO [train.py:1046] (3/4) Epoch 34, batch 5000, loss[loss=0.172, simple_loss=0.2566, pruned_loss=0.04373, over 24341.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2387, pruned_loss=0.0402, over 4712576.22 frames. ], batch size: 77, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:26:42,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 08:26:42,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 08:26:43,957 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:26:48,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:26:48,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:26:51,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 08:26:51,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 08:26:54,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:26:55,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 08:26:55,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:26:55,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:26:57,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 08:26:59,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:26:59,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:27:01,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 08:27:01,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:27:01,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:27:02,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 08:27:03,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 08:27:05,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:27:05,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 08:27:05,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:27:05,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:05,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:27:05,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 08:27:05,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1202066.6666666667, ans=0.125 2023-10-03 08:27:06,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 08:27:09,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 08:27:09,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:27:09,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:10,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 08:27:10,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:27:10,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:12,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:27:13,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 08:27:17,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 08:27:17,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:27:17,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1202133.3333333333, ans=0.0 2023-10-03 08:27:18,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:27:22,752 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 08:27:27,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:27:28,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:27:28,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:27:32,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 08:27:32,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:27:32,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:27:32,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:27:34,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 08:27:35,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:27:38,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:27:39,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:27:43,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 08:27:46,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:27:57,235 INFO [train.py:1046] (3/4) Epoch 34, batch 5050, loss[loss=0.1475, simple_loss=0.2386, pruned_loss=0.0282, over 24539.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2391, pruned_loss=0.04009, over 4712352.63 frames. ], batch size: 71, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:27:57,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:27:58,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:27:58,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:27:58,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:27:58,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:27:58,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:28:00,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:28:01,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=1202333.3333333333, ans=10.0 2023-10-03 08:28:02,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:28:04,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 08:28:05,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:28:07,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1202333.3333333333, ans=0.125 2023-10-03 08:28:08,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:28:09,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:28:10,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 08:28:11,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:28:11,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:28:14,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:28:14,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:28:15,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:28:18,352 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.843e+02 2.007e+02 2.253e+02 3.128e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-03 08:28:22,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 08:28:23,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:28:24,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:28:24,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 08:28:26,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:28:28,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:28:28,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:28:28,724 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.80 vs. limit=6.0 2023-10-03 08:28:29,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:28:29,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 08:28:29,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 08:28:30,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:28:33,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:28:36,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:28:36,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 08:28:37,911 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.69 vs. limit=22.5 2023-10-03 08:28:38,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:28:41,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 08:28:42,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:28:42,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:28:42,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:28:43,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:28:45,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:28:46,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:28:46,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:28:48,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:28:48,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:28:48,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 08:28:50,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:28:52,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:28:56,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:28:56,875 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 08:28:56,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:28:58,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:29:00,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:00,292 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 08:29:01,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:29:01,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 08:29:01,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:03,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1202600.0, ans=0.125 2023-10-03 08:29:04,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:29:06,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:06,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 08:29:07,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 08:29:09,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:29:09,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:29:09,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:29:10,817 INFO [train.py:1046] (3/4) Epoch 34, batch 5100, loss[loss=0.1635, simple_loss=0.2389, pruned_loss=0.04401, over 23149.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2399, pruned_loss=0.04055, over 4718948.66 frames. ], batch size: 119, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:29:12,296 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 08:29:15,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:29:17,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 08:29:17,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 08:29:19,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:29:20,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:29:22,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:29:23,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 08:29:24,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 08:29:28,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:29:28,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:29:33,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:29:35,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1202733.3333333333, ans=0.0 2023-10-03 08:29:36,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 08:29:38,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:29:39,392 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.84 vs. limit=22.5 2023-10-03 08:29:39,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:29:39,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 08:29:42,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1202800.0, ans=0.125 2023-10-03 08:29:45,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:29:45,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:29:45,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 08:29:48,086 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 08:29:48,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:29:48,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 08:29:49,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 08:29:52,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:29:54,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1202866.6666666667, ans=0.1 2023-10-03 08:30:00,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:30:04,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 08:30:04,834 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 08:30:04,841 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 08:30:06,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 08:30:06,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:30:08,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 08:30:13,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 08:30:15,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 08:30:15,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:30:19,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 08:30:20,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:30:20,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 08:30:25,339 INFO [train.py:1046] (3/4) Epoch 34, batch 5150, loss[loss=0.1787, simple_loss=0.2475, pruned_loss=0.0549, over 23765.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2408, pruned_loss=0.04134, over 4715198.94 frames. ], batch size: 164, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:30:26,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:30:26,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:30:26,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:30:26,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:30:28,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:30:28,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:30:29,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 08:30:29,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 08:30:29,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 08:30:31,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:30:31,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 08:30:33,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:30:34,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 08:30:36,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:30:37,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:30:39,985 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.59 vs. limit=6.0 2023-10-03 08:30:42,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:30:42,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 08:30:42,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1203066.6666666667, ans=0.0 2023-10-03 08:30:43,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:30:43,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:30:44,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:30:44,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:30:44,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:30:46,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:30:46,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:30:46,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1203066.6666666667, ans=0.125 2023-10-03 08:30:48,005 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.915e+02 2.112e+02 2.409e+02 3.229e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 08:30:48,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 08:30:49,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:30:49,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:30:52,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:30:53,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 08:30:53,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:31:01,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:31:03,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 08:31:05,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:31:12,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:31:12,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:31:16,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:31:18,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:31:19,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 08:31:22,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:31:23,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:31:25,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:31:28,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:31:29,292 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.02 vs. limit=15.0 2023-10-03 08:31:30,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:31:31,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 08:31:32,074 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.63 vs. limit=6.0 2023-10-03 08:31:36,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:31:38,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:31:40,218 INFO [train.py:1046] (3/4) Epoch 34, batch 5200, loss[loss=0.1624, simple_loss=0.2364, pruned_loss=0.04415, over 23542.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2402, pruned_loss=0.0409, over 4725591.99 frames. ], batch size: 119, lr: 2.98e-03, grad_scale: 32.0 2023-10-03 08:31:40,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:31:40,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:31:42,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:31:42,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:31:42,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:31:42,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:31:45,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1203333.3333333333, ans=15.0 2023-10-03 08:31:46,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:31:47,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:31:49,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:31:53,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 08:31:55,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:31:55,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:31:57,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:31:57,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:31:58,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:32:01,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 08:32:02,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:32:02,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1203400.0, ans=0.2 2023-10-03 08:32:04,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:32:05,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 08:32:06,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:32:08,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:32:08,945 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.26 vs. limit=10.0 2023-10-03 08:32:09,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 08:32:09,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 08:32:12,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 08:32:14,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:32:14,181 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 08:32:14,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:32:15,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:15,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:32:16,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 08:32:17,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:32:20,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:32:21,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 08:32:21,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 08:32:21,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 08:32:26,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1203533.3333333333, ans=0.125 2023-10-03 08:32:27,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 08:32:27,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:32:27,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1203533.3333333333, ans=0.0 2023-10-03 08:32:31,695 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.84 vs. limit=15.0 2023-10-03 08:32:32,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:32:32,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:32:35,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 08:32:36,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:32:36,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 08:32:36,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:36,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:32:39,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:32:41,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:32:44,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:32:44,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:32:44,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:45,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1203600.0, ans=0.1 2023-10-03 08:32:50,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:32:50,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1203600.0, ans=0.0 2023-10-03 08:32:51,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 08:32:52,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:32:53,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:32:54,308 INFO [train.py:1046] (3/4) Epoch 34, batch 5250, loss[loss=0.1699, simple_loss=0.2483, pruned_loss=0.04573, over 23486.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2405, pruned_loss=0.04097, over 4723548.62 frames. ], batch size: 93, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:32:54,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:32:55,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=5.22 vs. limit=12.0 2023-10-03 08:32:55,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:32:55,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:32:59,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:33:01,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:33:01,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:33:03,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:33:07,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:33:10,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:33:12,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:33:13,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:33:15,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 08:33:15,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:33:16,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:33:18,465 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.466e+02 1.928e+02 2.131e+02 2.518e+02 4.702e+02, threshold=4.262e+02, percent-clipped=2.0 2023-10-03 08:33:23,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.39 vs. limit=15.0 2023-10-03 08:33:34,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1203800.0, ans=0.0 2023-10-03 08:33:50,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=1203933.3333333333, ans=0.1 2023-10-03 08:34:03,911 INFO [train.py:1046] (3/4) Epoch 34, batch 5300, loss[loss=0.1563, simple_loss=0.2228, pruned_loss=0.04488, over 23637.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2391, pruned_loss=0.04055, over 4710396.63 frames. ], batch size: 232, lr: 2.98e-03, grad_scale: 16.0 2023-10-03 08:34:19,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:34:19,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 08:34:19,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 08:34:19,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:34:19,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:19,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:19,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:19,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:34:19,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:34:19,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:34:19,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:34:20,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:34:20,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 08:34:20,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 08:34:20,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 08:34:20,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 08:34:20,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 08:34:20,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 08:34:21,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:21,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:34:21,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:34:21,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:34:21,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:34:21,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:34:21,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:34:21,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:21,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:34:21,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:34:21,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:34:21,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:21,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:34:22,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 08:34:22,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:34:22,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:34:22,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 08:34:22,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 08:34:23,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:34:23,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:34:23,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 08:34:23,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 08:34:23,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:34:24,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:34:24,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:34:24,312 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 08:34:24,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 08:34:24,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:34:24,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:34:24,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 08:34:24,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 08:34:24,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 08:34:24,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:34:31,026 INFO [train.py:1046] (3/4) Epoch 35, batch 0, loss[loss=0.2014, simple_loss=0.2739, pruned_loss=0.06445, over 19252.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.2739, pruned_loss=0.06445, over 19252.00 frames. ], batch size: 388, lr: 2.93e-03, grad_scale: 32.0 2023-10-03 08:34:31,026 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 08:34:43,450 INFO [train.py:1078] (3/4) Epoch 35, validation: loss=0.3289, simple_loss=0.2753, pruned_loss=0.1913, over 1125622.00 frames. 2023-10-03 08:34:43,451 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 08:34:44,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 08:34:44,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:34:46,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:34:51,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:34:51,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:34:51,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:34:53,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 08:34:54,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 08:34:57,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:34:57,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:35:00,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1204153.3333333333, ans=0.0 2023-10-03 08:35:01,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:35:01,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:35:03,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:35:03,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:35:04,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 08:35:07,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:35:15,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:35:15,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:35:17,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 08:35:17,406 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:35:18,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1204220.0, ans=0.1 2023-10-03 08:35:21,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:35:21,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:35:23,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:35:26,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:35:26,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1204286.6666666667, ans=0.125 2023-10-03 08:35:31,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:35:31,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1204286.6666666667, ans=0.125 2023-10-03 08:35:38,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 08:35:42,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 08:35:43,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:35:43,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:35:44,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:35:44,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:35:47,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 08:35:49,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:35:49,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:35:54,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:35:56,837 INFO [train.py:1046] (3/4) Epoch 35, batch 50, loss[loss=0.1619, simple_loss=0.2503, pruned_loss=0.03673, over 24286.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2424, pruned_loss=0.04089, over 1056587.77 frames. ], batch size: 74, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:35:58,377 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 08:35:58,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1204420.0, ans=0.125 2023-10-03 08:35:59,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:36:01,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:36:03,820 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.925e+02 2.366e+02 2.722e+02 6.685e+02, threshold=4.732e+02, percent-clipped=5.0 2023-10-03 08:36:03,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:36:03,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 08:36:04,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:36:04,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:36:07,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:36:08,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:36:11,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:36:14,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 08:36:14,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:36:17,147 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.88 vs. limit=22.5 2023-10-03 08:36:18,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1204486.6666666667, ans=0.0 2023-10-03 08:36:21,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:36:23,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 08:36:25,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 08:36:25,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1204553.3333333333, ans=0.1 2023-10-03 08:36:27,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:36:28,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:36:28,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1204553.3333333333, ans=0.125 2023-10-03 08:36:29,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:36:29,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:36:29,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:36:30,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:36:30,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:36:37,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:36:39,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:36:40,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:36:40,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 08:36:42,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:36:43,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:36:43,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 08:36:43,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:36:45,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 08:36:45,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1204620.0, ans=0.0 2023-10-03 08:36:52,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:36:53,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:36:55,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:36:57,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:36:57,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:36:58,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 08:36:58,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 08:37:00,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:37:01,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:37:01,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:37:02,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:37:02,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 08:37:03,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 08:37:04,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 08:37:05,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:07,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:37:08,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 08:37:08,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 08:37:08,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:08,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1204753.3333333333, ans=0.125 2023-10-03 08:37:09,819 INFO [train.py:1046] (3/4) Epoch 35, batch 100, loss[loss=0.146, simple_loss=0.237, pruned_loss=0.02755, over 24318.00 frames. ], tot_loss[loss=0.1655, simple_loss=0.2446, pruned_loss=0.04314, over 1862641.23 frames. ], batch size: 74, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:37:09,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:37:11,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:37:11,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:37:14,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:37:18,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:37:21,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:37:22,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 08:37:22,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:37:26,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:37:26,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:37:26,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:37:26,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:37:26,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=1204820.0, ans=0.1 2023-10-03 08:37:27,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:37:28,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1204820.0, ans=0.125 2023-10-03 08:37:29,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 08:37:32,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:37:32,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:32,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:37:33,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:37:36,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 08:37:38,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:38,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:37:39,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:37:42,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:37:46,630 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 08:37:46,650 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 08:37:49,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:37:49,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:37:51,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:37:53,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:37:55,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1204953.3333333333, ans=0.125 2023-10-03 08:37:56,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:00,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:00,861 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.32 vs. limit=15.0 2023-10-03 08:38:01,584 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 08:38:04,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 08:38:07,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:38:07,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:38:11,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:13,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:14,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:38:17,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:38:17,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:19,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:21,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:21,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:38:21,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:21,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 08:38:22,607 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 08:38:22,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:22,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:38:24,009 INFO [train.py:1046] (3/4) Epoch 35, batch 150, loss[loss=0.167, simple_loss=0.2394, pruned_loss=0.04727, over 23815.00 frames. ], tot_loss[loss=0.1637, simple_loss=0.2434, pruned_loss=0.04207, over 2511034.38 frames. ], batch size: 212, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:38:24,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:24,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:38:24,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 08:38:24,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:38:25,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:38:25,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:26,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:26,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:38:28,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:38:28,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:38:30,899 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.859e+02 2.007e+02 2.245e+02 3.352e+02, threshold=4.014e+02, percent-clipped=0.0 2023-10-03 08:38:31,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:38:33,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.23 vs. limit=12.0 2023-10-03 08:38:35,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:38:35,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:38:35,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:37,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:38:37,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:41,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:38:42,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:45,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 08:38:45,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 08:38:45,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 08:38:47,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1205153.3333333333, ans=0.035 2023-10-03 08:38:48,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:38:48,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:38:48,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1205153.3333333333, ans=0.0 2023-10-03 08:38:49,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:38:50,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:38:51,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:51,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:53,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:38:54,626 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 08:38:56,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:38:56,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1205220.0, ans=0.0 2023-10-03 08:39:00,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:39:00,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1205220.0, ans=0.125 2023-10-03 08:39:00,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1205220.0, ans=0.1 2023-10-03 08:39:03,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:39:04,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 08:39:05,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1205220.0, ans=0.125 2023-10-03 08:39:09,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:39:09,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:39:09,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:39:10,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:39:11,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:39:13,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:39:14,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:16,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 08:39:20,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:21,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:39:21,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:39:21,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:39:23,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:26,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 08:39:27,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:39:29,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:39:30,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:39:33,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:39:33,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 08:39:35,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:39:35,258 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 08:39:36,564 INFO [train.py:1046] (3/4) Epoch 35, batch 200, loss[loss=0.1567, simple_loss=0.2352, pruned_loss=0.03914, over 23225.00 frames. ], tot_loss[loss=0.1633, simple_loss=0.2428, pruned_loss=0.0419, over 2999355.72 frames. ], batch size: 119, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:39:38,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:39:42,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:39:42,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:39:43,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 08:39:45,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:39:45,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:39:48,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 08:39:49,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 08:39:51,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:39:51,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:39:56,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:39:56,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:39:57,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:40:08,218 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:40:14,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:40:14,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:40:14,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:40:15,618 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.12 vs. limit=22.5 2023-10-03 08:40:16,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:40:18,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 08:40:18,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:40:19,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:20,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:40:21,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:40:22,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:40:22,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 08:40:23,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 08:40:23,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:40:27,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:40:29,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.84 vs. limit=15.0 2023-10-03 08:40:34,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:40:40,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:40,825 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.08 vs. limit=15.0 2023-10-03 08:40:41,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:40:47,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:48,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1205753.3333333333, ans=0.2 2023-10-03 08:40:50,619 INFO [train.py:1046] (3/4) Epoch 35, batch 250, loss[loss=0.1525, simple_loss=0.238, pruned_loss=0.03348, over 24442.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2415, pruned_loss=0.04092, over 3371592.95 frames. ], batch size: 63, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:40:50,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 08:40:50,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:40:50,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:40:50,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:40:50,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:40:52,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 08:40:52,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:40:52,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1205753.3333333333, ans=0.1 2023-10-03 08:40:53,524 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 08:40:54,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:56,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:40:57,921 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.919e+02 2.120e+02 2.596e+02 4.381e+02, threshold=4.240e+02, percent-clipped=2.0 2023-10-03 08:40:58,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:40:59,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:41:03,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:41:03,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:41:05,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:41:07,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1205820.0, ans=0.125 2023-10-03 08:41:08,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:41:14,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1205820.0, ans=0.125 2023-10-03 08:41:18,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:41:18,546 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:41:21,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:41:21,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:41:27,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:41:27,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:41:28,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:41:28,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:41:30,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:41:30,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:41:30,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1205886.6666666667, ans=0.125 2023-10-03 08:41:32,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:41:33,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:41:34,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.36 vs. limit=15.0 2023-10-03 08:41:36,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 08:41:36,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:41:37,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1205953.3333333333, ans=0.125 2023-10-03 08:41:39,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:41:39,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:41:39,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:41:39,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:41:40,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:41:40,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:41:43,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:41:43,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:41:43,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1205953.3333333333, ans=0.125 2023-10-03 08:41:44,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:41:47,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:41:52,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:41:53,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:41:57,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:41:59,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:42:02,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 08:42:04,664 INFO [train.py:1046] (3/4) Epoch 35, batch 300, loss[loss=0.1616, simple_loss=0.2432, pruned_loss=0.03997, over 24458.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2399, pruned_loss=0.04037, over 3656735.62 frames. ], batch size: 63, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:42:04,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:42:04,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 08:42:07,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 08:42:07,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:42:07,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:42:07,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 08:42:11,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:42:13,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:42:17,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:42:17,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 08:42:19,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:42:20,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 08:42:20,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 08:42:20,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:42:23,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:42:28,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:42:28,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 08:42:32,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 08:42:32,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:42:35,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:42:36,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:42:36,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 08:42:36,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:42:39,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:42:42,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:42:42,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:42:45,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 08:42:45,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 08:42:47,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:42:50,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:42:51,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 08:42:53,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:42:56,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:42:58,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:42:58,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 08:43:03,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:43:03,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:43:06,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:43:07,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:43:07,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 08:43:07,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:43:09,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:43:09,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 08:43:09,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1206353.3333333333, ans=0.125 2023-10-03 08:43:10,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:43:12,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:12,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:43:12,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:43:12,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1206353.3333333333, ans=0.05 2023-10-03 08:43:13,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:13,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1206353.3333333333, ans=0.1 2023-10-03 08:43:18,395 INFO [train.py:1046] (3/4) Epoch 35, batch 350, loss[loss=0.1473, simple_loss=0.223, pruned_loss=0.0358, over 23435.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2379, pruned_loss=0.04015, over 3876707.44 frames. ], batch size: 285, lr: 2.93e-03, grad_scale: 8.0 2023-10-03 08:43:18,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:43:18,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 08:43:21,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:25,038 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.902e+02 2.096e+02 2.398e+02 4.416e+02, threshold=4.192e+02, percent-clipped=1.0 2023-10-03 08:43:27,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:43:29,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:43:31,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:34,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 08:43:36,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:43:36,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 08:43:37,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:38,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 08:43:38,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:43:41,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 08:43:44,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:43:45,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:43:45,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:43:47,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:43:47,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:43:48,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:43:48,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:43:48,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:43:51,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:43:51,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:43:56,265 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 08:43:56,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.94 vs. limit=12.0 2023-10-03 08:43:58,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:43:58,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:44:00,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:44:00,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:05,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 08:44:05,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:44:09,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:09,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:44:09,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:44:10,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.86 vs. limit=15.0 2023-10-03 08:44:11,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 08:44:14,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:15,528 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 08:44:16,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 08:44:16,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:44:19,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:44:19,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 08:44:21,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:21,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.95 vs. limit=15.0 2023-10-03 08:44:23,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 08:44:24,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:44:25,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:25,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:44:27,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:44:30,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:44:31,943 INFO [train.py:1046] (3/4) Epoch 35, batch 400, loss[loss=0.1586, simple_loss=0.2362, pruned_loss=0.04051, over 23864.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2377, pruned_loss=0.04003, over 4055388.16 frames. ], batch size: 195, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:44:33,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:44:33,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 08:44:33,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:35,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:44:36,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:44:36,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:39,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:44:41,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:41,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 08:44:43,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 08:44:43,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:44:43,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1206753.3333333333, ans=0.2 2023-10-03 08:44:45,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 08:44:45,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:50,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:44:50,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:50,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 08:44:50,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:44:50,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1206820.0, ans=0.125 2023-10-03 08:44:52,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:44:52,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:44:52,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:44:55,514 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 08:44:56,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 08:44:59,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1206820.0, ans=0.125 2023-10-03 08:45:00,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:45:03,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:45:04,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 08:45:04,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 08:45:05,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1206886.6666666667, ans=0.1 2023-10-03 08:45:06,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1206886.6666666667, ans=0.125 2023-10-03 08:45:06,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.68 vs. limit=10.0 2023-10-03 08:45:09,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:45:09,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1206886.6666666667, ans=0.025 2023-10-03 08:45:10,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:45:14,236 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.45 vs. limit=15.0 2023-10-03 08:45:19,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 08:45:21,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:45:22,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 08:45:23,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:45:25,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:45:25,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 08:45:30,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:45:31,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 08:45:34,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:45:37,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:45:37,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 08:45:39,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 08:45:41,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 08:45:42,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:45:43,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:45:45,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 08:45:46,706 INFO [train.py:1046] (3/4) Epoch 35, batch 450, loss[loss=0.1514, simple_loss=0.2312, pruned_loss=0.03583, over 24453.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2382, pruned_loss=0.04029, over 4197008.70 frames. ], batch size: 58, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:45:48,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:45:48,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:45:48,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 08:45:49,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 08:45:49,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:45:50,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:45:52,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:45:52,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 08:45:52,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:45:53,602 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.846e+02 1.963e+02 2.221e+02 3.123e+02, threshold=3.927e+02, percent-clipped=0.0 2023-10-03 08:45:53,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 08:45:57,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:45:57,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1207086.6666666667, ans=0.0 2023-10-03 08:45:59,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1207153.3333333333, ans=0.5 2023-10-03 08:46:05,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:46:06,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:46:09,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 08:46:10,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 08:46:13,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:46:15,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:46:16,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:46:19,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:46:20,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:46:24,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 08:46:24,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 08:46:26,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 08:46:28,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:46:28,654 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.07 vs. limit=12.0 2023-10-03 08:46:29,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:46:29,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:46:31,055 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 08:46:31,066 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 08:46:32,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:46:33,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:46:35,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 08:46:37,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:46:37,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1207286.6666666667, ans=0.5 2023-10-03 08:46:38,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 08:46:38,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 08:46:40,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 08:46:42,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:46:46,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:46:47,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:46:47,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 08:46:50,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:46:50,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 08:46:50,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 08:46:51,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 08:46:57,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:47:00,732 INFO [train.py:1046] (3/4) Epoch 35, batch 500, loss[loss=0.1673, simple_loss=0.2419, pruned_loss=0.04635, over 23717.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.239, pruned_loss=0.03999, over 4321782.60 frames. ], batch size: 179, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:47:00,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:47:00,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:47:02,228 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 08:47:05,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:47:05,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:47:06,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:47:06,367 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 08:47:07,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 08:47:07,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:47:08,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1207420.0, ans=0.0 2023-10-03 08:47:11,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:47:14,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 08:47:15,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 08:47:17,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:47:18,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:47:18,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:20,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1207486.6666666667, ans=0.125 2023-10-03 08:47:28,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:47:28,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 08:47:30,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 08:47:30,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:47:30,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 08:47:31,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 08:47:34,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:47:35,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:47:35,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:47:35,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:47:37,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 08:47:40,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1207553.3333333333, ans=0.125 2023-10-03 08:47:41,199 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 08:47:43,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:47:45,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:45,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:47,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:47:47,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:47:48,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 08:47:50,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:47:52,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:47:56,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:47:57,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1207620.0, ans=0.125 2023-10-03 08:47:58,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:48:05,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:48:07,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 08:48:07,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:48:08,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:48:11,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1207686.6666666667, ans=0.0 2023-10-03 08:48:11,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1207686.6666666667, ans=0.07 2023-10-03 08:48:12,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 08:48:13,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 08:48:15,009 INFO [train.py:1046] (3/4) Epoch 35, batch 550, loss[loss=0.1758, simple_loss=0.2464, pruned_loss=0.05262, over 23894.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2402, pruned_loss=0.04063, over 4406684.56 frames. ], batch size: 195, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:48:16,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:48:19,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 08:48:21,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 08:48:21,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:48:21,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 08:48:22,463 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.820e+02 2.073e+02 2.406e+02 3.793e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 08:48:22,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:48:22,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:48:23,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:23,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:25,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:48:26,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:48:29,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:48:29,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 08:48:29,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:48:32,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1207820.0, ans=0.09899494936611666 2023-10-03 08:48:33,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:48:34,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:36,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:48:37,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1207820.0, ans=0.125 2023-10-03 08:48:38,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:41,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 08:48:44,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 08:48:45,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:48:47,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1207886.6666666667, ans=0.0 2023-10-03 08:48:50,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:48:50,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:48:51,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:48:54,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:48:54,597 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 08:48:56,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:48:57,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 08:49:00,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 08:49:00,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 08:49:00,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:49:01,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:02,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 08:49:03,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 08:49:04,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:49:04,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:49:04,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:49:04,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:49:06,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1207953.3333333333, ans=0.2 2023-10-03 08:49:07,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:49:09,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 08:49:11,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.64 vs. limit=12.0 2023-10-03 08:49:12,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:49:12,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:12,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 08:49:15,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:49:15,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:49:16,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:49:18,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:19,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 08:49:21,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 08:49:24,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1208020.0, ans=0.125 2023-10-03 08:49:26,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 08:49:28,081 INFO [train.py:1046] (3/4) Epoch 35, batch 600, loss[loss=0.1438, simple_loss=0.2182, pruned_loss=0.03472, over 23451.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2414, pruned_loss=0.04122, over 4465068.45 frames. ], batch size: 134, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:49:28,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1208086.6666666667, ans=0.0 2023-10-03 08:49:29,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 08:49:30,317 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=15.0 2023-10-03 08:49:30,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:49:32,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:49:32,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:49:37,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:49:39,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 08:49:40,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 08:49:43,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 08:49:44,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:49:46,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:47,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 08:49:47,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:49:55,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 08:49:59,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:49:59,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:49:59,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:50:03,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:50:03,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:50:03,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1208220.0, ans=0.125 2023-10-03 08:50:04,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:50:11,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:50:14,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1208286.6666666667, ans=0.0 2023-10-03 08:50:16,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:50:16,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:50:16,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:50:23,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 08:50:26,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:50:27,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:50:31,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 08:50:33,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:50:35,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 08:50:36,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:50:36,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:50:40,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1208353.3333333333, ans=10.0 2023-10-03 08:50:40,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1208353.3333333333, ans=0.0 2023-10-03 08:50:41,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 08:50:43,030 INFO [train.py:1046] (3/4) Epoch 35, batch 650, loss[loss=0.1388, simple_loss=0.1953, pruned_loss=0.04113, over 19386.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2402, pruned_loss=0.04104, over 4513121.98 frames. ], batch size: 388, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:50:43,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 08:50:44,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:50:44,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1208420.0, ans=0.05 2023-10-03 08:50:46,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:50:48,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:50:49,940 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.925e+02 2.110e+02 2.430e+02 3.265e+02, threshold=4.220e+02, percent-clipped=0.0 2023-10-03 08:50:51,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 08:50:53,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:50:57,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:50:57,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:51:00,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:04,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 08:51:04,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:51:05,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:51:08,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1208486.6666666667, ans=0.05 2023-10-03 08:51:09,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:51:09,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 08:51:13,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:14,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:15,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:51:15,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:17,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 08:51:19,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 08:51:19,273 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 08:51:19,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:19,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:51:23,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:24,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:51:24,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:51:26,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 08:51:26,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 08:51:27,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:51:27,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 08:51:27,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1208620.0, ans=0.1 2023-10-03 08:51:29,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 08:51:29,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:51:30,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 08:51:31,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 08:51:32,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 08:51:32,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:32,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:51:32,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:51:33,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:51:34,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:51:38,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1208620.0, ans=0.125 2023-10-03 08:51:41,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:51:41,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:51:42,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:51:45,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:51:45,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 08:51:46,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:51:52,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:51:52,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:51:52,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:51:54,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:51:54,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1208686.6666666667, ans=0.1 2023-10-03 08:51:56,798 INFO [train.py:1046] (3/4) Epoch 35, batch 700, loss[loss=0.1512, simple_loss=0.2279, pruned_loss=0.03722, over 23608.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2399, pruned_loss=0.04083, over 4558342.04 frames. ], batch size: 135, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:51:56,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 08:51:58,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 08:52:01,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 08:52:02,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:52:03,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:52:05,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 08:52:05,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1208753.3333333333, ans=0.125 2023-10-03 08:52:12,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:52:12,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1208820.0, ans=0.0 2023-10-03 08:52:13,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:52:16,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:52:17,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 08:52:17,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:52:20,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:52:23,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 08:52:23,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:52:25,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 08:52:27,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 08:52:33,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 08:52:33,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:52:33,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 08:52:37,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:52:39,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 08:52:44,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:52:44,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:52:46,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 08:52:48,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:52:50,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:52:53,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:52:57,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 08:52:57,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 08:53:02,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 08:53:02,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 08:53:02,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1209020.0, ans=0.1 2023-10-03 08:53:04,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:53:06,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:53:06,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1209020.0, ans=0.0 2023-10-03 08:53:07,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:53:10,788 INFO [train.py:1046] (3/4) Epoch 35, batch 750, loss[loss=0.1656, simple_loss=0.2424, pruned_loss=0.04441, over 23748.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2391, pruned_loss=0.04056, over 4588395.24 frames. ], batch size: 232, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:53:10,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:53:10,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 08:53:16,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 08:53:17,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 08:53:17,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 08:53:17,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1209086.6666666667, ans=0.0 2023-10-03 08:53:18,617 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 2.000e+02 2.273e+02 2.606e+02 4.191e+02, threshold=4.547e+02, percent-clipped=0.0 2023-10-03 08:53:18,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 08:53:18,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 08:53:18,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1209086.6666666667, ans=0.1 2023-10-03 08:53:19,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1209086.6666666667, ans=0.125 2023-10-03 08:53:20,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:53:20,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1209086.6666666667, ans=0.0 2023-10-03 08:53:21,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 08:53:21,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:53:22,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:53:24,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:53:26,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:53:27,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 08:53:27,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:53:29,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:53:30,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:53:31,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:53:33,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:53:33,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:53:34,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 08:53:36,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 08:53:36,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:53:37,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:53:38,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 08:53:40,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 08:53:40,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:53:40,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1209220.0, ans=0.1 2023-10-03 08:53:44,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 08:53:44,202 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 08:53:44,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 08:53:44,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:53:44,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 08:53:45,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 08:53:51,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:53:51,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1209220.0, ans=0.125 2023-10-03 08:53:52,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:53:52,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:53:55,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:53:57,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:53:57,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 08:53:58,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:53:59,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 08:53:59,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:54:02,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:54:02,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 08:54:02,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:54:06,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1209286.6666666667, ans=0.0 2023-10-03 08:54:08,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:10,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:54:10,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:54:10,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1209353.3333333333, ans=0.0 2023-10-03 08:54:12,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:54:14,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1209353.3333333333, ans=0.125 2023-10-03 08:54:15,704 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.96 vs. limit=15.0 2023-10-03 08:54:17,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 08:54:17,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:54:17,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:54:20,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:54:20,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:54:22,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:54:23,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 08:54:25,500 INFO [train.py:1046] (3/4) Epoch 35, batch 800, loss[loss=0.1663, simple_loss=0.2401, pruned_loss=0.04622, over 23791.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2391, pruned_loss=0.04064, over 4608460.78 frames. ], batch size: 164, lr: 2.93e-03, grad_scale: 32.0 2023-10-03 08:54:31,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:54:31,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:34,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:54:34,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:54:34,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:34,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:54:37,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:40,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:41,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:54:43,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 08:54:45,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:54:47,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:54:47,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:54:47,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:54:48,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 08:54:48,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:48,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 08:54:52,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:54:54,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:54:56,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:54:56,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:55:00,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:55:00,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:55:01,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1209553.3333333333, ans=0.0 2023-10-03 08:55:04,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:55:05,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:55:05,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 08:55:07,357 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 08:55:07,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 08:55:08,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 08:55:08,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:55:10,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:55:10,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:55:15,966 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 08:55:16,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 08:55:19,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 08:55:21,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 08:55:25,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 08:55:29,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:55:29,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 08:55:30,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 08:55:32,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 08:55:38,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:55:39,643 INFO [train.py:1046] (3/4) Epoch 35, batch 850, loss[loss=0.1568, simple_loss=0.2387, pruned_loss=0.03746, over 24548.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2401, pruned_loss=0.04103, over 4636100.07 frames. ], batch size: 60, lr: 2.93e-03, grad_scale: 32.0 2023-10-03 08:55:39,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:55:41,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 08:55:41,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:55:42,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:55:43,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 08:55:43,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:55:44,591 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.18 vs. limit=6.0 2023-10-03 08:55:46,872 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.835e+02 2.028e+02 2.413e+02 3.992e+02, threshold=4.056e+02, percent-clipped=0.0 2023-10-03 08:55:46,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:55:47,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:55:48,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:55:50,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:55:52,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 08:55:52,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 08:55:52,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 08:55:52,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 08:55:52,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:55:56,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:55:56,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:55:56,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 08:56:00,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:56:01,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:56:01,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 08:56:06,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 08:56:09,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:56:10,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 08:56:14,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 08:56:16,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 08:56:17,696 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 08:56:17,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:56:17,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:56:17,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 08:56:21,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:56:23,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:56:23,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 08:56:26,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 08:56:26,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:56:26,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1209953.3333333333, ans=0.0 2023-10-03 08:56:27,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:56:27,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 08:56:30,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 08:56:30,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 08:56:31,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 08:56:34,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 08:56:34,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:56:35,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 08:56:35,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:56:37,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:56:40,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:56:41,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 08:56:43,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 08:56:43,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:56:44,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 08:56:49,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.79 vs. limit=15.0 2023-10-03 08:56:52,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1210086.6666666667, ans=0.04949747468305833 2023-10-03 08:56:54,424 INFO [train.py:1046] (3/4) Epoch 35, batch 900, loss[loss=0.1596, simple_loss=0.2396, pruned_loss=0.03978, over 23627.00 frames. ], tot_loss[loss=0.1616, simple_loss=0.2407, pruned_loss=0.04125, over 4654877.19 frames. ], batch size: 149, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:56:54,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 08:56:54,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:56:54,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 08:56:55,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:56:55,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:56:57,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 08:56:57,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1210086.6666666667, ans=0.0 2023-10-03 08:57:04,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:57:05,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:57:06,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 08:57:08,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1210153.3333333333, ans=0.125 2023-10-03 08:57:10,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:57:10,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 08:57:11,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 08:57:12,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1210153.3333333333, ans=0.1 2023-10-03 08:57:13,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 08:57:13,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:57:14,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 08:57:14,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 08:57:22,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:57:22,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 08:57:23,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 08:57:27,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:57:28,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1210220.0, ans=0.2 2023-10-03 08:57:30,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 08:57:32,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:57:37,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 08:57:38,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 08:57:38,341 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 08:57:38,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1210286.6666666667, ans=0.04949747468305833 2023-10-03 08:57:39,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 08:57:41,780 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.73 vs. limit=10.0 2023-10-03 08:57:45,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 08:57:45,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 08:57:47,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 08:57:53,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:57:53,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:57:55,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 08:57:55,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 08:57:58,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 08:58:01,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 08:58:01,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:58:02,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:58:02,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:06,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 08:58:06,829 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 08:58:08,144 INFO [train.py:1046] (3/4) Epoch 35, batch 950, loss[loss=0.1671, simple_loss=0.2357, pruned_loss=0.04925, over 23795.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2404, pruned_loss=0.0411, over 4673884.24 frames. ], batch size: 195, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:58:08,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 08:58:09,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 08:58:11,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:58:14,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 08:58:17,060 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.074e+02 2.262e+02 2.631e+02 4.033e+02, threshold=4.525e+02, percent-clipped=0.0 2023-10-03 08:58:17,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:58:20,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:58:20,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:58:22,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 08:58:23,896 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 08:58:26,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:58:28,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:58:28,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:58:30,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 08:58:30,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 08:58:31,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 08:58:33,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:34,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 08:58:34,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:58:38,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:38,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 08:58:38,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:58:40,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 08:58:42,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 08:58:44,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 08:58:44,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 08:58:46,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1210553.3333333333, ans=0.125 2023-10-03 08:58:50,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:58:50,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:58:54,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 08:58:56,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 08:58:56,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 08:58:58,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:58:59,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:58:59,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 08:59:04,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 08:59:04,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 08:59:07,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:59:08,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:59:09,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 08:59:09,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:59:09,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 08:59:09,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 08:59:12,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 08:59:14,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 08:59:20,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:59:21,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 08:59:21,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 08:59:23,544 INFO [train.py:1046] (3/4) Epoch 35, batch 1000, loss[loss=0.1559, simple_loss=0.2348, pruned_loss=0.03854, over 23336.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2396, pruned_loss=0.04079, over 4689916.06 frames. ], batch size: 119, lr: 2.93e-03, grad_scale: 16.0 2023-10-03 08:59:25,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 08:59:28,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1210753.3333333333, ans=0.0 2023-10-03 08:59:29,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 08:59:31,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:59:34,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 08:59:35,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 08:59:35,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 08:59:40,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:59:40,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 08:59:42,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 08:59:43,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1210820.0, ans=0.1 2023-10-03 08:59:44,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 08:59:47,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 08:59:47,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1210820.0, ans=0.5 2023-10-03 08:59:49,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 08:59:49,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 08:59:50,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 08:59:53,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 08:59:53,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 08:59:54,498 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.32 vs. limit=15.0 2023-10-03 08:59:55,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 08:59:55,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 08:59:55,395 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:00:04,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:00:05,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:00:05,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:05,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:00:05,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 09:00:05,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:00:07,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:00:07,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:00:08,436 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 09:00:12,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 09:00:13,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 09:00:13,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 09:00:17,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:00:23,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:23,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:00:25,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:25,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:00:27,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 09:00:29,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:00:29,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 09:00:30,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 09:00:32,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:00:32,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:00:33,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:00:36,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:00:38,080 INFO [train.py:1046] (3/4) Epoch 35, batch 1050, loss[loss=0.1654, simple_loss=0.2507, pruned_loss=0.04007, over 24616.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2385, pruned_loss=0.0404, over 4701311.50 frames. ], batch size: 68, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:00:38,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:00:41,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:00:43,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:00:45,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 09:00:45,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:00:46,414 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.826e+02 1.998e+02 2.224e+02 3.015e+02, threshold=3.995e+02, percent-clipped=0.0 2023-10-03 09:00:46,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1211086.6666666667, ans=0.2 2023-10-03 09:00:47,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:00:49,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:00:51,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:00:51,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:00:53,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:00:53,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:00:54,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:00:56,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 09:00:56,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:00:57,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 09:00:58,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:00:58,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 09:00:58,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:01:05,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:01:06,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:01:06,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:01:08,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 09:01:09,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 09:01:09,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:01:12,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 09:01:14,002 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:01:15,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 09:01:16,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:01:19,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 09:01:22,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:01:22,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:01:22,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:01:25,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.21 vs. limit=10.0 2023-10-03 09:01:27,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:01:27,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1211286.6666666667, ans=0.1 2023-10-03 09:01:31,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 09:01:32,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 09:01:33,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 09:01:34,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:01:34,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:01:35,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 09:01:37,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1211353.3333333333, ans=0.2 2023-10-03 09:01:39,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:01:42,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:01:42,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:01:43,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:01:43,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:01:43,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1211353.3333333333, ans=0.1 2023-10-03 09:01:46,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:01:46,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 09:01:48,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:01:49,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 09:01:49,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 09:01:49,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:01:50,776 INFO [train.py:1046] (3/4) Epoch 35, batch 1100, loss[loss=0.176, simple_loss=0.265, pruned_loss=0.04348, over 24334.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2377, pruned_loss=0.04039, over 4694234.22 frames. ], batch size: 77, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:01:52,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:01:59,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:02:03,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:02:03,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:02:04,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:02:04,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 09:02:06,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:02:06,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1211486.6666666667, ans=0.04949747468305833 2023-10-03 09:02:07,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 09:02:10,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:02:13,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:02:13,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 09:02:14,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:02:15,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:02:17,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:02:18,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:02:21,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:02:26,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:02:29,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 09:02:31,446 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 09:02:31,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:34,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:34,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:02:34,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:02:34,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1211620.0, ans=0.2 2023-10-03 09:02:35,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 09:02:37,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:02:37,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:02:37,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:02:37,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:37,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 09:02:41,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1211620.0, ans=0.0 2023-10-03 09:02:41,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1211620.0, ans=0.125 2023-10-03 09:02:43,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:02:43,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 09:02:45,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:02:48,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:02:48,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1211686.6666666667, ans=0.0 2023-10-03 09:02:51,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 09:02:51,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:02:54,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:02:56,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:02:56,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:02:58,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 09:02:58,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:02:59,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:03:00,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 09:03:01,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:03:01,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 09:03:02,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:03:02,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:03:03,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:03:05,196 INFO [train.py:1046] (3/4) Epoch 35, batch 1150, loss[loss=0.1559, simple_loss=0.2333, pruned_loss=0.03922, over 24593.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2382, pruned_loss=0.04052, over 4685729.25 frames. ], batch size: 60, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:03:06,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:03:09,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:03:11,370 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.67 vs. limit=22.5 2023-10-03 09:03:12,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:03:12,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:03:12,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 09:03:12,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:03:13,385 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.882e+02 2.034e+02 2.362e+02 3.611e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-03 09:03:15,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 09:03:16,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:03:16,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:03:21,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1211820.0, ans=0.0 2023-10-03 09:03:22,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 09:03:23,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:03:28,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:03:29,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:03:29,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 09:03:29,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:03:29,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:03:33,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 09:03:34,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:03:36,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:03:44,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:03:49,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:03:49,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 09:03:49,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:03:51,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:03:56,244 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 09:03:57,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:04:05,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1212020.0, ans=0.0 2023-10-03 09:04:06,286 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 09:04:07,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=1212020.0, ans=22.5 2023-10-03 09:04:09,375 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:04:09,869 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.35 vs. limit=22.5 2023-10-03 09:04:10,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:04:10,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:04:10,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1212020.0, ans=0.1 2023-10-03 09:04:11,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:04:11,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:04:14,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:04:17,444 INFO [train.py:1046] (3/4) Epoch 35, batch 1200, loss[loss=0.1457, simple_loss=0.2254, pruned_loss=0.03301, over 24425.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2386, pruned_loss=0.04065, over 4689828.63 frames. ], batch size: 58, lr: 2.92e-03, grad_scale: 32.0 2023-10-03 09:04:20,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:04:20,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:04:21,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:04:21,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:04:23,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:04:26,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:04:26,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:04:29,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:04:29,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:04:33,086 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 09:04:35,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 09:04:36,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1212153.3333333333, ans=0.0 2023-10-03 09:04:40,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:04:42,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:04:44,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:04:46,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:04:46,914 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 09:04:47,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1212220.0, ans=0.0 2023-10-03 09:04:48,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:04:50,354 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.37 vs. limit=15.0 2023-10-03 09:04:55,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:04:55,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:04:55,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 09:04:57,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:05:00,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 09:05:05,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 09:05:05,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:05:05,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:05:07,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:05:07,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1212286.6666666667, ans=0.2 2023-10-03 09:05:08,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:05:09,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:05:09,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:05:09,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:05:11,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 09:05:12,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:05:12,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:05:12,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:05:13,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:05:13,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1212286.6666666667, ans=0.2 2023-10-03 09:05:14,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:05:19,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:05:19,541 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:05:20,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:05:23,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 09:05:29,238 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 09:05:31,154 INFO [train.py:1046] (3/4) Epoch 35, batch 1250, loss[loss=0.1447, simple_loss=0.2317, pruned_loss=0.02882, over 24659.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.239, pruned_loss=0.04069, over 4700805.39 frames. ], batch size: 65, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:05:31,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:05:31,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1212420.0, ans=0.2 2023-10-03 09:05:32,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:05:34,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:05:36,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:05:37,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 09:05:40,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:05:41,845 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.899e+02 2.182e+02 2.478e+02 3.266e+02, threshold=4.363e+02, percent-clipped=0.0 2023-10-03 09:05:41,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:05:43,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 09:05:44,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:05:44,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:05:48,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:05:48,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:05:50,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:05:50,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:05:51,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1212486.6666666667, ans=0.0 2023-10-03 09:05:53,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:05:57,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:05:57,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:05:57,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:05:59,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:06:00,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:03,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:06:04,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:06:05,304 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:06:10,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 09:06:10,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:06:12,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:06:14,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 09:06:14,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:06:15,646 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 09:06:15,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:15,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:18,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:06:21,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:06:22,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:06:23,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 09:06:23,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 09:06:23,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 09:06:25,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.71 vs. limit=15.0 2023-10-03 09:06:26,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:06:28,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 09:06:28,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:06:32,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 09:06:32,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:06:32,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 09:06:32,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:06:32,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:06:34,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:06:34,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:06:37,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 09:06:40,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:06:41,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:06:42,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1212686.6666666667, ans=0.1 2023-10-03 09:06:43,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:06:44,593 INFO [train.py:1046] (3/4) Epoch 35, batch 1300, loss[loss=0.1492, simple_loss=0.2251, pruned_loss=0.03665, over 24323.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2398, pruned_loss=0.04038, over 4718518.51 frames. ], batch size: 56, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:06:46,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:06:48,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:06:48,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 09:06:50,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1212753.3333333333, ans=0.07 2023-10-03 09:06:52,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:06:55,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:06:57,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:06:57,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:07:00,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:07:02,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 09:07:05,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:07:05,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1212820.0, ans=0.125 2023-10-03 09:07:06,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:07:06,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 09:07:10,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:07:13,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:07:14,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:07:15,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:07:17,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:07:17,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:07:18,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:07:18,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 09:07:24,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:07:24,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:07:27,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 09:07:27,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 09:07:30,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:07:32,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1212953.3333333333, ans=0.125 2023-10-03 09:07:33,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:07:33,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 09:07:33,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:07:33,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1212953.3333333333, ans=0.0 2023-10-03 09:07:34,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 09:07:34,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:07:39,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:07:39,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:07:42,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 09:07:43,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 09:07:44,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 09:07:48,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:07:50,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 09:07:53,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:07:53,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1213020.0, ans=0.2 2023-10-03 09:07:54,921 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:07:56,731 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.36 vs. limit=15.0 2023-10-03 09:07:57,295 INFO [train.py:1046] (3/4) Epoch 35, batch 1350, loss[loss=0.1586, simple_loss=0.2563, pruned_loss=0.03048, over 24673.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2394, pruned_loss=0.04046, over 4716281.30 frames. ], batch size: 73, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:07:59,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 09:08:01,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:08:01,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1213086.6666666667, ans=0.125 2023-10-03 09:08:03,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:07,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:08:08,373 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.947e+02 2.144e+02 2.393e+02 3.515e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 09:08:08,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:08:09,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.96 vs. limit=15.0 2023-10-03 09:08:09,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:08:11,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:08:15,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:08:17,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 09:08:17,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:08:18,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:08:21,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 09:08:21,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:08:24,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:08:24,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 09:08:25,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 09:08:26,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 09:08:28,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:28,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 09:08:36,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1213220.0, ans=0.0 2023-10-03 09:08:39,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:44,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1213286.6666666667, ans=0.125 2023-10-03 09:08:49,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:08:49,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:08:49,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 09:08:52,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:08:52,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 09:08:53,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:08:53,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:08:54,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1213286.6666666667, ans=0.05 2023-10-03 09:08:56,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:08:59,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 09:08:59,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:09:04,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 09:09:07,280 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=10.41 vs. limit=22.5 2023-10-03 09:09:07,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 09:09:12,359 INFO [train.py:1046] (3/4) Epoch 35, batch 1400, loss[loss=0.1535, simple_loss=0.234, pruned_loss=0.03648, over 23360.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2385, pruned_loss=0.04019, over 4722243.23 frames. ], batch size: 93, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:09:15,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 09:09:16,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:09:19,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:09:19,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:09:23,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 09:09:26,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 09:09:29,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1213486.6666666667, ans=0.1 2023-10-03 09:09:34,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:09:36,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:09:39,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:09:39,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:09:42,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:09:45,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 09:09:46,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1213553.3333333333, ans=0.125 2023-10-03 09:09:51,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1213553.3333333333, ans=0.0 2023-10-03 09:09:52,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:09:53,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:09:56,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 09:09:58,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:09:58,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:09:58,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:09:59,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:10:01,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:10:01,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:10:01,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:10:02,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1213620.0, ans=10.0 2023-10-03 09:10:03,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 09:10:03,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:10:05,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:12,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:10:16,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 09:10:18,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 09:10:19,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:10:20,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 09:10:22,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:10:24,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:10:26,293 INFO [train.py:1046] (3/4) Epoch 35, batch 1450, loss[loss=0.1505, simple_loss=0.2194, pruned_loss=0.04077, over 23510.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2385, pruned_loss=0.04036, over 4710222.07 frames. ], batch size: 256, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:10:29,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:10:31,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:10:31,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:31,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 09:10:36,987 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.854e+02 2.034e+02 2.256e+02 3.370e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-03 09:10:38,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:10:38,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:10:40,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:10:40,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 09:10:41,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:10:43,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 09:10:43,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:43,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:43,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 09:10:45,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:10:46,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:10:46,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 09:10:46,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:46,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:10:48,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:50,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:55,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:10:55,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:10:56,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:10:57,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:59,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:10:59,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:10:59,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:10:59,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:11:02,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 09:11:07,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:11:10,452 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 09:11:11,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:11:13,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:11:15,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:11:15,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1213953.3333333333, ans=0.125 2023-10-03 09:11:16,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 09:11:16,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1213953.3333333333, ans=0.0 2023-10-03 09:11:19,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:11:20,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 09:11:22,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 09:11:23,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:11:27,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:11:27,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:11:28,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 09:11:32,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 09:11:33,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 09:11:33,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:11:35,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:11:37,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1214020.0, ans=0.125 2023-10-03 09:11:41,108 INFO [train.py:1046] (3/4) Epoch 35, batch 1500, loss[loss=0.1499, simple_loss=0.2257, pruned_loss=0.03698, over 24567.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2386, pruned_loss=0.04057, over 4707247.95 frames. ], batch size: 60, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:11:42,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 09:11:42,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:11:42,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:11:44,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:11:45,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:11:45,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:11:47,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 09:11:48,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:11:48,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:11:48,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:11:48,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:11:51,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:11:53,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:11:54,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1214153.3333333333, ans=0.2 2023-10-03 09:11:58,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:11:58,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 09:11:59,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:11:59,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:12:01,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:12:05,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 09:12:09,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 09:12:11,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:12:11,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 09:12:14,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:12:14,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1214220.0, ans=0.0 2023-10-03 09:12:17,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:12:17,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:12:17,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:12:18,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 09:12:18,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:12:18,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:12:20,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 09:12:20,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:12:21,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1214220.0, ans=0.125 2023-10-03 09:12:25,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:12:25,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 09:12:27,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1214286.6666666667, ans=0.0 2023-10-03 09:12:31,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:12:31,724 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:12:33,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:12:37,890 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 09:12:37,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:37,947 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 09:12:39,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:12:41,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:12:42,410 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 09:12:43,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:12:45,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 09:12:46,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:49,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:12:51,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:51,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:12:51,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:12:51,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:12:54,135 INFO [train.py:1046] (3/4) Epoch 35, batch 1550, loss[loss=0.1608, simple_loss=0.2537, pruned_loss=0.03396, over 24439.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2399, pruned_loss=0.0407, over 4715509.64 frames. ], batch size: 69, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:12:54,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 09:12:54,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 09:12:54,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:12:55,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 09:12:55,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 09:12:58,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:12:58,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:12:59,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:12:59,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:12:59,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:13:01,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:13:04,292 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.874e+02 2.151e+02 2.475e+02 3.456e+02, threshold=4.303e+02, percent-clipped=0.0 2023-10-03 09:13:04,407 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 09:13:04,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:04,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:13:05,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:13:09,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:13:09,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 09:13:09,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1214486.6666666667, ans=0.125 2023-10-03 09:13:11,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:13:11,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1214486.6666666667, ans=0.125 2023-10-03 09:13:12,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 09:13:13,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 09:13:13,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 09:13:13,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:15,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:13:19,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:13:22,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 09:13:22,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 09:13:31,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:13:34,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:13:34,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:13:34,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:13:35,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 09:13:41,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:13:43,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:45,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1214620.0, ans=0.125 2023-10-03 09:13:45,658 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.13 vs. limit=6.0 2023-10-03 09:13:46,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:13:47,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:13:49,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:13:49,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 09:13:49,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:13:52,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:13:52,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:13:54,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 09:13:54,254 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 09:13:57,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:13:59,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 09:14:04,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1214686.6666666667, ans=0.09899494936611666 2023-10-03 09:14:05,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:14:05,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:14:05,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 09:14:08,474 INFO [train.py:1046] (3/4) Epoch 35, batch 1600, loss[loss=0.181, simple_loss=0.2573, pruned_loss=0.05238, over 22827.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2401, pruned_loss=0.04103, over 4715437.78 frames. ], batch size: 322, lr: 2.92e-03, grad_scale: 32.0 2023-10-03 09:14:08,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:14:09,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:14:09,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:14:09,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:14:11,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:14:16,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:14:16,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1214753.3333333333, ans=0.1 2023-10-03 09:14:17,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 09:14:17,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 09:14:19,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 09:14:19,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1214753.3333333333, ans=0.125 2023-10-03 09:14:20,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:14:22,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 09:14:23,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:14:25,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:14:28,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1214820.0, ans=0.07 2023-10-03 09:14:29,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:14:33,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 09:14:33,805 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:14:36,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:14:36,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1214886.6666666667, ans=0.0 2023-10-03 09:14:37,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 09:14:37,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:14:37,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 09:14:42,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 09:14:46,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1214886.6666666667, ans=0.1 2023-10-03 09:14:49,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.90 vs. limit=15.0 2023-10-03 09:14:50,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:14:51,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 09:14:52,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:14:52,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:14:52,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:14:56,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 09:15:00,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 09:15:01,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:15:01,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:03,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:04,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:15:05,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:15:07,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:15:08,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:15:13,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:14,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:15:16,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 09:15:16,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:15:18,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 09:15:22,292 INFO [train.py:1046] (3/4) Epoch 35, batch 1650, loss[loss=0.1648, simple_loss=0.2301, pruned_loss=0.04982, over 23360.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2404, pruned_loss=0.041, over 4719156.21 frames. ], batch size: 285, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:15:22,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1215086.6666666667, ans=0.125 2023-10-03 09:15:24,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:15:24,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:15:25,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:15:25,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 09:15:25,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 09:15:25,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 09:15:27,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 09:15:31,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:15:31,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:15:32,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:15:32,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:15:34,006 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.944e+02 2.112e+02 2.392e+02 3.284e+02, threshold=4.225e+02, percent-clipped=0.0 2023-10-03 09:15:35,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:15:36,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 09:15:39,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:15:39,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:15:39,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:15:39,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:15:39,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 09:15:39,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 09:15:41,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1215153.3333333333, ans=0.09899494936611666 2023-10-03 09:15:44,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:15:47,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:15:55,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 09:15:56,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:15:58,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 09:15:59,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:16:01,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1215220.0, ans=0.09899494936611666 2023-10-03 09:16:03,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:16:04,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:16:04,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:16:04,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:16:04,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:16:07,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:16:09,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:16:09,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:16:11,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:16:12,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:16:13,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:16:17,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:16:17,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 09:16:20,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:16:20,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 09:16:21,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 09:16:21,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 09:16:21,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:16:23,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:16:24,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:16:24,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:16:24,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 09:16:28,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:16:29,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:16:29,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:16:33,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 09:16:36,034 INFO [train.py:1046] (3/4) Epoch 35, batch 1700, loss[loss=0.1565, simple_loss=0.2208, pruned_loss=0.04608, over 23624.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2398, pruned_loss=0.04109, over 4707749.68 frames. ], batch size: 256, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:16:36,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:16:36,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:16:37,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 09:16:37,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:16:38,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:16:38,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:16:41,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:16:41,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:16:41,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 09:16:43,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:16:43,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1215420.0, ans=0.125 2023-10-03 09:16:51,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:16:53,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1215486.6666666667, ans=0.125 2023-10-03 09:16:54,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:16:57,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1215486.6666666667, ans=0.2 2023-10-03 09:16:58,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:16:58,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:16:58,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:16:59,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:17:01,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 09:17:03,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:17:03,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:04,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1215553.3333333333, ans=0.0 2023-10-03 09:17:06,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:17:07,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:17:08,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 09:17:10,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 09:17:12,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:13,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 09:17:15,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:17:24,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:17:24,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:17:24,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:17:25,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:17:25,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 09:17:25,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:17:28,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:28,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 09:17:28,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:17:28,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:17:28,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:28,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:17:29,764 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.98 vs. limit=15.0 2023-10-03 09:17:30,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1215620.0, ans=0.125 2023-10-03 09:17:31,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:17:31,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:17:32,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:17:32,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:17:34,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:17:34,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1215686.6666666667, ans=0.0 2023-10-03 09:17:36,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:17:37,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1215686.6666666667, ans=0.125 2023-10-03 09:17:38,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 09:17:40,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:17:42,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:17:45,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 09:17:51,154 INFO [train.py:1046] (3/4) Epoch 35, batch 1750, loss[loss=0.1618, simple_loss=0.2323, pruned_loss=0.04559, over 23375.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.238, pruned_loss=0.04069, over 4703073.28 frames. ], batch size: 285, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:17:51,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:17:51,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1215753.3333333333, ans=0.0 2023-10-03 09:17:53,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:17:54,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:17:55,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 09:17:55,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:17:59,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:17:59,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:02,913 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.871e+02 1.981e+02 2.197e+02 2.904e+02, threshold=3.962e+02, percent-clipped=0.0 2023-10-03 09:18:03,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 09:18:05,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:18:08,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 09:18:08,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:18:09,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:18:13,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 09:18:15,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 09:18:15,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:18:16,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 09:18:22,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:18:24,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:18:24,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:18:28,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:28,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:18:30,595 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.79 vs. limit=12.0 2023-10-03 09:18:31,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:18:33,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:34,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:18:35,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:18:37,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 09:18:38,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:18:40,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 09:18:42,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:18:43,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:18:45,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:18:50,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:18:50,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 09:18:51,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:18:51,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1216020.0, ans=0.0 2023-10-03 09:18:52,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:18:56,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:18:57,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1216020.0, ans=0.0 2023-10-03 09:18:58,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1216020.0, ans=0.125 2023-10-03 09:18:59,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:19:01,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:19:02,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 09:19:02,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:19:03,101 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.11 vs. limit=22.5 2023-10-03 09:19:04,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:19:04,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:05,614 INFO [train.py:1046] (3/4) Epoch 35, batch 1800, loss[loss=0.1556, simple_loss=0.2359, pruned_loss=0.03763, over 24314.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2374, pruned_loss=0.04011, over 4707493.98 frames. ], batch size: 61, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:19:05,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:19:05,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:19:05,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:19:08,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:19:08,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:19:10,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:19:12,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:19:15,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:19:17,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:19:20,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:19:21,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:22,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:23,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:19:26,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:19:26,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 09:19:26,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:19:30,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:19:34,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 09:19:35,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 09:19:37,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 09:19:37,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:19:37,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:19:37,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:19:39,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:19:45,347 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 09:19:45,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:19:47,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:19:49,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 09:19:49,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 09:19:51,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:19:52,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:19:53,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:19:58,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 09:20:05,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:20:05,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 09:20:05,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:20:05,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:20:06,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:20:06,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 09:20:11,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:20:11,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:20:12,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 09:20:12,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:20:12,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1216353.3333333333, ans=0.2 2023-10-03 09:20:14,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:20:16,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:20:16,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:20:16,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:20:18,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:20:19,689 INFO [train.py:1046] (3/4) Epoch 35, batch 1850, loss[loss=0.1704, simple_loss=0.2366, pruned_loss=0.05214, over 23638.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.238, pruned_loss=0.04034, over 4690705.46 frames. ], batch size: 256, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:20:19,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:20:19,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:20:22,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:20:23,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:20:25,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1216420.0, ans=0.025 2023-10-03 09:20:26,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1216420.0, ans=0.125 2023-10-03 09:20:30,536 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.898e+02 2.066e+02 2.341e+02 4.051e+02, threshold=4.131e+02, percent-clipped=1.0 2023-10-03 09:20:30,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:20:30,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 09:20:34,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 09:20:37,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 09:20:39,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1216486.6666666667, ans=0.125 2023-10-03 09:20:40,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:20:40,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 09:20:40,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 09:20:51,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:20:51,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1216553.3333333333, ans=0.1 2023-10-03 09:20:52,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 09:20:55,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:20:56,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:20:58,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1216553.3333333333, ans=0.0 2023-10-03 09:20:59,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 09:20:59,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:00,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:21:02,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:21:05,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:21:05,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1216620.0, ans=0.125 2023-10-03 09:21:07,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:21:09,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:21:09,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:10,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 09:21:10,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:21:12,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:21:14,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:21:14,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1216620.0, ans=0.125 2023-10-03 09:21:17,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 09:21:17,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:21:22,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:21:23,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:21:23,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 09:21:23,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 09:21:25,135 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 09:21:26,439 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 09:21:27,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:21:27,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:21:27,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:21:27,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1216686.6666666667, ans=0.015 2023-10-03 09:21:29,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:29,238 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 09:21:30,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:21:30,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:31,900 INFO [train.py:1046] (3/4) Epoch 35, batch 1900, loss[loss=0.1653, simple_loss=0.2364, pruned_loss=0.04707, over 23730.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2388, pruned_loss=0.0406, over 4697100.27 frames. ], batch size: 179, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:21:31,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:21:32,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:21:32,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1216753.3333333333, ans=0.2 2023-10-03 09:21:33,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:21:33,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 09:21:33,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1216753.3333333333, ans=0.125 2023-10-03 09:21:36,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:21:36,185 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 09:21:36,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:21:37,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1216753.3333333333, ans=0.2 2023-10-03 09:21:38,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:21:42,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:21:44,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:21:46,221 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 09:21:46,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 09:21:47,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:21:49,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:21:49,531 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 09:21:49,562 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 09:21:54,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 09:21:55,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:21:55,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1216820.0, ans=0.035 2023-10-03 09:21:59,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 09:22:01,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 09:22:05,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1216886.6666666667, ans=0.2 2023-10-03 09:22:08,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 09:22:09,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1216886.6666666667, ans=0.0 2023-10-03 09:22:10,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 09:22:10,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:22:12,390 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 09:22:12,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 09:22:12,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 09:22:13,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 09:22:13,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:22:18,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 09:22:19,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1216953.3333333333, ans=0.125 2023-10-03 09:22:20,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:22:23,068 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.69 vs. limit=22.5 2023-10-03 09:22:23,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:22:23,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 09:22:24,430 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.02 vs. limit=22.5 2023-10-03 09:22:26,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:22:30,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 09:22:31,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:22:37,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:22:37,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:22:37,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:22:37,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:22:39,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:22:39,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=1217020.0, ans=0.1 2023-10-03 09:22:40,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:22:40,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:22:43,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:22:43,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:22:44,700 INFO [train.py:1046] (3/4) Epoch 35, batch 1950, loss[loss=0.174, simple_loss=0.2534, pruned_loss=0.04732, over 23908.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2393, pruned_loss=0.04006, over 4719107.64 frames. ], batch size: 86, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:22:46,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:22:46,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:22:46,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:22:47,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:22:52,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:22:53,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:22:55,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:22:55,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:22:55,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1217086.6666666667, ans=0.04949747468305833 2023-10-03 09:22:55,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1217086.6666666667, ans=0.1 2023-10-03 09:22:56,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1217086.6666666667, ans=0.125 2023-10-03 09:22:56,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 09:22:58,294 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.842e+02 2.075e+02 2.339e+02 3.045e+02, threshold=4.150e+02, percent-clipped=0.0 2023-10-03 09:22:58,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 09:22:58,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:22:59,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:00,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1217153.3333333333, ans=0.125 2023-10-03 09:23:01,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:23:01,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:01,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:02,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:23:05,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:23:06,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:23:06,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:23:06,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:09,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:12,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:23:12,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:12,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:23:12,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 09:23:12,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:23:12,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:23:12,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1217220.0, ans=0.0 2023-10-03 09:23:13,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:16,387 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.25 vs. limit=15.0 2023-10-03 09:23:18,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:20,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:23:25,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:23:28,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:23:28,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1217286.6666666667, ans=0.0 2023-10-03 09:23:29,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:23:29,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 09:23:29,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:23:32,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:23:34,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:23:34,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:23:37,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1217286.6666666667, ans=0.025 2023-10-03 09:23:42,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:43,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:46,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:23:49,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:53,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:23:54,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:23:54,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 09:23:54,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:23:56,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:23:57,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 09:23:58,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:23:59,665 INFO [train.py:1046] (3/4) Epoch 35, batch 2000, loss[loss=0.1666, simple_loss=0.2372, pruned_loss=0.048, over 23544.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2397, pruned_loss=0.04052, over 4711152.96 frames. ], batch size: 256, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:24:02,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:24:02,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:24:03,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:24:05,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:24:07,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:09,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1217420.0, ans=0.0 2023-10-03 09:24:09,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1217420.0, ans=0.09899494936611666 2023-10-03 09:24:10,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 09:24:10,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:24:13,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:24:14,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 09:24:16,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:24:16,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:24:19,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:24:20,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 09:24:22,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:23,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:23,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1217486.6666666667, ans=0.125 2023-10-03 09:24:25,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:26,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 09:24:26,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:24:28,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 09:24:28,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:24:30,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1217553.3333333333, ans=0.0 2023-10-03 09:24:31,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:24:32,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:24:32,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:32,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:24:34,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:24:35,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 09:24:35,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1217553.3333333333, ans=0.125 2023-10-03 09:24:38,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 09:24:38,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:24:38,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:24:40,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1217553.3333333333, ans=0.09899494936611666 2023-10-03 09:24:40,475 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.80 vs. limit=15.0 2023-10-03 09:24:42,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:44,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:24:44,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:24:44,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1217620.0, ans=0.1 2023-10-03 09:24:44,698 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.24 vs. limit=15.0 2023-10-03 09:24:45,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:24:46,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:24:46,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:47,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:24:48,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:24:49,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:24:52,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:24:54,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 09:24:59,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:25:00,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:02,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:03,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:25:05,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:08,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:25:08,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:09,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:25:09,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:25:12,121 INFO [train.py:1046] (3/4) Epoch 35, batch 2050, loss[loss=0.151, simple_loss=0.2345, pruned_loss=0.03379, over 24306.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.24, pruned_loss=0.04068, over 4710567.05 frames. ], batch size: 61, lr: 2.92e-03, grad_scale: 16.0 2023-10-03 09:25:12,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:12,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:13,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:25:14,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:19,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:25:21,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1217753.3333333333, ans=0.0 2023-10-03 09:25:22,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:25:24,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:25:24,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1217753.3333333333, ans=0.2 2023-10-03 09:25:25,765 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.915e+02 2.069e+02 2.253e+02 3.253e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 09:25:25,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:25:27,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 09:25:27,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:25:29,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:25:29,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:25:40,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:25:40,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:43,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 09:25:44,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:25:44,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 09:25:44,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:25:47,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:25:50,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:25:52,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:25:52,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:25:53,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:25:55,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:25:55,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:25:55,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1217953.3333333333, ans=0.2 2023-10-03 09:26:00,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:26:00,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1217953.3333333333, ans=0.2 2023-10-03 09:26:01,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:26:03,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:26:04,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:26:07,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:26:07,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1217953.3333333333, ans=0.1 2023-10-03 09:26:12,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:26:12,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 09:26:17,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:26:18,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:26:20,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:26:22,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 09:26:26,569 INFO [train.py:1046] (3/4) Epoch 35, batch 2100, loss[loss=0.1458, simple_loss=0.2302, pruned_loss=0.03068, over 24446.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2387, pruned_loss=0.04005, over 4703810.44 frames. ], batch size: 63, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:26:27,255 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 09:26:27,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:26:28,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:26:28,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:26:29,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:26:29,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 09:26:29,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 09:26:32,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:26:35,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:26:35,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:26:36,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1218086.6666666667, ans=0.04949747468305833 2023-10-03 09:26:39,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:26:39,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:26:40,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 09:26:42,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:26:42,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 09:26:42,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 09:26:43,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:26:43,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:26:43,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 09:26:44,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 09:26:50,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 09:26:50,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:26:53,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:26:53,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:26:57,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:26:58,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 09:26:58,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:26:58,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 09:26:59,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1218220.0, ans=0.125 2023-10-03 09:27:00,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 09:27:00,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:00,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 09:27:01,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 09:27:01,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 09:27:03,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:27:04,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:27:06,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:27:07,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:27:08,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:10,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:27:10,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 09:27:11,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:11,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:27:11,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:11,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 09:27:14,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 09:27:15,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 09:27:19,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:27:20,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1218286.6666666667, ans=0.2 2023-10-03 09:27:23,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:27:23,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 09:27:28,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:31,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:27:31,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:27:31,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:27:31,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 09:27:32,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:27:34,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:27:34,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:27:35,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:27:35,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:37,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 09:27:38,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 09:27:38,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:27:38,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1218420.0, ans=0.1 2023-10-03 09:27:38,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1218420.0, ans=0.1 2023-10-03 09:27:40,018 INFO [train.py:1046] (3/4) Epoch 35, batch 2150, loss[loss=0.1674, simple_loss=0.2444, pruned_loss=0.04516, over 23136.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2383, pruned_loss=0.03986, over 4711578.26 frames. ], batch size: 105, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:27:40,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:27:40,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:27:40,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:27:41,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:27:45,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 09:27:47,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:27:48,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:51,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:27:51,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:27:51,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:27:54,164 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.825e+02 1.971e+02 2.203e+02 3.479e+02, threshold=3.943e+02, percent-clipped=0.0 2023-10-03 09:27:55,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:27:55,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:27:55,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:28:00,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:00,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 09:28:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:28:05,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:28:06,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:06,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:28:06,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:06,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:28:07,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:28:07,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:28:07,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:28:09,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 09:28:10,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:28:12,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:28:13,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:28:13,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:28:14,082 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.28 vs. limit=15.0 2023-10-03 09:28:14,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:28:16,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1218553.3333333333, ans=0.95 2023-10-03 09:28:18,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:28:18,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:28:19,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:28:19,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 09:28:20,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 09:28:22,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:28:23,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:23,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:28:25,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:28:27,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:28,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:28,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 09:28:30,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 09:28:30,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:28:30,467 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 09:28:31,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:31,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:28:33,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 09:28:33,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:28:33,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 09:28:33,203 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 09:28:33,203 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 09:28:34,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 09:28:36,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:36,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:28:36,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:28:36,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:37,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:28:37,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:28:37,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:47,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:28:48,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 09:28:51,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:28:53,566 INFO [train.py:1046] (3/4) Epoch 35, batch 2200, loss[loss=0.14, simple_loss=0.2257, pruned_loss=0.02716, over 24520.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2387, pruned_loss=0.03997, over 4714466.65 frames. ], batch size: 63, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:28:58,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:28:59,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:28:59,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:01,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:29:03,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1218753.3333333333, ans=0.125 2023-10-03 09:29:04,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:29:04,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:29:04,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 09:29:07,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1218820.0, ans=0.2 2023-10-03 09:29:08,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 09:29:11,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:29:15,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 09:29:17,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:29:18,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:29:18,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:29:23,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:29:23,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 09:29:27,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:29:29,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:29:29,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 09:29:32,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:29:35,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:29:36,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:29:36,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:39,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 09:29:39,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:29:41,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 09:29:42,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:42,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:29:42,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:29:45,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:29:45,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:29:45,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:29:45,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:29:46,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:29:48,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:29:49,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:29:52,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 09:29:52,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1219020.0, ans=0.1 2023-10-03 09:29:53,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:29:57,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:29:58,900 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 09:30:00,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:30:00,397 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 09:30:01,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:30:02,431 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 09:30:03,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:30:03,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:30:06,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:30:07,817 INFO [train.py:1046] (3/4) Epoch 35, batch 2250, loss[loss=0.1567, simple_loss=0.2301, pruned_loss=0.04159, over 23803.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2393, pruned_loss=0.04032, over 4707794.57 frames. ], batch size: 164, lr: 2.92e-03, grad_scale: 8.0 2023-10-03 09:30:07,918 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 09:30:10,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:30:11,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:30:14,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1219086.6666666667, ans=0.0 2023-10-03 09:30:17,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:30:18,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:30:19,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1219086.6666666667, ans=0.0 2023-10-03 09:30:22,205 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.843e+02 1.976e+02 2.228e+02 2.990e+02, threshold=3.951e+02, percent-clipped=0.0 2023-10-03 09:30:23,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:30:23,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1219153.3333333333, ans=0.0 2023-10-03 09:30:24,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:30:26,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:30:26,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 09:30:26,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:30:28,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:30:29,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 09:30:29,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:30:29,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:30:31,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:30:38,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:30:38,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:30:40,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:30:41,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 09:30:43,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:30:44,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:30:47,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:30:48,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:30:50,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:30:50,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:30:53,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:30:53,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:30:56,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:30:59,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 09:31:03,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:31:04,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:31:05,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:31:06,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1219353.3333333333, ans=0.0 2023-10-03 09:31:12,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:31:15,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:31:15,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 09:31:15,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:31:17,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:31:18,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 09:31:21,142 INFO [train.py:1046] (3/4) Epoch 35, batch 2300, loss[loss=0.1532, simple_loss=0.2294, pruned_loss=0.03846, over 23576.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2394, pruned_loss=0.04015, over 4725157.02 frames. ], batch size: 134, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:31:21,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:31:22,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:31:27,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:31:27,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:31:30,589 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 09:31:32,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:31:38,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:31:38,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:31:38,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:31:39,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:31:39,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 09:31:41,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:31:42,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:31:43,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:31:46,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:31:48,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:31:48,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1219486.6666666667, ans=0.125 2023-10-03 09:31:52,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:31:57,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:31:57,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:32:00,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:32:03,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:32:06,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=1219620.0, ans=0.025 2023-10-03 09:32:07,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:32:08,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:32:08,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:32:08,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 09:32:13,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:32:13,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:32:15,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:32:15,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:32:15,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:32:15,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1219620.0, ans=0.125 2023-10-03 09:32:16,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 09:32:16,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:32:16,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 09:32:16,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:32:16,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:32:17,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 09:32:23,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1219686.6666666667, ans=0.0 2023-10-03 09:32:24,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:32:29,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:32:34,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:32:34,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:32:34,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:32:35,435 INFO [train.py:1046] (3/4) Epoch 35, batch 2350, loss[loss=0.1444, simple_loss=0.2293, pruned_loss=0.02973, over 24452.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2399, pruned_loss=0.04023, over 4728913.04 frames. ], batch size: 63, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:32:35,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:32:37,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:32:37,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:32:38,469 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.74 vs. limit=22.5 2023-10-03 09:32:38,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 09:32:44,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:32:44,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 09:32:49,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 09:32:50,632 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.929e+02 2.127e+02 2.368e+02 3.367e+02, threshold=4.254e+02, percent-clipped=0.0 2023-10-03 09:32:52,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:32:54,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:32:54,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:32:56,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:32:56,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:32:56,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1219820.0, ans=0.0 2023-10-03 09:32:57,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 09:33:00,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:33:05,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 09:33:05,904 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.31 vs. limit=15.0 2023-10-03 09:33:06,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1219886.6666666667, ans=0.1 2023-10-03 09:33:07,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:33:10,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:33:10,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:33:12,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:33:15,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 09:33:15,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:33:17,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:33:17,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:33:18,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:33:20,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1219953.3333333333, ans=0.0 2023-10-03 09:33:21,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:33:22,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 09:33:23,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:33:25,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:33:25,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:33:26,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 09:33:28,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:33:31,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 09:33:31,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:33:37,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 09:33:40,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 09:33:42,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:33:42,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 09:33:42,223 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 09:33:42,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 09:33:43,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 09:33:44,183 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.59 vs. limit=6.0 2023-10-03 09:33:46,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:33:48,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1220086.6666666667, ans=0.05 2023-10-03 09:33:49,757 INFO [train.py:1046] (3/4) Epoch 35, batch 2400, loss[loss=0.1393, simple_loss=0.22, pruned_loss=0.0293, over 21667.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2405, pruned_loss=0.04083, over 4707595.04 frames. ], batch size: 47, lr: 2.91e-03, grad_scale: 16.0 2023-10-03 09:33:49,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:33:55,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:33:56,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:33:56,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 09:33:56,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 09:34:03,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 09:34:03,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:34:05,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 09:34:05,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:34:07,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:34:07,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 09:34:13,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:34:16,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 09:34:20,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:34:21,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1220220.0, ans=0.5 2023-10-03 09:34:25,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 09:34:26,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:34:27,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:34:30,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1220220.0, ans=0.1 2023-10-03 09:34:32,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:34:32,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 09:34:32,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1220286.6666666667, ans=0.1 2023-10-03 09:34:32,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1220286.6666666667, ans=0.0 2023-10-03 09:34:34,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:34:38,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1220286.6666666667, ans=0.0 2023-10-03 09:34:41,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:34:41,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1220286.6666666667, ans=0.125 2023-10-03 09:34:42,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:34:44,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:34:45,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:34:45,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 09:34:46,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:34:46,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:34:46,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:34:46,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:34:51,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:34:53,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:34:53,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 09:34:53,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 09:34:55,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:34:55,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:34:55,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 09:34:56,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1220353.3333333333, ans=0.125 2023-10-03 09:34:57,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 09:34:57,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 09:34:57,307 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 09:34:57,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 09:35:00,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:35:01,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:35:01,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:35:03,237 INFO [train.py:1046] (3/4) Epoch 35, batch 2450, loss[loss=0.1579, simple_loss=0.2173, pruned_loss=0.04925, over 23319.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2394, pruned_loss=0.04037, over 4715643.24 frames. ], batch size: 285, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:35:03,282 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 09:35:04,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:35:04,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:35:08,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:35:09,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:35:11,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:11,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:35:12,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 09:35:16,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:35:16,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:19,800 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.894e+02 2.090e+02 2.378e+02 3.280e+02, threshold=4.179e+02, percent-clipped=0.0 2023-10-03 09:35:21,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:35:21,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:35:21,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:35:21,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 09:35:26,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:27,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:35:29,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:35:31,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.03 vs. limit=6.0 2023-10-03 09:35:32,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:35:34,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:35:34,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:35:36,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:35:37,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 09:35:39,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:35:46,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:35:47,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:35:47,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:35:47,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:35:48,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:35:49,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:35:50,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 09:35:53,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:35:55,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:35:56,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:35:56,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:36:02,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:36:02,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 09:36:02,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:36:04,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:36:04,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 09:36:05,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:36:06,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:36:11,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:36:12,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:36:14,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:36:16,951 INFO [train.py:1046] (3/4) Epoch 35, batch 2500, loss[loss=0.1484, simple_loss=0.2276, pruned_loss=0.0346, over 24618.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2386, pruned_loss=0.04, over 4711171.00 frames. ], batch size: 60, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:36:17,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 09:36:18,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:36:19,092 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.70 vs. limit=22.5 2023-10-03 09:36:23,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1220753.3333333333, ans=0.125 2023-10-03 09:36:24,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:36:27,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.47 vs. limit=15.0 2023-10-03 09:36:33,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:36:33,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:36:34,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:36:34,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 09:36:42,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:36:42,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:36:43,326 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.67 vs. limit=12.0 2023-10-03 09:36:44,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 09:36:44,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 09:36:44,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 09:36:45,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:36:46,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:36:46,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 09:36:46,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:36:48,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 09:36:49,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:36:52,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:36:52,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:36:55,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:36:55,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 09:36:55,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:36:58,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:37:01,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:06,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:08,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:37:13,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:37:16,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 09:37:16,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:37:16,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:37:20,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:37:20,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:37:20,966 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 09:37:20,967 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 09:37:20,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 09:37:25,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:37:26,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 09:37:27,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 09:37:27,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:37:29,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 09:37:30,704 INFO [train.py:1046] (3/4) Epoch 35, batch 2550, loss[loss=0.1519, simple_loss=0.2334, pruned_loss=0.03515, over 24667.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2385, pruned_loss=0.04031, over 4701832.02 frames. ], batch size: 65, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:37:32,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 09:37:35,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:37:36,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:37:36,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:37:37,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:37:39,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 09:37:39,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:37:40,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1221086.6666666667, ans=0.2 2023-10-03 09:37:42,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 09:37:44,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:37:46,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:47,471 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.892e+02 2.118e+02 2.497e+02 3.276e+02, threshold=4.237e+02, percent-clipped=0.0 2023-10-03 09:37:49,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:37:49,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 09:37:50,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:37:50,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:37:52,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:37:53,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:37:53,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 09:37:53,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 09:37:53,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:37:53,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 09:38:05,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:38:08,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:38:09,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:38:09,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:38:11,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:38:16,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:38:18,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:38:18,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:38:18,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:38:19,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:38:20,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:38:22,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1221286.6666666667, ans=0.035 2023-10-03 09:38:23,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:38:23,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:38:23,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1221286.6666666667, ans=0.0 2023-10-03 09:38:30,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:38:30,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 09:38:30,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:38:30,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:38:31,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:38:32,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:38:33,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:38:36,142 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.67 vs. limit=22.5 2023-10-03 09:38:41,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:38:41,755 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.18 vs. limit=22.5 2023-10-03 09:38:41,919 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.45 vs. limit=10.0 2023-10-03 09:38:42,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:38:45,748 INFO [train.py:1046] (3/4) Epoch 35, batch 2600, loss[loss=0.1667, simple_loss=0.2498, pruned_loss=0.04178, over 24491.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2394, pruned_loss=0.04064, over 4713204.15 frames. ], batch size: 66, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:38:45,894 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 09:38:48,614 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 09:38:48,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:38:49,934 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 09:38:50,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 09:38:50,019 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 09:38:53,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:38:53,341 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 09:38:54,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 09:38:56,041 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 09:38:58,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:39:00,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 09:39:02,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 09:39:04,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:39:06,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 09:39:07,511 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 09:39:07,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 09:39:15,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:39:15,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:39:15,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:39:15,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 09:39:17,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:39:22,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1221553.3333333333, ans=0.1 2023-10-03 09:39:23,831 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 09:39:26,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1221553.3333333333, ans=0.125 2023-10-03 09:39:29,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:39:29,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:39:31,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 09:39:31,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:39:31,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:39:32,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 09:39:32,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1221620.0, ans=0.125 2023-10-03 09:39:34,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:39:34,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:39:34,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1221620.0, ans=0.05 2023-10-03 09:39:36,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:39:39,660 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.26 vs. limit=6.0 2023-10-03 09:39:41,566 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 09:39:41,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:39:41,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:39:45,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1221686.6666666667, ans=0.125 2023-10-03 09:39:47,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:39:47,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:39:48,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 09:39:49,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:39:51,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:39:52,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:39:57,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1221686.6666666667, ans=0.0 2023-10-03 09:39:58,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 09:39:58,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:00,028 INFO [train.py:1046] (3/4) Epoch 35, batch 2650, loss[loss=0.1641, simple_loss=0.2456, pruned_loss=0.04129, over 23358.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2396, pruned_loss=0.04034, over 4725470.37 frames. ], batch size: 93, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:40:00,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:40:04,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 09:40:04,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:05,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:40:07,013 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 09:40:07,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:40:08,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:11,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 09:40:13,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:40:14,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:40:15,675 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.893e+02 2.177e+02 2.429e+02 3.374e+02, threshold=4.354e+02, percent-clipped=0.0 2023-10-03 09:40:15,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 09:40:15,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:40:15,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:40:19,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 09:40:22,551 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 09:40:25,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:40:26,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 09:40:28,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:40:28,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 09:40:31,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:40:31,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:40:31,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:40:32,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:40:36,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 09:40:36,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 09:40:37,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1221886.6666666667, ans=0.1 2023-10-03 09:40:41,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:40:45,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 09:40:45,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:40:46,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:40:46,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:40:46,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:40:48,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:40:48,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1221953.3333333333, ans=0.04949747468305833 2023-10-03 09:40:51,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:40:51,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:40:54,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:40:54,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:40:55,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:40:57,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:40:57,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:40:58,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:40:58,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:41:00,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:41:02,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:03,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:41:03,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:41:03,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 09:41:08,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:41:10,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:11,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:13,655 INFO [train.py:1046] (3/4) Epoch 35, batch 2700, loss[loss=0.1508, simple_loss=0.2323, pruned_loss=0.03463, over 24329.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2404, pruned_loss=0.04058, over 4732474.63 frames. ], batch size: 61, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:41:13,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:13,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 09:41:13,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:16,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:41:16,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 09:41:18,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:41:19,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 09:41:22,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:41:22,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:22,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:25,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:41:25,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:41:25,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:41:26,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 09:41:26,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 09:41:28,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:41:29,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:41:29,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:41:29,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:41:33,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:41:34,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 09:41:34,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:41:39,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1222153.3333333333, ans=0.125 2023-10-03 09:41:40,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:41:40,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:41:45,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:41:45,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:41:45,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:41:45,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:41:49,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:41:51,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:41:52,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:41:52,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:41:55,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:41:55,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:42:04,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:42:06,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:42:09,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:42:09,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:10,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:42:11,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:42:12,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:42:13,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:15,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:42:17,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:42:19,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:42:20,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1222353.3333333333, ans=0.0 2023-10-03 09:42:20,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1222353.3333333333, ans=10.0 2023-10-03 09:42:21,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:42:21,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:42:23,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 09:42:23,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:27,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:42:27,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 09:42:28,558 INFO [train.py:1046] (3/4) Epoch 35, batch 2750, loss[loss=0.1456, simple_loss=0.2223, pruned_loss=0.03446, over 24317.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2409, pruned_loss=0.04082, over 4730769.48 frames. ], batch size: 56, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:42:28,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 09:42:28,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:33,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:42:33,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:42:34,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:34,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:42:35,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:37,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1222420.0, ans=0.125 2023-10-03 09:42:38,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:42:38,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:42:38,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:42:38,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:38,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 09:42:38,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:42:39,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:42:40,491 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.26 vs. limit=15.0 2023-10-03 09:42:44,657 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.884e+02 2.035e+02 2.268e+02 3.504e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-03 09:42:44,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1222486.6666666667, ans=0.0 2023-10-03 09:42:46,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 09:42:47,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.55 vs. limit=15.0 2023-10-03 09:42:47,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:42:49,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:42:50,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:42:50,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 09:42:52,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:42:53,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:42:53,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:42:53,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:42:58,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 09:42:59,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 09:42:59,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:42:59,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:43:01,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:43:02,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1222553.3333333333, ans=0.95 2023-10-03 09:43:08,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:43:11,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:43:11,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:14,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:43:14,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:43:14,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:43:19,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:43:20,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:43:20,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 09:43:25,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:26,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 09:43:33,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 09:43:34,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:43:34,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 09:43:36,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:43:38,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:43:38,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 09:43:38,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:43:42,687 INFO [train.py:1046] (3/4) Epoch 35, batch 2800, loss[loss=0.1699, simple_loss=0.2592, pruned_loss=0.04035, over 24033.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.24, pruned_loss=0.04069, over 4729821.45 frames. ], batch size: 80, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:43:42,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 09:43:42,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:43:42,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:43:44,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 09:43:44,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:43:44,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:47,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:43:47,783 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 09:43:47,785 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 09:43:51,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:43:52,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:43:52,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:43:55,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:43:56,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 09:43:58,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 09:44:00,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 09:44:02,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:44:03,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:44:03,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:44:03,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1222820.0, ans=0.125 2023-10-03 09:44:06,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:44:07,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:44:07,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:44:08,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:44:17,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:44:18,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1222886.6666666667, ans=0.0 2023-10-03 09:44:19,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:44:19,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1222886.6666666667, ans=0.125 2023-10-03 09:44:21,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:44:21,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:44:22,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:44:22,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1222886.6666666667, ans=0.0 2023-10-03 09:44:27,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1222953.3333333333, ans=0.125 2023-10-03 09:44:28,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:44:28,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 09:44:28,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:44:29,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:44:29,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:44:31,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1222953.3333333333, ans=0.125 2023-10-03 09:44:32,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:44:34,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:44:37,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:44:40,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:44:40,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:44:40,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:44:40,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 09:44:40,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:44:40,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1222953.3333333333, ans=0.2 2023-10-03 09:44:43,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:44:43,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 09:44:43,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:44:44,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:44:44,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:44:44,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 09:44:46,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:44:46,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:44:46,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:44:47,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 09:44:48,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1223020.0, ans=0.0 2023-10-03 09:44:53,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:44:53,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:44:53,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:44:55,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1223020.0, ans=0.2 2023-10-03 09:44:56,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:44:58,077 INFO [train.py:1046] (3/4) Epoch 35, batch 2850, loss[loss=0.1591, simple_loss=0.2343, pruned_loss=0.0419, over 23800.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2389, pruned_loss=0.04018, over 4714390.08 frames. ], batch size: 179, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:45:00,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:45:00,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:45:01,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:45:04,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:45:06,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:45:07,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:45:07,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 09:45:09,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1223086.6666666667, ans=0.125 2023-10-03 09:45:13,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 09:45:13,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:45:15,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 09:45:16,441 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.902e+02 2.080e+02 2.462e+02 6.971e+02, threshold=4.161e+02, percent-clipped=1.0 2023-10-03 09:45:16,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:18,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1223153.3333333333, ans=0.125 2023-10-03 09:45:19,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 09:45:20,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 09:45:20,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:25,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1223153.3333333333, ans=0.05 2023-10-03 09:45:33,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:45:35,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:45:35,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:45:37,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 09:45:37,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 09:45:37,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:45:40,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:45:40,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 09:45:43,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:45:43,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:45:43,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:45:44,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:46,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:45:46,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:45:47,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:45:47,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:45:48,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1223286.6666666667, ans=10.0 2023-10-03 09:45:50,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:45:51,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:45:53,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:45:54,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:45:55,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1223286.6666666667, ans=0.2 2023-10-03 09:45:59,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:46:00,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 09:46:01,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 09:46:03,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1223353.3333333333, ans=0.1 2023-10-03 09:46:03,885 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.94 vs. limit=15.0 2023-10-03 09:46:04,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 09:46:04,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:46:04,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 09:46:06,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:46:06,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:46:06,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:46:06,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:46:06,528 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 09:46:07,818 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 09:46:07,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:46:09,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:46:12,867 INFO [train.py:1046] (3/4) Epoch 35, batch 2900, loss[loss=0.1534, simple_loss=0.229, pruned_loss=0.03889, over 23476.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2394, pruned_loss=0.04002, over 4723622.26 frames. ], batch size: 285, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:46:14,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:46:14,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:46:14,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:46:15,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 09:46:21,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:46:21,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 09:46:21,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 09:46:22,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 09:46:22,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:46:24,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:46:26,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:46:29,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 09:46:29,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:46:32,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff2.min_abs, batch_count=1223486.6666666667, ans=0.1 2023-10-03 09:46:33,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:46:33,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 09:46:33,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:46:34,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:46:36,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 09:46:36,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 09:46:39,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:46:39,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 09:46:39,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:46:43,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:46:43,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 09:46:45,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:46:45,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:46:49,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:46:52,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:46:55,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 09:46:55,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 09:46:55,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:47:00,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:47:01,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 09:47:01,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1223620.0, ans=0.125 2023-10-03 09:47:03,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 09:47:08,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:47:18,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:47:18,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 09:47:19,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 09:47:19,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1223686.6666666667, ans=0.0 2023-10-03 09:47:22,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:22,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 09:47:22,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:47:22,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:47:26,630 INFO [train.py:1046] (3/4) Epoch 35, batch 2950, loss[loss=0.1613, simple_loss=0.234, pruned_loss=0.04427, over 23615.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2397, pruned_loss=0.04031, over 4722333.43 frames. ], batch size: 256, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:47:29,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:47:31,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 09:47:31,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:47:31,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:31,573 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:47:32,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:47:34,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:47:35,071 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=12.0 2023-10-03 09:47:35,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 09:47:36,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 09:47:38,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:47:38,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:47:44,988 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.855e+02 2.022e+02 2.286e+02 3.734e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 09:47:46,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:47:48,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:47:49,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:47:49,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:47:52,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:47:52,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:47:53,043 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:47:54,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:54,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:47:54,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:47:57,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 09:48:02,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 09:48:02,995 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 09:48:04,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:48:04,421 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 09:48:07,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 09:48:07,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:48:07,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:48:07,074 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 09:48:07,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 09:48:07,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1223886.6666666667, ans=0.0 2023-10-03 09:48:09,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 09:48:09,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:48:09,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 09:48:13,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:48:13,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1223953.3333333333, ans=0.125 2023-10-03 09:48:13,837 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:48:14,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:48:14,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:48:16,282 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 09:48:16,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:48:16,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 09:48:22,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:48:23,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:48:23,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 09:48:23,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:48:25,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 09:48:25,994 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.00 vs. limit=15.0 2023-10-03 09:48:28,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:48:28,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:48:28,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:48:31,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:48:31,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 09:48:32,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:48:32,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:48:32,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:48:34,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:48:34,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:48:35,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:48:35,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1224020.0, ans=0.125 2023-10-03 09:48:37,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:48:37,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 09:48:38,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:48:41,555 INFO [train.py:1046] (3/4) Epoch 35, batch 3000, loss[loss=0.1564, simple_loss=0.2306, pruned_loss=0.04111, over 23865.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2408, pruned_loss=0.04098, over 4727250.61 frames. ], batch size: 212, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:48:41,556 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 09:48:53,196 INFO [train.py:1078] (3/4) Epoch 35, validation: loss=0.3596, simple_loss=0.2732, pruned_loss=0.223, over 1125622.00 frames. 2023-10-03 09:48:53,196 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 09:48:53,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:48:53,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:48:56,731 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 09:48:56,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 09:48:59,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:48:59,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:49:01,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 09:49:01,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:49:05,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1224086.6666666667, ans=0.125 2023-10-03 09:49:09,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:49:10,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.78 vs. limit=12.0 2023-10-03 09:49:19,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:49:24,039 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=15.0 2023-10-03 09:49:27,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 09:49:28,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:49:33,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:49:33,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:49:34,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:49:36,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:49:36,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 09:49:36,376 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 09:49:36,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1224286.6666666667, ans=0.125 2023-10-03 09:49:37,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 09:49:38,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:49:38,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:49:40,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:49:40,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:49:40,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:49:40,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:49:45,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 09:49:47,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:49:47,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:49:48,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:49:51,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 09:49:51,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 09:49:51,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:49:51,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:49:55,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:49:55,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:49:58,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 09:49:58,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 09:49:58,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:49:59,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 09:49:59,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:50:03,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 09:50:07,180 INFO [train.py:1046] (3/4) Epoch 35, batch 3050, loss[loss=0.1804, simple_loss=0.2587, pruned_loss=0.05107, over 23545.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2412, pruned_loss=0.04077, over 4728010.54 frames. ], batch size: 120, lr: 2.91e-03, grad_scale: 4.0 2023-10-03 09:50:07,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:50:08,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 09:50:08,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 09:50:08,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 09:50:08,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 09:50:10,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:50:10,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:50:10,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 09:50:10,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:11,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:50:14,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 09:50:17,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:50:19,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:50:19,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:50:23,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:26,934 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.384e+02 1.939e+02 2.109e+02 2.349e+02 4.315e+02, threshold=4.217e+02, percent-clipped=1.0 2023-10-03 09:50:27,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 09:50:31,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 09:50:31,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 09:50:31,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:50:31,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1224486.6666666667, ans=0.0 2023-10-03 09:50:34,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 09:50:38,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:38,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:50:38,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:50:40,622 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.95 vs. limit=15.0 2023-10-03 09:50:41,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:50:41,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 09:50:42,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:50:42,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:50:42,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:50:43,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:45,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:50:46,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:50:46,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1224553.3333333333, ans=0.125 2023-10-03 09:50:48,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 09:50:49,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:50:49,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 09:50:53,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:50:53,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 09:50:54,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:50:54,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:50:58,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:51:00,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:51:04,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:06,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:51:06,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:51:07,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:51:07,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 09:51:09,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 09:51:10,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 09:51:11,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:51:11,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:13,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 09:51:14,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:51:20,347 INFO [train.py:1046] (3/4) Epoch 35, batch 3100, loss[loss=0.1438, simple_loss=0.2294, pruned_loss=0.02906, over 24308.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2401, pruned_loss=0.04022, over 4734995.63 frames. ], batch size: 61, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:51:20,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:51:22,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 09:51:25,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 09:51:26,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 09:51:26,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1224753.3333333333, ans=0.125 2023-10-03 09:51:29,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 09:51:29,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1224753.3333333333, ans=0.125 2023-10-03 09:51:30,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 09:51:31,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 09:51:33,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1224820.0, ans=0.125 2023-10-03 09:51:35,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:51:35,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1224820.0, ans=0.0 2023-10-03 09:51:36,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:38,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:51:38,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1224820.0, ans=0.2 2023-10-03 09:51:41,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:46,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 09:51:51,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 09:51:53,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:51:53,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:51:53,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:51:54,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 09:51:56,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:51:56,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 09:51:56,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:51:57,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:51:58,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 09:52:00,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:52:04,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:52:04,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 09:52:07,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 09:52:07,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:07,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:52:09,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:09,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:10,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:52:10,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:52:10,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:52:13,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:52:13,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:52:13,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:14,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 09:52:17,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:52:18,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 09:52:19,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.46 vs. limit=15.0 2023-10-03 09:52:21,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:52:22,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 09:52:22,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:24,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:24,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 09:52:24,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1225020.0, ans=0.0 2023-10-03 09:52:34,872 INFO [train.py:1046] (3/4) Epoch 35, batch 3150, loss[loss=0.1666, simple_loss=0.2536, pruned_loss=0.03978, over 24471.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2387, pruned_loss=0.04012, over 4737080.86 frames. ], batch size: 69, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:52:34,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 09:52:37,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:37,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:52:39,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:52:39,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:52:39,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 09:52:40,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:40,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 09:52:40,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 09:52:42,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:43,666 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 09:52:46,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 09:52:46,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:52:46,614 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 09:52:48,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 09:52:49,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 09:52:50,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 09:52:50,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 09:52:50,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:50,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:52:51,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:52:52,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 09:52:54,302 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.875e+02 2.111e+02 2.412e+02 4.030e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 09:52:57,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:57,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:52:57,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:52:57,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1225153.3333333333, ans=0.0 2023-10-03 09:52:58,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 09:53:03,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 09:53:04,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:53:06,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 09:53:06,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:53:06,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 09:53:08,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 09:53:08,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:53:10,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 09:53:10,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 09:53:10,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:53:10,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 09:53:11,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 09:53:11,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 09:53:13,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 09:53:13,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 09:53:14,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:17,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:53:17,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:53:19,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 09:53:19,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:53:22,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 09:53:22,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:22,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 09:53:23,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 09:53:24,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:53:24,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:53:25,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 09:53:27,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 09:53:27,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:53:29,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:53:29,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1225286.6666666667, ans=0.0 2023-10-03 09:53:31,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:32,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:53:35,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:53:36,067 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.84 vs. limit=10.0 2023-10-03 09:53:36,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:38,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 09:53:42,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:53:42,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 09:53:47,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:53:50,278 INFO [train.py:1046] (3/4) Epoch 35, batch 3200, loss[loss=0.1445, simple_loss=0.2267, pruned_loss=0.03113, over 24340.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2367, pruned_loss=0.03996, over 4714146.59 frames. ], batch size: 61, lr: 2.91e-03, grad_scale: 16.0 2023-10-03 09:53:50,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:53:50,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 09:53:53,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:53:56,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:54:01,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:54:09,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:54:12,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1225486.6666666667, ans=0.0 2023-10-03 09:54:19,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 09:54:19,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:54:22,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 09:54:22,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 09:54:26,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:54:26,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:54:26,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1225553.3333333333, ans=0.125 2023-10-03 09:54:28,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:54:28,824 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.41 vs. limit=12.0 2023-10-03 09:54:31,771 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.91 vs. limit=10.0 2023-10-03 09:54:32,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 09:54:33,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 09:54:35,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 09:54:37,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 09:54:37,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1225620.0, ans=0.1 2023-10-03 09:54:39,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 09:54:46,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:54:46,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 09:54:46,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:54:46,208 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 09:54:46,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 09:54:50,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:54:52,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 09:54:52,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 09:54:53,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 09:54:53,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 09:54:56,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 09:54:59,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:54:59,715 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 09:54:59,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:54:59,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:01,093 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 09:55:03,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1225753.3333333333, ans=0.125 2023-10-03 09:55:04,072 INFO [train.py:1046] (3/4) Epoch 35, batch 3250, loss[loss=0.1644, simple_loss=0.2387, pruned_loss=0.04502, over 23782.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.238, pruned_loss=0.04007, over 4719283.83 frames. ], batch size: 179, lr: 2.91e-03, grad_scale: 16.0 2023-10-03 09:55:05,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 09:55:09,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:55:15,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:55:15,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 09:55:16,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:55:18,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:55:18,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:55:20,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:55:20,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 09:55:23,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:23,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 09:55:24,721 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 2.063e+02 2.296e+02 2.650e+02 3.939e+02, threshold=4.593e+02, percent-clipped=0.0 2023-10-03 09:55:24,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:55:24,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:24,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:24,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:55:27,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1225820.0, ans=0.04949747468305833 2023-10-03 09:55:29,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:55:30,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 09:55:32,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:55:32,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:55:33,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:55:35,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:55:35,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:55:39,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 09:55:41,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:55:41,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 09:55:42,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:55:42,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 09:55:44,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1225886.6666666667, ans=0.0 2023-10-03 09:55:48,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:55:48,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1225953.3333333333, ans=0.125 2023-10-03 09:55:54,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:55:56,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:55:56,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 09:55:56,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 09:55:56,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 09:55:56,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:00,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 09:56:00,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 09:56:00,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:56:02,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:56:02,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:56:02,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 09:56:03,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:56:08,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:56:08,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:56:08,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1226020.0, ans=0.125 2023-10-03 09:56:09,222 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.93 vs. limit=22.5 2023-10-03 09:56:09,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 09:56:09,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:56:11,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.11 vs. limit=15.0 2023-10-03 09:56:12,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 09:56:12,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 09:56:15,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:56:15,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 09:56:18,033 INFO [train.py:1046] (3/4) Epoch 35, batch 3300, loss[loss=0.1494, simple_loss=0.2367, pruned_loss=0.03103, over 24307.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2391, pruned_loss=0.04028, over 4720335.25 frames. ], batch size: 61, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:56:18,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 09:56:19,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 09:56:19,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:56:22,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:56:24,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:56:24,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:25,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 09:56:25,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 09:56:28,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:56:30,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:56:33,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1226153.3333333333, ans=10.0 2023-10-03 09:56:35,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 09:56:35,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:56:37,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:56:38,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:40,057 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 09:56:40,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1226153.3333333333, ans=0.04949747468305833 2023-10-03 09:56:41,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:56:41,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 09:56:42,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 09:56:42,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:56:44,148 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 09:56:48,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:56:48,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 09:56:50,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:50,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 09:56:52,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 09:56:52,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:56:53,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 09:56:53,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1226220.0, ans=0.125 2023-10-03 09:56:56,280 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 09:56:57,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 09:56:58,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:57:01,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 09:57:02,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:57:02,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1226286.6666666667, ans=0.1 2023-10-03 09:57:04,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 09:57:05,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:57:09,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:09,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:57:09,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:57:09,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:57:11,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 09:57:11,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:57:12,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:57:13,470 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 09:57:13,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 09:57:16,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 09:57:17,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:57:17,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:57:17,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1226353.3333333333, ans=0.09899494936611666 2023-10-03 09:57:19,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:57:19,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:57:20,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:57:20,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:20,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 09:57:21,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:57:24,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 09:57:27,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 09:57:29,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:57:30,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:31,605 INFO [train.py:1046] (3/4) Epoch 35, batch 3350, loss[loss=0.163, simple_loss=0.2383, pruned_loss=0.04388, over 23698.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2398, pruned_loss=0.04037, over 4722123.98 frames. ], batch size: 135, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:57:33,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 09:57:33,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:57:34,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:35,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 09:57:35,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:57:38,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1226420.0, ans=0.0 2023-10-03 09:57:39,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:57:39,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:57:41,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 09:57:43,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:44,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:57:44,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1226420.0, ans=0.0 2023-10-03 09:57:45,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:47,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 09:57:48,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 09:57:49,880 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 09:57:49,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:57:52,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 09:57:52,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 09:57:53,949 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.867e+02 2.273e+02 2.647e+02 3.558e+02, threshold=4.546e+02, percent-clipped=0.0 2023-10-03 09:57:54,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 09:57:54,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 09:57:55,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:57:55,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 09:57:55,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:57:56,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 09:57:56,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:58:00,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:00,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:58:01,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:58:04,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:08,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:08,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:12,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 09:58:12,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:58:14,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:15,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:17,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:19,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 09:58:19,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 09:58:19,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 09:58:19,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 09:58:21,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 09:58:22,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:24,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:58:24,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1226620.0, ans=0.0 2023-10-03 09:58:32,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:33,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 09:58:34,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:58:35,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 09:58:38,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:58:41,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:58:43,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 09:58:45,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 09:58:46,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 09:58:47,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:58:48,891 INFO [train.py:1046] (3/4) Epoch 35, batch 3400, loss[loss=0.1718, simple_loss=0.2533, pruned_loss=0.04519, over 23528.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2408, pruned_loss=0.04067, over 4732337.24 frames. ], batch size: 93, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 09:58:48,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 09:58:48,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:58:49,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 09:58:50,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:58:51,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 09:58:51,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 09:58:53,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 09:58:53,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 09:58:57,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 09:58:57,317 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 09:58:57,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:59:01,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 09:59:01,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 09:59:01,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:59:01,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 09:59:08,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:59:08,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 09:59:14,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1226820.0, ans=0.125 2023-10-03 09:59:16,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 09:59:18,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:59:19,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:59:20,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 09:59:20,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1226886.6666666667, ans=0.0 2023-10-03 09:59:27,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 09:59:27,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1226886.6666666667, ans=0.0 2023-10-03 09:59:30,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 09:59:35,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:59:36,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 09:59:36,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 09:59:36,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 09:59:36,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1226953.3333333333, ans=0.125 2023-10-03 09:59:37,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 09:59:37,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 09:59:39,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 09:59:42,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 09:59:44,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1226953.3333333333, ans=0.125 2023-10-03 09:59:45,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 09:59:45,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 09:59:50,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 09:59:51,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 09:59:57,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:00:02,665 INFO [train.py:1046] (3/4) Epoch 35, batch 3450, loss[loss=0.155, simple_loss=0.2367, pruned_loss=0.03664, over 23387.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2407, pruned_loss=0.0407, over 4722199.88 frames. ], batch size: 119, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 10:00:02,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 10:00:07,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 10:00:08,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:00:08,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:00:08,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 10:00:10,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:00:15,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:00:15,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1227086.6666666667, ans=0.125 2023-10-03 10:00:18,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:00:18,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:00:19,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:00:19,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:00:23,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:00:27,177 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.826e+02 1.965e+02 2.200e+02 4.257e+02, threshold=3.929e+02, percent-clipped=0.0 2023-10-03 10:00:28,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 10:00:31,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 10:00:31,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1227220.0, ans=0.125 2023-10-03 10:00:31,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1227220.0, ans=0.125 2023-10-03 10:00:32,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:00:33,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:00:34,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:00:40,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 10:00:42,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:00:46,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:00:46,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:00:49,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:00:50,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:00:52,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 10:00:52,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:00:55,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:00:56,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:00:56,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1227286.6666666667, ans=0.0 2023-10-03 10:00:59,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 10:01:02,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:01:07,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:01:08,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:12,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:01:15,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:15,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:01:17,066 INFO [train.py:1046] (3/4) Epoch 35, batch 3500, loss[loss=0.1518, simple_loss=0.2278, pruned_loss=0.03791, over 24642.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2394, pruned_loss=0.04043, over 4720721.28 frames. ], batch size: 65, lr: 2.91e-03, grad_scale: 8.0 2023-10-03 10:01:17,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:01:19,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:01:21,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:01:24,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:01:25,161 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.56 vs. limit=15.0 2023-10-03 10:01:26,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 10:01:27,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:01:30,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:01:30,831 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:01:33,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:01:33,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 10:01:33,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1227486.6666666667, ans=0.5 2023-10-03 10:01:33,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1227486.6666666667, ans=0.035 2023-10-03 10:01:39,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:01:39,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:01:39,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:01:39,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:01:41,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:01:41,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:42,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:01:42,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 10:01:45,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:45,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:01:46,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:01:50,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:50,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 10:01:51,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:01:53,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:01:54,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:01:55,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:01:58,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:01:59,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:02:01,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 10:02:01,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 10:02:02,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 10:02:02,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:02:04,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:02:05,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:02:05,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:02:10,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 10:02:10,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:02:14,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:02:16,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 10:02:16,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 10:02:16,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:02:19,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:02:20,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:02:22,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:02:23,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 10:02:23,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:02:25,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:02:25,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 10:02:26,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1227686.6666666667, ans=0.125 2023-10-03 10:02:28,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 10:02:31,061 INFO [train.py:1046] (3/4) Epoch 35, batch 3550, loss[loss=0.1505, simple_loss=0.2233, pruned_loss=0.03878, over 23876.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2376, pruned_loss=0.0401, over 4719999.72 frames. ], batch size: 195, lr: 2.90e-03, grad_scale: 4.0 2023-10-03 10:02:31,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:02:32,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:02:32,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:02:32,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:02:36,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1227753.3333333333, ans=0.0 2023-10-03 10:02:37,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:02:44,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:02:46,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 10:02:49,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:02:50,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:02:51,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1227820.0, ans=0.125 2023-10-03 10:02:52,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:02:52,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:02:52,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:02:56,325 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.952e+02 2.132e+02 2.407e+02 4.209e+02, threshold=4.264e+02, percent-clipped=1.0 2023-10-03 10:02:56,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:02:56,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:02:57,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:02:57,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 10:02:59,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:03:03,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:03:05,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:03:07,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:03:07,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:03:07,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:03:07,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 10:03:07,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:03:09,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:03:11,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 10:03:17,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:03:17,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:03:18,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:03:19,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 10:03:19,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:03:20,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1227953.3333333333, ans=0.125 2023-10-03 10:03:23,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 10:03:23,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:03:26,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:03:26,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:03:28,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1227953.3333333333, ans=0.0 2023-10-03 10:03:29,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 10:03:30,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:03:31,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1228020.0, ans=0.1 2023-10-03 10:03:35,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:03:35,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 10:03:35,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:03:40,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:03:41,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 10:03:45,621 INFO [train.py:1046] (3/4) Epoch 35, batch 3600, loss[loss=0.1334, simple_loss=0.2125, pruned_loss=0.02713, over 22049.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2377, pruned_loss=0.04014, over 4712841.53 frames. ], batch size: 48, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:03:46,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 10:03:47,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:03:48,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:03:49,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:03:49,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:03:51,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1228086.6666666667, ans=0.2 2023-10-03 10:03:52,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:03:55,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:03:56,582 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=12.0 2023-10-03 10:03:57,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:03:58,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:03:58,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:04:00,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:04:00,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 10:04:00,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1228153.3333333333, ans=0.125 2023-10-03 10:04:02,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:04:03,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:04:04,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:04:09,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:04:09,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:04:10,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:04:12,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 10:04:12,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:04:15,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:04:16,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:04:18,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:04:18,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:04:19,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:04:20,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 10:04:29,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:04:29,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:04:29,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1228286.6666666667, ans=0.125 2023-10-03 10:04:31,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 10:04:34,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1228286.6666666667, ans=0.125 2023-10-03 10:04:35,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:04:39,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:04:43,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:04:49,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:04:49,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:04:49,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 10:04:50,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 10:04:52,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 10:04:53,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:04:53,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:04:56,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 10:04:56,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:04:56,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:04:56,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:04:58,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 10:04:58,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 10:04:59,417 INFO [train.py:1046] (3/4) Epoch 35, batch 3650, loss[loss=0.1624, simple_loss=0.24, pruned_loss=0.04236, over 23687.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2384, pruned_loss=0.0402, over 4709733.08 frames. ], batch size: 164, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:05:00,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:05:02,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 10:05:08,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 10:05:09,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:05:14,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 10:05:14,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 10:05:19,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:05:19,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:05:19,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:05:22,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 10:05:22,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:05:22,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 10:05:24,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:05:24,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:05:24,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 10:05:25,446 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.828e+02 1.992e+02 2.156e+02 3.543e+02, threshold=3.984e+02, percent-clipped=0.0 2023-10-03 10:05:25,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 10:05:26,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:05:26,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:05:27,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1228486.6666666667, ans=0.125 2023-10-03 10:05:27,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1228486.6666666667, ans=0.1 2023-10-03 10:05:28,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:05:31,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 10:05:31,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 10:05:31,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:05:34,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 10:05:35,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:05:36,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:05:43,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:05:44,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:05:44,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:05:45,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:05:46,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:05:48,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:05:52,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:05:52,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:05:52,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:05:53,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:05:53,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1228620.0, ans=0.2 2023-10-03 10:05:55,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:05:55,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:01,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1228686.6666666667, ans=0.2 2023-10-03 10:06:02,494 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 10:06:07,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:06:07,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:07,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:06:08,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:06:10,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:06:11,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:06:11,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 10:06:11,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:06:14,645 INFO [train.py:1046] (3/4) Epoch 35, batch 3700, loss[loss=0.1673, simple_loss=0.2517, pruned_loss=0.04144, over 24388.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2397, pruned_loss=0.04014, over 4717343.32 frames. ], batch size: 77, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:06:14,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:06:17,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:06:17,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:06:20,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:06:20,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 10:06:20,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:06:22,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:06:22,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:06:26,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:06:28,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:06:29,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:06:31,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:06:31,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:06:32,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 10:06:35,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:06:36,528 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 10:06:45,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:06:45,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 10:06:47,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:06:47,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 10:06:47,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1228886.6666666667, ans=0.125 2023-10-03 10:06:48,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:06:49,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:51,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 10:06:53,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:54,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:06:56,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:06:56,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:06:59,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:07:02,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:07:02,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 10:07:03,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:07:03,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 10:07:07,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:07:07,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:07:12,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:07:13,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 10:07:14,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1229020.0, ans=0.125 2023-10-03 10:07:16,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:07:16,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 10:07:16,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:07:16,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:07:20,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:07:21,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 10:07:21,668 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.21 vs. limit=22.5 2023-10-03 10:07:22,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 10:07:24,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:07:24,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:07:25,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:07:25,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:07:28,305 INFO [train.py:1046] (3/4) Epoch 35, batch 3750, loss[loss=0.1709, simple_loss=0.2416, pruned_loss=0.05013, over 23783.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2407, pruned_loss=0.04055, over 4719759.24 frames. ], batch size: 195, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:07:28,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:07:29,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:07:31,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:07:33,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 10:07:34,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 10:07:37,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:07:37,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 10:07:38,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:07:40,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:07:41,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:07:43,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:07:46,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:07:48,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:07:48,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:07:52,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:07:53,851 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.913e+02 2.089e+02 2.337e+02 3.206e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-03 10:07:55,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:07:55,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 10:07:56,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:07:58,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:07:58,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:08:01,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 10:08:04,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 10:08:05,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:08:05,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:08:07,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:08:11,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:08:14,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 10:08:17,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 10:08:20,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:08:24,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:08:25,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:08:26,449 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.15 vs. limit=15.0 2023-10-03 10:08:29,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:08:33,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 10:08:34,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:08:35,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:08:37,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:08:39,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:08:41,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1229420.0, ans=0.0 2023-10-03 10:08:42,698 INFO [train.py:1046] (3/4) Epoch 35, batch 3800, loss[loss=0.1461, simple_loss=0.2388, pruned_loss=0.02669, over 24455.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.241, pruned_loss=0.04084, over 4706116.68 frames. ], batch size: 69, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:08:46,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:08:49,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:08:49,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 10:08:51,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 10:08:53,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:08:55,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:08:56,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 10:08:59,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 10:08:59,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:00,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:09:02,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:09:02,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:09:03,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:09:04,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 10:09:04,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1229486.6666666667, ans=0.1 2023-10-03 10:09:05,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1229486.6666666667, ans=0.0 2023-10-03 10:09:08,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 10:09:08,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:09:10,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:09:13,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:09:13,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:09:15,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 10:09:15,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:09:18,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:19,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1229553.3333333333, ans=0.1 2023-10-03 10:09:20,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:09:22,184 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.53 vs. limit=15.0 2023-10-03 10:09:24,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 10:09:24,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 10:09:27,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:09:34,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:09:39,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:09:40,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 10:09:42,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 10:09:42,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:09:45,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:09:45,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:46,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 10:09:49,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 10:09:49,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 10:09:49,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:09:51,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:09:53,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1229686.6666666667, ans=0.0 2023-10-03 10:09:57,216 INFO [train.py:1046] (3/4) Epoch 35, batch 3850, loss[loss=0.1352, simple_loss=0.1922, pruned_loss=0.0391, over 19412.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2397, pruned_loss=0.04052, over 4709178.31 frames. ], batch size: 388, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:09:57,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:09:58,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:10:03,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:10:03,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 10:10:04,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:10:06,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:10:10,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:10:12,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:10:13,554 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.51 vs. limit=22.5 2023-10-03 10:10:14,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:10:15,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 10:10:21,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1229820.0, ans=0.2 2023-10-03 10:10:22,239 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.944e+02 2.219e+02 2.451e+02 3.928e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-03 10:10:22,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:23,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:10:25,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:10:26,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:10:29,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:29,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:10:30,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:10:30,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:10:32,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:10:32,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:10:33,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:33,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:10:35,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 10:10:35,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 10:10:35,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:10:35,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:38,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:38,867 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.94 vs. limit=22.5 2023-10-03 10:10:39,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:39,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 10:10:42,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 10:10:42,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1229953.3333333333, ans=0.125 2023-10-03 10:10:43,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:46,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 10:10:47,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 10:10:53,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:54,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:10:58,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:10:58,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 10:11:01,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 10:11:04,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:04,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:06,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:11:06,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1230020.0, ans=0.0 2023-10-03 10:11:07,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:11:07,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:07,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:07,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:11:07,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 10:11:08,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:11:10,222 INFO [train.py:1046] (3/4) Epoch 35, batch 3900, loss[loss=0.146, simple_loss=0.2239, pruned_loss=0.03402, over 24304.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.238, pruned_loss=0.0398, over 4707548.44 frames. ], batch size: 56, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:11:10,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 10:11:10,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:10,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:11,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:11:13,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:13,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:11:13,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1230086.6666666667, ans=0.125 2023-10-03 10:11:14,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:11:14,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:11:15,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:11:15,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 10:11:15,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:19,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:11:19,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:11:19,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:11:22,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:11:23,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:11:23,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:25,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:11:27,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 10:11:27,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:11:29,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1230153.3333333333, ans=0.0 2023-10-03 10:11:30,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 10:11:30,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:11:32,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 10:11:34,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 10:11:38,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:11:40,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:11:40,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:11:41,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:11:45,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:11:46,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:11:49,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:11:49,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:11:51,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:11:55,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.08 vs. limit=6.0 2023-10-03 10:11:56,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:11:56,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:12:00,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:12:02,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:12:12,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:12:16,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:12:16,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 10:12:16,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 10:12:16,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:12:19,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 10:12:21,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:12:21,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 10:12:24,444 INFO [train.py:1046] (3/4) Epoch 35, batch 3950, loss[loss=0.1546, simple_loss=0.2475, pruned_loss=0.03086, over 24334.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2375, pruned_loss=0.0398, over 4707198.34 frames. ], batch size: 74, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:12:28,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:12:29,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 10:12:30,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:12:34,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:12:35,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:12:40,293 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 10:12:41,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:12:41,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 10:12:41,741 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 10:12:42,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:12:45,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:12:45,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:12:45,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:12:48,446 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.899e+02 2.029e+02 2.399e+02 3.247e+02, threshold=4.058e+02, percent-clipped=0.0 2023-10-03 10:12:48,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 10:12:49,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:12:50,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1230486.6666666667, ans=0.0 2023-10-03 10:12:51,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:12:51,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:12:52,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:12:53,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:12:54,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1230553.3333333333, ans=0.125 2023-10-03 10:12:56,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1230553.3333333333, ans=0.1 2023-10-03 10:13:03,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:13:03,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:13:04,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1230553.3333333333, ans=0.2 2023-10-03 10:13:09,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 10:13:11,173 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.42 vs. limit=15.0 2023-10-03 10:13:12,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 10:13:12,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 10:13:13,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:13:13,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:13:18,302 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.65 vs. limit=15.0 2023-10-03 10:13:21,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:13:22,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:13:22,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:13:23,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:13:23,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 10:13:27,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:13:28,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:13:32,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 10:13:37,215 INFO [train.py:1046] (3/4) Epoch 35, batch 4000, loss[loss=0.1667, simple_loss=0.237, pruned_loss=0.04822, over 23753.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2384, pruned_loss=0.04006, over 4712228.18 frames. ], batch size: 164, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:13:40,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:13:48,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:13:54,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:13:54,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:13:54,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:13:54,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 10:13:56,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:13:56,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 10:13:56,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:13:56,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 10:14:00,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:14:03,569 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.35 vs. limit=22.5 2023-10-03 10:14:04,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:14:04,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:14:04,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:14:05,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:14:05,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:14:06,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:14:10,245 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 10:14:10,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:14:11,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:14:11,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1230886.6666666667, ans=0.1 2023-10-03 10:14:13,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1230886.6666666667, ans=0.05 2023-10-03 10:14:14,451 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 10:14:15,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:14:15,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:14:22,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 10:14:22,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:14:25,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:14:25,730 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 10:14:27,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:14:27,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 10:14:27,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:14:28,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:14:30,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:14:32,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:14:32,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:14:32,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:14:32,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1230953.3333333333, ans=0.0 2023-10-03 10:14:35,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 10:14:35,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:14:36,824 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 10:14:41,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:14:44,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 10:14:47,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:14:47,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:14:47,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:14:48,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:14:51,154 INFO [train.py:1046] (3/4) Epoch 35, batch 4050, loss[loss=0.1501, simple_loss=0.2258, pruned_loss=0.03722, over 23701.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2393, pruned_loss=0.04017, over 4731113.89 frames. ], batch size: 149, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:14:53,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:14:54,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1231086.6666666667, ans=0.2 2023-10-03 10:14:55,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:14:56,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 10:14:58,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:14:58,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:15:00,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:15:01,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:15:01,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1231086.6666666667, ans=0.125 2023-10-03 10:15:03,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:15:05,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:15:08,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:15:09,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 10:15:12,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:15:12,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:15:14,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1231153.3333333333, ans=0.125 2023-10-03 10:15:17,182 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.812e+02 1.989e+02 2.164e+02 3.056e+02, threshold=3.978e+02, percent-clipped=0.0 2023-10-03 10:15:17,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:15:18,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:15:20,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 10:15:21,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 10:15:21,614 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 10:15:22,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:15:28,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.02 vs. limit=10.0 2023-10-03 10:15:29,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 10:15:31,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:15:33,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:15:34,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1231220.0, ans=0.0 2023-10-03 10:15:37,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:15:37,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:15:37,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:15:41,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:15:44,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 10:15:44,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 10:15:47,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:15:47,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 10:15:52,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:15:59,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 10:15:59,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:15:59,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:16:03,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 10:16:03,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 10:16:03,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:16:06,182 INFO [train.py:1046] (3/4) Epoch 35, batch 4100, loss[loss=0.1658, simple_loss=0.2514, pruned_loss=0.04008, over 23719.00 frames. ], tot_loss[loss=0.161, simple_loss=0.2403, pruned_loss=0.04079, over 4723416.92 frames. ], batch size: 85, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:16:06,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:16:08,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:09,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:16:16,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 10:16:17,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 10:16:19,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 10:16:19,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 10:16:19,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:16:20,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:20,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:20,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:16:21,948 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 10:16:24,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:16:26,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:16:26,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:16:26,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:16:30,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:16:32,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:16:33,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:16:33,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 10:16:34,106 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.46 vs. limit=22.5 2023-10-03 10:16:34,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:34,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:16:34,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:16:34,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:16:34,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 10:16:37,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:16:39,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 10:16:40,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:16:42,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:16:42,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 10:16:43,040 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.82 vs. limit=10.0 2023-10-03 10:16:43,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:16:45,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:16:45,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:16:46,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 10:16:47,395 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.18 vs. limit=22.5 2023-10-03 10:16:48,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:16:48,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1231620.0, ans=0.0 2023-10-03 10:16:50,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:16:52,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 10:16:52,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:16:52,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:16:55,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:17:00,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:17:02,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:17:03,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:17:04,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1231686.6666666667, ans=0.1 2023-10-03 10:17:08,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:17:09,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:17:12,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:17:15,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:17:18,754 INFO [train.py:1046] (3/4) Epoch 35, batch 4150, loss[loss=0.1488, simple_loss=0.2286, pruned_loss=0.03454, over 24418.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2403, pruned_loss=0.0407, over 4722619.15 frames. ], batch size: 58, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:17:20,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:17:21,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:17:22,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:17:22,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:17:25,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 10:17:26,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:17:26,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 10:17:28,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 10:17:28,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 10:17:29,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1231753.3333333333, ans=0.125 2023-10-03 10:17:30,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:17:34,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:17:34,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:17:39,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:17:39,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:17:40,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:17:40,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:17:42,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:17:43,363 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.871e+02 2.078e+02 2.359e+02 3.570e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-03 10:17:43,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:17:48,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:17:52,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:17:53,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 10:17:53,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1231886.6666666667, ans=0.125 2023-10-03 10:17:54,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 10:17:54,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:17:54,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1231886.6666666667, ans=0.1 2023-10-03 10:17:57,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 10:17:57,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:17:57,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:17:58,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:17:59,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:18:02,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1231953.3333333333, ans=0.125 2023-10-03 10:18:03,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 10:18:06,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:18:07,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:18:07,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 10:18:08,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:18:10,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 10:18:13,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:18:15,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:18:15,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1231953.3333333333, ans=0.1 2023-10-03 10:18:16,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:18:17,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 10:18:17,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:17,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 10:18:19,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:18:20,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 10:18:22,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:18:22,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:18:22,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:18:22,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 10:18:22,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1232020.0, ans=0.2 2023-10-03 10:18:23,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:18:23,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 10:18:24,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:18:26,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:18:27,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 10:18:27,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:18:32,330 INFO [train.py:1046] (3/4) Epoch 35, batch 4200, loss[loss=0.1527, simple_loss=0.2444, pruned_loss=0.03049, over 24460.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2394, pruned_loss=0.04037, over 4741781.00 frames. ], batch size: 69, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:18:32,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:18:33,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 10:18:35,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:18:37,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:18:39,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:18:39,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:18:39,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:18:41,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 10:18:44,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 10:18:44,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:48,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:18:50,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:18:53,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:18:55,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:18:55,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:55,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 10:18:55,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:18:57,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:18:57,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:18:57,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:18:59,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:19:00,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1232153.3333333333, ans=0.125 2023-10-03 10:19:01,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 10:19:01,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:19:02,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1232220.0, ans=0.125 2023-10-03 10:19:06,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 10:19:07,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:19:10,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:19:10,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:19:11,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:19:11,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 10:19:12,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:19:14,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:19:19,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:19:20,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:19:27,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:19:30,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 10:19:31,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:19:36,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 10:19:36,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:19:38,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 10:19:43,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:19:44,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1232353.3333333333, ans=0.0 2023-10-03 10:19:47,741 INFO [train.py:1046] (3/4) Epoch 35, batch 4250, loss[loss=0.1629, simple_loss=0.2486, pruned_loss=0.03862, over 24046.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2378, pruned_loss=0.03998, over 4726778.24 frames. ], batch size: 80, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:19:49,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:19:49,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:19:51,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.94 vs. limit=15.0 2023-10-03 10:19:51,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:19:56,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:19:57,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 10:19:57,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:19:58,201 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.43 vs. limit=22.5 2023-10-03 10:20:00,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:02,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1232486.6666666667, ans=0.125 2023-10-03 10:20:05,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:20:07,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1232486.6666666667, ans=0.0 2023-10-03 10:20:09,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:09,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:11,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:20:12,891 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.847e+02 2.096e+02 2.391e+02 3.957e+02, threshold=4.191e+02, percent-clipped=0.0 2023-10-03 10:20:12,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:20:13,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:14,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:14,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:17,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:20:19,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:20:19,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 10:20:23,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 10:20:23,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:24,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:20:24,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:20:25,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:20:25,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:25,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:20:27,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1232553.3333333333, ans=0.125 2023-10-03 10:20:30,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:20:31,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:20:33,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1232620.0, ans=0.1 2023-10-03 10:20:35,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:20:37,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:20:39,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 10:20:39,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:20:39,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 10:20:40,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:20:41,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:20:43,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:43,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:20:46,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 10:20:49,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:20:49,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:20:53,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:20:55,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:20:58,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:20:58,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:20:59,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:20:59,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:21:01,288 INFO [train.py:1046] (3/4) Epoch 35, batch 4300, loss[loss=0.1521, simple_loss=0.2319, pruned_loss=0.03617, over 23552.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2377, pruned_loss=0.03983, over 4729657.66 frames. ], batch size: 134, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:21:01,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:21:01,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 10:21:04,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:21:10,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:21:10,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:21:12,412 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.64 vs. limit=15.0 2023-10-03 10:21:13,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:21:19,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:21:19,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 10:21:21,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:21:22,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:21:23,151 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.74 vs. limit=15.0 2023-10-03 10:21:23,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:21:23,965 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 10:21:26,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:21:28,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:21:31,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 10:21:31,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:21:31,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 10:21:33,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 10:21:35,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:21:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:21:38,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:21:40,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:21:41,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:21:41,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1232886.6666666667, ans=0.2 2023-10-03 10:21:42,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:21:42,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 10:21:44,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 10:21:46,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:21:49,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:21:49,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 10:21:49,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1232953.3333333333, ans=0.0 2023-10-03 10:21:50,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:21:50,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:21:50,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 10:21:50,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 10:21:50,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 10:21:52,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:21:52,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 10:21:52,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 10:21:56,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:21:57,567 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 10:21:58,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:22:00,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:01,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:22:02,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.73 vs. limit=10.0 2023-10-03 10:22:03,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 10:22:05,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:22:05,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:06,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:22:06,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:22:07,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:22:11,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:22:12,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:13,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:13,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:22:15,225 INFO [train.py:1046] (3/4) Epoch 35, batch 4350, loss[loss=0.148, simple_loss=0.2272, pruned_loss=0.03444, over 24575.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2387, pruned_loss=0.03996, over 4736657.42 frames. ], batch size: 60, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:22:16,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.24 vs. limit=22.5 2023-10-03 10:22:18,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 10:22:19,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:22:24,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:22:25,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:27,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.72 vs. limit=15.0 2023-10-03 10:22:28,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:22:28,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:22:34,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:22:37,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:22:39,380 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:22:40,393 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.912e+02 2.047e+02 2.309e+02 3.251e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-03 10:22:40,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:22:40,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:22:43,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:22:45,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:22:46,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:22:51,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 10:22:51,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:22:52,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:57,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:22:59,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 10:23:03,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:23:04,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:23:08,740 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 10:23:12,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:23:12,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:23:12,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1233286.6666666667, ans=0.125 2023-10-03 10:23:13,491 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 10:23:13,561 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 10:23:13,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:23:13,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:23:14,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:23:16,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:23:17,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.44 vs. limit=15.0 2023-10-03 10:23:17,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:23:17,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:23:20,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 10:23:20,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:20,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:23:20,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:22,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 10:23:23,472 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 10:23:23,476 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 10:23:23,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 10:23:24,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:23:26,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:23:26,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:23:28,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:23:29,411 INFO [train.py:1046] (3/4) Epoch 35, batch 4400, loss[loss=0.169, simple_loss=0.2423, pruned_loss=0.04782, over 23736.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2396, pruned_loss=0.04038, over 4736561.65 frames. ], batch size: 164, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:23:29,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 10:23:30,960 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 10:23:30,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:36,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:23:36,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:38,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:23:39,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 10:23:39,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 10:23:39,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 10:23:40,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1233420.0, ans=0.125 2023-10-03 10:23:41,326 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 10:23:42,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:23:42,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:23:43,337 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.11 vs. limit=15.0 2023-10-03 10:23:45,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 10:23:48,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:23:48,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:23:48,747 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 10:23:50,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1233486.6666666667, ans=0.025 2023-10-03 10:23:51,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:23:51,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 10:23:52,960 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 10:23:56,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 10:23:56,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 10:23:56,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 10:23:56,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:23:57,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:23:59,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:24:01,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:24:02,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 10:24:02,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1233553.3333333333, ans=0.0 2023-10-03 10:24:03,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 10:24:03,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:24:05,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:24:05,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:24:06,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:24:06,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:24:06,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 10:24:08,014 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 10:24:11,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:24:16,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1233620.0, ans=0.125 2023-10-03 10:24:18,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:24:21,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 10:24:21,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1233620.0, ans=0.125 2023-10-03 10:24:25,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:24:27,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:24:30,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:24:31,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 10:24:31,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:24:31,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:24:31,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:24:31,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:24:36,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 10:24:38,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 10:24:39,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 10:24:39,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:24:39,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 10:24:41,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:24:44,095 INFO [train.py:1046] (3/4) Epoch 35, batch 4450, loss[loss=0.2129, simple_loss=0.2818, pruned_loss=0.07201, over 19777.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2407, pruned_loss=0.04117, over 4722452.36 frames. ], batch size: 388, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:24:46,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:24:46,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1233753.3333333333, ans=0.1 2023-10-03 10:24:47,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 10:24:50,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:24:53,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:24:53,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:24:56,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:24:56,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:24:59,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:01,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:25:04,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:25:05,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:25:06,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 10:25:06,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:25:07,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:07,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:25:07,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:25:10,452 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.919e+02 2.096e+02 2.482e+02 3.695e+02, threshold=4.192e+02, percent-clipped=0.0 2023-10-03 10:25:10,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:25:15,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:15,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:16,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:25:16,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:25:18,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:25:19,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1233886.6666666667, ans=0.0 2023-10-03 10:25:19,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1233886.6666666667, ans=0.0 2023-10-03 10:25:24,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 10:25:25,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 10:25:25,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 10:25:25,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:25:27,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:25:28,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 10:25:31,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:25:35,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:35,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 10:25:35,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:35,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:25:35,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:25:36,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:25:37,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:25:41,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:25:41,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1234020.0, ans=0.125 2023-10-03 10:25:43,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 10:25:44,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:25:46,655 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-03 10:25:47,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:25:47,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:25:49,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:25:49,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 10:25:51,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:25:53,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 10:25:55,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:25:57,834 INFO [train.py:1046] (3/4) Epoch 35, batch 4500, loss[loss=0.1508, simple_loss=0.2211, pruned_loss=0.0403, over 23615.00 frames. ], tot_loss[loss=0.1618, simple_loss=0.2406, pruned_loss=0.04152, over 4702360.98 frames. ], batch size: 256, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:25:59,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:26:00,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 10:26:00,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 10:26:01,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:26:05,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:26:05,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:26:06,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:26:08,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:26:08,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:26:08,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:26:21,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:26:21,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:26:23,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:26:24,104 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.55 vs. limit=15.0 2023-10-03 10:26:26,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:26:27,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:26:32,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:26:35,110 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.24 vs. limit=6.0 2023-10-03 10:26:37,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:26:42,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:26:45,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:26:45,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 10:26:45,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:26:47,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:26:49,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:26:49,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:26:52,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:26:52,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 10:26:52,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:26:52,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:26:56,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.68 vs. limit=12.0 2023-10-03 10:26:58,325 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=3.86 vs. limit=5.0 2023-10-03 10:26:58,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:26:58,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:27:00,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:02,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:27:04,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:27:04,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 10:27:06,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 10:27:06,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 10:27:09,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 10:27:12,139 INFO [train.py:1046] (3/4) Epoch 35, batch 4550, loss[loss=0.1694, simple_loss=0.2315, pruned_loss=0.05364, over 23652.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2391, pruned_loss=0.04122, over 4698659.78 frames. ], batch size: 232, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:27:12,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 10:27:12,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:27:13,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1234420.0, ans=0.125 2023-10-03 10:27:16,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:27:17,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:27:19,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:27:23,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:27:25,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:27:27,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:27:27,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:27:27,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:29,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:27:31,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:27:31,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1234486.6666666667, ans=0.0 2023-10-03 10:27:34,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:27:37,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 10:27:38,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 10:27:38,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:27:39,285 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.927e+02 2.055e+02 2.353e+02 3.694e+02, threshold=4.110e+02, percent-clipped=0.0 2023-10-03 10:27:39,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 10:27:42,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 10:27:43,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:27:45,724 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.27 vs. limit=6.0 2023-10-03 10:27:46,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.40 vs. limit=15.0 2023-10-03 10:27:47,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 10:27:49,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:27:53,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:53,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:27:53,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:27:55,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1234620.0, ans=0.07 2023-10-03 10:27:56,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 10:27:59,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:28:01,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:28:01,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:28:02,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:28:04,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 10:28:04,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 10:28:05,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:28:07,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 10:28:08,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1234620.0, ans=0.125 2023-10-03 10:28:10,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 10:28:10,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:28:10,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:10,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:28:11,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:28:11,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:28:13,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:28:14,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 10:28:17,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:28:17,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 10:28:17,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 10:28:17,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:28:18,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 10:28:21,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:28:21,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:28:23,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:28:23,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:28:24,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:28:26,314 INFO [train.py:1046] (3/4) Epoch 35, batch 4600, loss[loss=0.1673, simple_loss=0.2381, pruned_loss=0.04823, over 23855.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2382, pruned_loss=0.04074, over 4691041.74 frames. ], batch size: 179, lr: 2.90e-03, grad_scale: 16.0 2023-10-03 10:28:26,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:28:26,561 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:28:29,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:28:30,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:32,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:28:35,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:28:35,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:28:36,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:28:37,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 10:28:38,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1234753.3333333333, ans=0.125 2023-10-03 10:28:39,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:28:43,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:28:44,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:28:46,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:51,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 10:28:53,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:56,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:28:58,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:28:58,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:29:03,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 10:29:03,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:29:03,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:29:09,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:09,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:29:11,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:29:14,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 10:29:15,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:29:17,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1234953.3333333333, ans=0.125 2023-10-03 10:29:21,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:21,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:29:24,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:24,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 10:29:24,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:25,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 10:29:25,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:27,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:29:28,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:29:28,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:29:30,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:29:30,296 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:29:31,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 10:29:32,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 10:29:32,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 10:29:32,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:29:34,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:29:35,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:29:36,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:29:39,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1235086.6666666667, ans=0.07 2023-10-03 10:29:40,763 INFO [train.py:1046] (3/4) Epoch 35, batch 4650, loss[loss=0.1574, simple_loss=0.2466, pruned_loss=0.0341, over 24591.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2379, pruned_loss=0.04017, over 4701404.45 frames. ], batch size: 71, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:29:45,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:29:47,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1235086.6666666667, ans=0.0 2023-10-03 10:29:48,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:29:48,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:49,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:29:49,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:29:49,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:29:51,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:29:55,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 10:29:58,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:30:01,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 10:30:01,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:30:02,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 10:30:02,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:30:02,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1235153.3333333333, ans=0.125 2023-10-03 10:30:03,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 10:30:03,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 10:30:03,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:04,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:30:04,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1235153.3333333333, ans=0.0 2023-10-03 10:30:07,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:30:08,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:08,843 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 10:30:10,230 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.852e+02 2.049e+02 2.338e+02 3.401e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-03 10:30:13,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:14,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 10:30:18,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:18,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:30:18,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 10:30:19,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:30:22,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:30:27,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:30:31,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:32,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:33,000 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:30:34,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:30:34,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:30:36,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 10:30:36,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 10:30:38,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 10:30:38,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 10:30:39,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.57 vs. limit=15.0 2023-10-03 10:30:40,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:30:46,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:30:46,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:30:46,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 10:30:46,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:30:47,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:30:47,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:30:49,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:30:50,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1235353.3333333333, ans=0.125 2023-10-03 10:30:52,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:30:52,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:30:52,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:30:55,119 INFO [train.py:1046] (3/4) Epoch 35, batch 4700, loss[loss=0.1534, simple_loss=0.239, pruned_loss=0.03386, over 24518.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2389, pruned_loss=0.04032, over 4706013.44 frames. ], batch size: 66, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:30:57,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:30:58,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:30:58,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:30:58,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 10:30:59,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:31:01,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 10:31:08,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:10,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:31:10,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:31:11,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:31:13,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 10:31:16,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 10:31:17,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 10:31:19,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:21,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:31:21,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:31:24,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:31,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:31:32,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 10:31:34,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:31:40,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 10:31:41,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:31:44,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:31:46,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 10:31:47,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:31:51,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:31:52,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 10:31:53,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:31:53,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:31:57,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:31:58,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:31:58,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 10:31:58,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1235686.6666666667, ans=0.125 2023-10-03 10:31:59,912 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 10:32:01,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:32:01,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:32:01,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:32:01,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 10:32:02,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:32:03,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1235686.6666666667, ans=0.125 2023-10-03 10:32:07,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 10:32:10,143 INFO [train.py:1046] (3/4) Epoch 35, batch 4750, loss[loss=0.1675, simple_loss=0.2471, pruned_loss=0.04398, over 23718.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.24, pruned_loss=0.04037, over 4716606.53 frames. ], batch size: 85, lr: 2.90e-03, grad_scale: 8.0 2023-10-03 10:32:10,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:32:10,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:32:13,882 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.70 vs. limit=15.0 2023-10-03 10:32:15,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:32:15,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:32:19,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 10:32:19,840 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.44 vs. limit=15.0 2023-10-03 10:32:20,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:32:23,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 10:32:25,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:32:25,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:32:25,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:32:30,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 10:32:32,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1235820.0, ans=0.0 2023-10-03 10:32:35,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:32:36,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 10:32:37,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:32:38,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1235886.6666666667, ans=0.125 2023-10-03 10:32:39,854 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.852e+02 2.146e+02 2.471e+02 3.124e+02, threshold=4.292e+02, percent-clipped=0.0 2023-10-03 10:32:41,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:32:41,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:32:41,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:32:44,142 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 10:32:44,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 10:32:50,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 10:32:51,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:32:53,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:32:55,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:32:55,957 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 10:32:55,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:32:59,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:33:02,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:33:03,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 10:33:03,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 10:33:05,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:33:05,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:33:06,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:33:06,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:33:06,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 10:33:09,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 10:33:12,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:15,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:33:15,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 10:33:15,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:33:17,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:18,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:33:18,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:33:20,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 10:33:21,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:33:21,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 10:33:24,456 INFO [train.py:1046] (3/4) Epoch 35, batch 4800, loss[loss=0.1715, simple_loss=0.2428, pruned_loss=0.05016, over 23940.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2405, pruned_loss=0.04047, over 4710111.01 frames. ], batch size: 195, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:33:24,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 10:33:24,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 10:33:27,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:33:27,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:33:29,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 10:33:34,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:33:34,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:38,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1236153.3333333333, ans=0.125 2023-10-03 10:33:39,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:33:39,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:33:39,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:33:41,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 10:33:41,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:33:41,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:33:42,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:33:47,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:33:49,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1236153.3333333333, ans=0.125 2023-10-03 10:33:50,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:50,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:33:50,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:51,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 10:33:51,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:33:53,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:33:54,012 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.04 vs. limit=15.0 2023-10-03 10:33:55,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:33:59,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:34:00,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:34:00,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:34:00,935 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.01 vs. limit=15.0 2023-10-03 10:34:03,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 10:34:04,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:05,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 10:34:07,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 10:34:07,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:08,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:34:08,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:34:08,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:34:08,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:34:10,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:34:11,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:34:14,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:34:17,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:17,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:34:22,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 10:34:23,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:34:23,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:25,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:34:25,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:29,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:34:29,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:34:29,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:31,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:34:31,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:34:32,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:34:33,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1236353.3333333333, ans=0.125 2023-10-03 10:34:34,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1236353.3333333333, ans=0.1 2023-10-03 10:34:35,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:34:35,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:35,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:34:37,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 10:34:38,445 INFO [train.py:1046] (3/4) Epoch 35, batch 4850, loss[loss=0.1605, simple_loss=0.2391, pruned_loss=0.04098, over 23658.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2404, pruned_loss=0.04046, over 4705014.39 frames. ], batch size: 149, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:34:38,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1236420.0, ans=0.0 2023-10-03 10:34:40,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 10:34:40,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:34:40,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:34:41,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:34:41,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:42,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:34:51,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 10:34:53,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:34:56,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:34:57,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:34:57,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:34:59,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-10-03 10:35:01,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:35:02,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:35:03,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:35:03,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 10:35:04,403 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.04 vs. limit=15.0 2023-10-03 10:35:06,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:35:07,879 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.944e+02 2.127e+02 2.497e+02 3.827e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-03 10:35:08,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:35:09,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 10:35:09,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:35:09,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 10:35:12,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:35:12,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:12,888 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.63 vs. limit=6.0 2023-10-03 10:35:18,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:18,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 10:35:18,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 10:35:18,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1236553.3333333333, ans=0.125 2023-10-03 10:35:19,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:35:27,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:35:27,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 10:35:28,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:35:28,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:35:32,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:35:33,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 10:35:33,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:33,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 10:35:33,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:35:35,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:35:36,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 10:35:38,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1236686.6666666667, ans=0.125 2023-10-03 10:35:44,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:35:49,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:35:49,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:35:52,514 INFO [train.py:1046] (3/4) Epoch 35, batch 4900, loss[loss=0.167, simple_loss=0.2522, pruned_loss=0.04088, over 24376.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2396, pruned_loss=0.04086, over 4689273.60 frames. ], batch size: 77, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:35:55,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 10:35:55,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:35:58,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:35:59,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:35:59,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:36:03,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 10:36:06,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 10:36:11,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 10:36:12,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 10:36:12,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:36:12,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:36:13,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:36:13,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:36:13,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:36:14,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 10:36:17,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 10:36:19,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:36:19,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:36:21,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:36:21,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1236886.6666666667, ans=0.125 2023-10-03 10:36:22,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:36:23,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:36:25,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:36:25,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 10:36:26,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1236886.6666666667, ans=0.125 2023-10-03 10:36:27,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:36:29,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:36:30,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 10:36:30,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 10:36:33,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 10:36:36,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:36:36,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:36:36,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:36:37,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:36:38,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 10:36:38,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:36:38,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 10:36:40,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:36:42,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:36:44,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:36:47,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 10:36:47,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:36:48,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 10:36:49,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 10:36:56,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:36:57,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:36:58,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1237020.0, ans=0.0 2023-10-03 10:36:59,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 10:36:59,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:36:59,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:37:01,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:37:01,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1237020.0, ans=0.125 2023-10-03 10:37:05,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:37:05,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:37:06,716 INFO [train.py:1046] (3/4) Epoch 35, batch 4950, loss[loss=0.145, simple_loss=0.2221, pruned_loss=0.03402, over 21128.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2379, pruned_loss=0.0403, over 4698282.62 frames. ], batch size: 46, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:37:06,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:37:06,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 10:37:07,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1237086.6666666667, ans=0.0 2023-10-03 10:37:08,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:37:10,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:37:11,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 10:37:13,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 10:37:13,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 10:37:13,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:37:13,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 10:37:15,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:15,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:37:15,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:37:15,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:18,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:37:19,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:37:20,356 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:37:21,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:37:22,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:37:24,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:24,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:37:28,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:37:33,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:34,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:37:36,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:36,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:37,597 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.912e+02 2.157e+02 2.432e+02 3.456e+02, threshold=4.313e+02, percent-clipped=0.0 2023-10-03 10:37:37,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:37:39,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 10:37:40,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 10:37:43,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:44,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:37:44,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:37:45,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:37:45,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:37:47,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 10:37:49,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:37:51,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:37:53,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:37:55,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:37:56,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:37:56,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 10:37:56,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:37:59,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:38:02,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:38:04,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:38:04,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:38:05,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:38:05,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:38:05,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:38:08,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:38:08,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:38:08,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:38:08,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1237353.3333333333, ans=0.0 2023-10-03 10:38:11,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 10:38:11,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1237353.3333333333, ans=0.1 2023-10-03 10:38:15,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:38:19,248 INFO [train.py:1046] (3/4) Epoch 35, batch 5000, loss[loss=0.143, simple_loss=0.2273, pruned_loss=0.0294, over 24487.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2379, pruned_loss=0.04004, over 4703582.07 frames. ], batch size: 63, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:38:19,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 10:38:19,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:38:19,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1237420.0, ans=0.0 2023-10-03 10:38:19,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1237420.0, ans=0.1 2023-10-03 10:38:24,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:38:25,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:38:27,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 10:38:28,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 10:38:30,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:38:32,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1237420.0, ans=0.2 2023-10-03 10:38:33,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 10:38:33,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:38:33,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:38:34,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 10:38:36,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:38:36,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:38:36,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 10:38:36,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:38:37,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:38:37,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 10:38:39,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 10:38:39,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:38:39,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 10:38:39,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:38:40,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:38:40,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:38:40,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 10:38:40,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 10:38:43,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 10:38:43,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:38:43,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1237486.6666666667, ans=0.0 2023-10-03 10:38:43,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1237486.6666666667, ans=0.1 2023-10-03 10:38:43,986 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.37 vs. limit=15.0 2023-10-03 10:38:44,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:38:46,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 10:38:46,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:38:46,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.09 vs. limit=22.5 2023-10-03 10:38:47,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:38:49,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:38:52,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 10:38:53,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 10:38:53,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:38:54,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:38:59,636 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 10:38:59,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1237553.3333333333, ans=0.015 2023-10-03 10:39:01,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:39:02,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:39:02,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:05,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 10:39:05,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:39:07,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:39:07,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:39:10,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 10:39:10,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:39:14,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:39:14,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:39:20,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 10:39:22,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1237686.6666666667, ans=0.1 2023-10-03 10:39:23,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:33,081 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.52 vs. limit=15.0 2023-10-03 10:39:33,456 INFO [train.py:1046] (3/4) Epoch 35, batch 5050, loss[loss=0.1457, simple_loss=0.2187, pruned_loss=0.03632, over 24332.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.238, pruned_loss=0.04034, over 4702077.44 frames. ], batch size: 56, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:39:33,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:39:34,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:34,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:39:34,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:39:36,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:39:36,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:39:36,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:41,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:39:41,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 10:39:42,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:39:43,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:39:45,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:39:45,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 10:39:46,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:39:46,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:39:48,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:39:49,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:39:49,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:39:58,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 10:39:58,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:40:00,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:40:00,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 10:40:01,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:40:01,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:40:01,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:40:03,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:40:03,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 10:40:04,412 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.891e+02 1.998e+02 2.212e+02 3.004e+02, threshold=3.997e+02, percent-clipped=0.0 2023-10-03 10:40:04,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 10:40:06,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:40:08,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:40:08,704 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:40:11,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:40:11,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 10:40:13,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:40:16,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 10:40:16,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:40:17,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:40:18,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:40:19,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:40:21,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.29 vs. limit=22.5 2023-10-03 10:40:22,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:40:24,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:40:24,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:25,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:40:25,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:40:25,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 10:40:26,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:40:28,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:40:31,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:40:31,760 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 10:40:31,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:40:31,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:40:33,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:33,229 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 10:40:37,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:40:37,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 10:40:37,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:39,897 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.98 vs. limit=10.0 2023-10-03 10:40:42,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:40:42,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:40:42,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 10:40:45,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 10:40:46,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:40:46,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:40:46,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:40:48,035 INFO [train.py:1046] (3/4) Epoch 35, batch 5100, loss[loss=0.1688, simple_loss=0.2413, pruned_loss=0.04817, over 23436.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2395, pruned_loss=0.04074, over 4709631.25 frames. ], batch size: 285, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:40:49,609 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 10:40:51,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1238086.6666666667, ans=0.1 2023-10-03 10:40:52,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:40:55,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 10:40:55,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 10:40:57,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:41:00,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:41:01,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:41:01,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1238153.3333333333, ans=0.125 2023-10-03 10:41:03,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 10:41:03,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 10:41:06,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1238153.3333333333, ans=0.125 2023-10-03 10:41:09,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:41:10,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:41:13,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:41:13,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1238153.3333333333, ans=0.0 2023-10-03 10:41:15,912 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.53 vs. limit=12.0 2023-10-03 10:41:17,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 10:41:17,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:41:20,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:41:20,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 10:41:21,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:22,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:23,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 10:41:23,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1238220.0, ans=0.125 2023-10-03 10:41:24,751 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 10:41:26,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:26,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 10:41:26,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 10:41:29,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:41:38,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:41:41,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 10:41:41,426 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 10:41:41,439 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 10:41:44,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 10:41:44,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:41:47,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 10:41:51,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 10:41:54,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 10:41:54,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:41:55,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 10:41:57,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 10:41:58,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 10:42:01,735 INFO [train.py:1046] (3/4) Epoch 35, batch 5150, loss[loss=0.1896, simple_loss=0.2597, pruned_loss=0.0598, over 19316.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2403, pruned_loss=0.04067, over 4725201.38 frames. ], batch size: 388, lr: 2.89e-03, grad_scale: 8.0 2023-10-03 10:42:01,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:42:01,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:42:01,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:42:03,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:42:04,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:42:04,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:42:06,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 10:42:06,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 10:42:06,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1238420.0, ans=0.125 2023-10-03 10:42:07,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 10:42:07,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:42:07,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 10:42:08,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:42:10,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 10:42:12,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:42:13,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:42:17,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:42:17,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 10:42:18,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:42:19,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:42:21,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 10:42:21,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:42:22,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:42:22,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:42:22,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:42:23,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 10:42:23,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:42:25,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:42:26,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:42:27,241 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.94 vs. limit=22.5 2023-10-03 10:42:28,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 10:42:29,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:42:33,236 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.014e+02 2.285e+02 2.770e+02 4.713e+02, threshold=4.570e+02, percent-clipped=3.0 2023-10-03 10:42:35,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:42:37,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 10:42:40,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:42:45,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:42:45,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:42:50,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:42:52,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:42:54,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 10:42:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:42:59,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:42:59,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 10:43:02,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:43:04,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:43:04,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 10:43:10,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:43:10,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:43:13,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:43:13,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:43:13,933 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.78 vs. limit=15.0 2023-10-03 10:43:14,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:43:14,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:43:14,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:43:16,102 INFO [train.py:1046] (3/4) Epoch 35, batch 5200, loss[loss=0.1517, simple_loss=0.2285, pruned_loss=0.03746, over 24299.00 frames. ], tot_loss[loss=0.162, simple_loss=0.2412, pruned_loss=0.04138, over 4698040.91 frames. ], batch size: 56, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:43:16,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:43:19,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:43:22,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:43:23,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:43:26,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 10:43:27,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:43:29,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:43:30,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1238820.0, ans=0.125 2023-10-03 10:43:31,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:43:31,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:43:31,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:43:32,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 10:43:35,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:43:35,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:43:38,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1238820.0, ans=0.1 2023-10-03 10:43:39,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 10:43:42,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:43:42,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:43:44,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 10:43:44,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1238886.6666666667, ans=0.125 2023-10-03 10:43:45,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 10:43:47,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 10:43:47,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:43:49,134 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 10:43:49,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:43:50,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:43:50,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:43:51,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 10:43:52,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:43:54,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:43:57,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 10:43:57,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 10:43:57,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 10:44:03,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 10:44:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:44:10,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:44:10,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:44:11,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1238953.3333333333, ans=0.1 2023-10-03 10:44:12,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 10:44:13,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:44:13,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 10:44:13,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:44:13,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:44:14,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1239020.0, ans=0.1 2023-10-03 10:44:15,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:44:17,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:44:21,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:44:21,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:44:21,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:44:24,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:44:26,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 10:44:27,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:44:27,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:44:27,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1239020.0, ans=0.0 2023-10-03 10:44:28,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:44:29,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:44:30,844 INFO [train.py:1046] (3/4) Epoch 35, batch 5250, loss[loss=0.1675, simple_loss=0.2516, pruned_loss=0.04167, over 23479.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.24, pruned_loss=0.04112, over 4695128.89 frames. ], batch size: 134, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:44:30,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:44:34,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:44:36,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:44:36,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:44:39,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:44:43,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:44:46,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:44:48,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:44:48,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:44:51,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 10:44:51,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:44:51,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:45:00,735 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.950e+02 2.167e+02 2.359e+02 3.354e+02, threshold=4.333e+02, percent-clipped=0.0 2023-10-03 10:45:13,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.85 vs. limit=15.0 2023-10-03 10:45:15,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1239286.6666666667, ans=0.125 2023-10-03 10:45:33,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1239353.3333333333, ans=0.0 2023-10-03 10:45:39,410 INFO [train.py:1046] (3/4) Epoch 35, batch 5300, loss[loss=0.1323, simple_loss=0.1853, pruned_loss=0.03962, over 18901.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2395, pruned_loss=0.04082, over 4684118.58 frames. ], batch size: 388, lr: 2.89e-03, grad_scale: 16.0 2023-10-03 10:45:53,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:45:53,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 10:45:53,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 10:45:53,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:45:54,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:54,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:54,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:54,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:45:54,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:45:54,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:45:54,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 10:45:55,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:45:55,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 10:45:55,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 10:45:55,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 10:45:55,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 10:45:55,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 10:45:55,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 10:45:55,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:55,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:45:55,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:45:55,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:45:56,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:45:56,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:45:56,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:45:56,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:56,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:45:56,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:45:56,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:45:56,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:56,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:45:57,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 10:45:57,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:45:57,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:45:57,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 10:45:57,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 10:45:57,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:45:57,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:45:57,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 10:45:58,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 10:45:58,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:45:58,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:45:58,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:45:58,798 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 10:45:58,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 10:45:58,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:45:58,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:45:59,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 10:45:59,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 10:45:59,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 10:45:59,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:46:05,607 INFO [train.py:1046] (3/4) Epoch 36, batch 0, loss[loss=0.1513, simple_loss=0.2305, pruned_loss=0.036, over 23696.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2305, pruned_loss=0.036, over 23696.00 frames. ], batch size: 232, lr: 2.85e-03, grad_scale: 32.0 2023-10-03 10:46:05,607 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 10:46:17,646 INFO [train.py:1078] (3/4) Epoch 36, validation: loss=0.3188, simple_loss=0.2685, pruned_loss=0.1846, over 1125622.00 frames. 2023-10-03 10:46:17,646 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 10:46:20,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 10:46:20,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:46:23,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:46:26,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:46:26,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:46:26,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:28,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 10:46:29,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 10:46:30,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:32,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:33,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1239566.6666666667, ans=0.1 2023-10-03 10:46:36,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:46:37,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:46:37,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:46:37,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:46:39,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1239566.6666666667, ans=0.125 2023-10-03 10:46:39,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1239566.6666666667, ans=0.125 2023-10-03 10:46:40,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 10:46:41,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:46:48,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:46:48,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:46:50,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 10:46:54,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:46:54,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:46:57,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:00,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1239700.0, ans=0.125 2023-10-03 10:47:02,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:47:07,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:11,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 10:47:14,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 10:47:16,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:47:16,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:47:18,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:47:18,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:47:21,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 10:47:22,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:47:24,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:47:27,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:47:31,227 INFO [train.py:1046] (3/4) Epoch 36, batch 50, loss[loss=0.1661, simple_loss=0.2537, pruned_loss=0.03924, over 24467.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2414, pruned_loss=0.03948, over 1075613.94 frames. ], batch size: 69, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:47:31,331 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 10:47:32,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:47:36,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:47:37,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:47:37,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 10:47:38,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 10:47:38,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:47:40,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:47:40,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:47:41,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:47:45,661 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.883e+02 2.047e+02 2.460e+02 5.185e+02, threshold=4.094e+02, percent-clipped=4.0 2023-10-03 10:47:45,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 10:47:45,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:47:52,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1239900.0, ans=0.0 2023-10-03 10:47:53,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 10:47:55,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 10:47:56,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:47:58,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:47:58,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:47:59,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:47:59,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:48:00,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 10:48:00,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:48:01,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1239966.6666666667, ans=0.0 2023-10-03 10:48:01,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1239966.6666666667, ans=0.125 2023-10-03 10:48:05,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=1239966.6666666667, ans=0.05 2023-10-03 10:48:08,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:48:10,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:48:10,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:48:11,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 10:48:12,191 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.63 vs. limit=15.0 2023-10-03 10:48:12,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 10:48:14,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:48:14,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 10:48:14,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:48:16,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 10:48:25,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:48:25,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:48:27,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:48:28,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:48:28,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:48:29,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 10:48:29,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 10:48:31,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:48:32,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:48:34,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:48:34,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:48:34,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 10:48:35,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 10:48:36,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1240100.0, ans=0.125 2023-10-03 10:48:37,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 10:48:38,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:48:38,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:48:39,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 10:48:39,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 10:48:42,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:48:42,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:48:44,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:48:44,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:48:45,321 INFO [train.py:1046] (3/4) Epoch 36, batch 100, loss[loss=0.1465, simple_loss=0.2226, pruned_loss=0.03514, over 23682.00 frames. ], tot_loss[loss=0.1623, simple_loss=0.2424, pruned_loss=0.04106, over 1876822.61 frames. ], batch size: 149, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:48:45,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:48:48,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:48:51,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:48:52,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 10:48:52,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:48:57,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 10:48:57,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:48:57,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:48:57,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:48:58,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:49:00,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 10:49:02,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:49:02,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:49:02,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:04,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:49:07,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 10:49:07,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:49:07,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1240233.3333333333, ans=0.1 2023-10-03 10:49:09,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:10,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:49:13,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:49:17,209 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 10:49:17,235 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 10:49:18,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:49:18,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:49:21,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 10:49:23,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:49:25,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:25,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1240300.0, ans=0.125 2023-10-03 10:49:26,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1240300.0, ans=0.0 2023-10-03 10:49:30,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:32,635 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 10:49:34,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 10:49:36,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:49:38,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:49:40,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:40,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1240366.6666666667, ans=0.125 2023-10-03 10:49:42,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:49:45,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:49:45,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:49:48,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:48,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:49,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:49:49,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:49:51,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:49:51,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 10:49:51,788 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 10:49:51,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:49:53,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:49:54,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:49:54,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:49:54,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 10:49:54,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:49:54,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 10:49:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:49:54,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:49:56,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:49:57,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:49:57,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:49:59,054 INFO [train.py:1046] (3/4) Epoch 36, batch 150, loss[loss=0.1807, simple_loss=0.2484, pruned_loss=0.05651, over 23798.00 frames. ], tot_loss[loss=0.1619, simple_loss=0.2419, pruned_loss=0.04091, over 2505393.74 frames. ], batch size: 164, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:49:59,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:03,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:50:03,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:50:03,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:08,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:50:08,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:08,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1240500.0, ans=0.07 2023-10-03 10:50:10,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1240500.0, ans=0.0 2023-10-03 10:50:11,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:50:11,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=1240500.0, ans=15.0 2023-10-03 10:50:12,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:13,949 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.846e+02 1.951e+02 2.154e+02 3.020e+02, threshold=3.902e+02, percent-clipped=0.0 2023-10-03 10:50:14,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1240566.6666666667, ans=0.125 2023-10-03 10:50:15,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 10:50:15,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 10:50:15,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 10:50:18,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:50:18,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:50:19,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:50:19,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:50:19,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:50:19,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:21,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:50:23,103 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 10:50:24,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:50:30,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:50:36,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 10:50:36,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 10:50:41,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:50:41,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:50:41,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:50:42,155 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.47 vs. limit=22.5 2023-10-03 10:50:42,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:50:45,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:50:45,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:50:45,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1240700.0, ans=0.0 2023-10-03 10:50:46,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:46,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 10:50:52,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:53,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:50:53,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:50:55,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:50:56,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:50:58,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 10:51:01,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 10:51:01,774 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.57 vs. limit=22.5 2023-10-03 10:51:02,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:51:03,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:51:04,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:51:04,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 10:51:04,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:51:04,364 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 10:51:09,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1240766.6666666667, ans=0.125 2023-10-03 10:51:10,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:51:12,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:51:12,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:51:13,584 INFO [train.py:1046] (3/4) Epoch 36, batch 200, loss[loss=0.1705, simple_loss=0.233, pruned_loss=0.05404, over 23765.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2427, pruned_loss=0.04155, over 2999791.55 frames. ], batch size: 179, lr: 2.85e-03, grad_scale: 8.0 2023-10-03 10:51:14,397 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.47 vs. limit=22.5 2023-10-03 10:51:16,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 10:51:16,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:51:16,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:19,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 10:51:20,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 10:51:20,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:20,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1240833.3333333333, ans=0.125 2023-10-03 10:51:21,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:51:26,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:51:26,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:51:26,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:47,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:51:47,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:51:48,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 10:51:49,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:51:49,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 10:51:49,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:51:51,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:51:52,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:51:52,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:51:52,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:51:52,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 10:51:53,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 10:51:54,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:51:54,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1240966.6666666667, ans=0.0 2023-10-03 10:51:56,457 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.21 vs. limit=15.0 2023-10-03 10:51:58,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:52:02,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:52:08,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1241033.3333333333, ans=0.125 2023-10-03 10:52:11,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:11,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:52:16,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:19,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 10:52:19,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:52:19,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:52:21,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:52:22,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 10:52:23,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 10:52:24,460 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.11 vs. limit=15.0 2023-10-03 10:52:25,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:52:25,423 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 10:52:26,686 INFO [train.py:1046] (3/4) Epoch 36, batch 250, loss[loss=0.1715, simple_loss=0.2568, pruned_loss=0.04311, over 23423.00 frames. ], tot_loss[loss=0.1622, simple_loss=0.2414, pruned_loss=0.04156, over 3362830.84 frames. ], batch size: 93, lr: 2.85e-03, grad_scale: 4.0 2023-10-03 10:52:28,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:28,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1241166.6666666667, ans=0.0 2023-10-03 10:52:31,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:52:32,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:32,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:52:34,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:52:34,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:52:36,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:52:39,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:52:45,132 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.828e+02 1.971e+02 2.158e+02 2.749e+02, threshold=3.942e+02, percent-clipped=0.0 2023-10-03 10:52:48,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:52:50,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:52:50,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:52:52,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1241233.3333333333, ans=0.125 2023-10-03 10:52:53,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1241233.3333333333, ans=0.1 2023-10-03 10:52:56,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 10:52:57,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 10:52:58,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:52:59,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:52:59,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 10:52:59,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:53:01,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:53:01,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.56 vs. limit=6.0 2023-10-03 10:53:03,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:53:06,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 10:53:06,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:53:08,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:53:08,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 10:53:08,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:53:10,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:53:12,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:53:12,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 10:53:15,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:53:15,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 10:53:16,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:53:20,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 10:53:23,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:53:26,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 10:53:31,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:53:32,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:53:35,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 10:53:35,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:53:35,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 10:53:38,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 10:53:38,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 10:53:41,196 INFO [train.py:1046] (3/4) Epoch 36, batch 300, loss[loss=0.1492, simple_loss=0.2225, pruned_loss=0.03795, over 23657.00 frames. ], tot_loss[loss=0.1605, simple_loss=0.2395, pruned_loss=0.04075, over 3661181.45 frames. ], batch size: 149, lr: 2.85e-03, grad_scale: 8.0 2023-10-03 10:53:41,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:53:41,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 10:53:45,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:53:47,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:53:49,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=9.66 vs. limit=22.5 2023-10-03 10:53:50,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 10:53:50,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 10:53:51,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:53:52,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 10:53:53,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 10:53:53,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:53:58,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 10:54:04,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:54:04,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 10:54:06,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 10:54:06,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:09,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:54:09,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:09,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 10:54:09,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 10:54:12,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:54:14,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:54:15,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:54:20,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 10:54:20,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 10:54:21,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:54:24,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:24,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 10:54:25,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:54:29,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:54:32,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:54:32,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 10:54:36,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1241700.0, ans=0.125 2023-10-03 10:54:36,308 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 10:54:37,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:37,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 10:54:40,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:41,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:54:41,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 10:54:41,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 10:54:42,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:54:44,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 10:54:47,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:54:47,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:54:48,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:54:48,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:54:48,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:54:53,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:54:53,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 10:54:54,333 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.09 vs. limit=12.0 2023-10-03 10:54:55,182 INFO [train.py:1046] (3/4) Epoch 36, batch 350, loss[loss=0.1455, simple_loss=0.2306, pruned_loss=0.03024, over 24500.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2379, pruned_loss=0.04031, over 3894216.37 frames. ], batch size: 66, lr: 2.85e-03, grad_scale: 8.0 2023-10-03 10:54:56,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:54:58,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1241833.3333333333, ans=0.125 2023-10-03 10:55:00,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:55:06,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:06,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:08,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 10:55:10,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:55:10,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 10:55:11,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:13,004 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.890e+02 2.063e+02 2.336e+02 3.358e+02, threshold=4.126e+02, percent-clipped=0.0 2023-10-03 10:55:13,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 10:55:13,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:55:17,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 10:55:17,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:55:20,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 10:55:22,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:55:22,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:55:22,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:55:23,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:55:23,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:25,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 10:55:26,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:55:26,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:32,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:55:32,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 10:55:32,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1241966.6666666667, ans=0.0 2023-10-03 10:55:34,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:55:34,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1241966.6666666667, ans=0.1 2023-10-03 10:55:35,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:40,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 10:55:40,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:55:45,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:55:45,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:55:46,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:55:49,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 10:55:50,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:55:51,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.90 vs. limit=22.5 2023-10-03 10:55:51,706 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 10:55:51,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 10:55:51,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:55:56,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 10:55:56,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 10:55:57,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:55:59,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 10:56:00,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:56:01,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1242100.0, ans=0.125 2023-10-03 10:56:02,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:56:02,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:56:05,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:56:06,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:56:08,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 10:56:10,303 INFO [train.py:1046] (3/4) Epoch 36, batch 400, loss[loss=0.167, simple_loss=0.2394, pruned_loss=0.04729, over 23767.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.238, pruned_loss=0.03985, over 4072420.47 frames. ], batch size: 179, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:56:10,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 10:56:10,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:56:10,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:56:12,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:56:13,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:14,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:56:16,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:16,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1242166.6666666667, ans=0.2 2023-10-03 10:56:17,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 10:56:20,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 10:56:20,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:56:22,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 10:56:22,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:26,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:56:26,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:56:26,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 10:56:28,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:56:28,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:56:28,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:56:28,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:56:30,778 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 10:56:30,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 10:56:35,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:56:37,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:56:38,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 10:56:40,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 10:56:41,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 10:56:43,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:56:51,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 10:56:53,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1242366.6666666667, ans=0.0 2023-10-03 10:56:53,834 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.58 vs. limit=15.0 2023-10-03 10:56:54,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 10:56:57,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 10:56:59,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:56:59,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 10:57:00,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 10:57:03,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 10:57:05,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 10:57:06,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:57:09,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:57:11,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 10:57:12,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 10:57:14,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 10:57:17,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 10:57:17,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:57:19,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 10:57:21,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 10:57:21,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 10:57:21,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 10:57:22,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 10:57:23,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 10:57:24,722 INFO [train.py:1046] (3/4) Epoch 36, batch 450, loss[loss=0.1564, simple_loss=0.2449, pruned_loss=0.03392, over 24654.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2384, pruned_loss=0.03992, over 4200338.01 frames. ], batch size: 68, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:57:24,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:57:24,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:57:24,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 10:57:24,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 10:57:27,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 10:57:27,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1242500.0, ans=0.125 2023-10-03 10:57:29,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 10:57:31,485 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.37 vs. limit=22.5 2023-10-03 10:57:39,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:57:39,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:57:42,358 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.918e+02 2.094e+02 2.356e+02 3.401e+02, threshold=4.188e+02, percent-clipped=0.0 2023-10-03 10:57:42,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 10:57:42,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 10:57:47,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 10:57:48,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:57:51,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:57:56,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:57:57,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:57:58,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 10:57:59,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1242633.3333333333, ans=0.0 2023-10-03 10:58:00,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 10:58:00,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 10:58:00,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:58:01,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:58:03,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 10:58:05,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 10:58:05,053 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 10:58:06,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:58:06,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1242633.3333333333, ans=0.2 2023-10-03 10:58:08,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:58:08,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 10:58:10,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1242700.0, ans=0.125 2023-10-03 10:58:11,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 10:58:11,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 10:58:12,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 10:58:12,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 10:58:14,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:58:16,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1242700.0, ans=0.95 2023-10-03 10:58:17,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 10:58:17,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 10:58:19,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 10:58:21,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1242700.0, ans=0.2 2023-10-03 10:58:23,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 10:58:23,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 10:58:24,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1242766.6666666667, ans=0.0 2023-10-03 10:58:25,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 10:58:26,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 10:58:31,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 10:58:31,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:58:35,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 10:58:35,261 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 10:58:39,712 INFO [train.py:1046] (3/4) Epoch 36, batch 500, loss[loss=0.1541, simple_loss=0.2429, pruned_loss=0.03272, over 24358.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.239, pruned_loss=0.04038, over 4316863.63 frames. ], batch size: 77, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:58:39,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:58:39,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 10:58:39,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1242833.3333333333, ans=0.125 2023-10-03 10:58:41,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:58:42,476 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 10:58:43,497 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.20 vs. limit=15.0 2023-10-03 10:58:43,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 10:58:43,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:58:45,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 10:58:49,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 10:58:51,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 10:58:52,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:58:53,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 10:58:54,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:06,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:59:06,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 10:59:06,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 10:59:06,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:59:07,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 10:59:07,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 10:59:10,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 10:59:12,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 10:59:12,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 10:59:12,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 10:59:12,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 10:59:12,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1242966.6666666667, ans=0.125 2023-10-03 10:59:15,113 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 10:59:17,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:59:18,275 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.75 vs. limit=10.0 2023-10-03 10:59:19,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:21,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:21,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:21,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 10:59:21,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1242966.6666666667, ans=0.125 2023-10-03 10:59:25,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 10:59:27,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 10:59:29,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:59:32,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.30 vs. limit=22.5 2023-10-03 10:59:32,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:59:33,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_na.min_abs, batch_count=1243033.3333333333, ans=0.02 2023-10-03 10:59:35,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 10:59:40,822 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.79 vs. limit=10.0 2023-10-03 10:59:43,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:59:46,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 10:59:46,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:59:46,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 10:59:49,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 10:59:49,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 10:59:50,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 10:59:53,732 INFO [train.py:1046] (3/4) Epoch 36, batch 550, loss[loss=0.1775, simple_loss=0.2545, pruned_loss=0.05022, over 23859.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.24, pruned_loss=0.0403, over 4404504.68 frames. ], batch size: 195, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 10:59:56,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 10:59:58,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 10:59:58,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:59:58,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 10:59:59,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 10:59:59,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 10:59:59,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1243166.6666666667, ans=0.125 2023-10-03 11:00:00,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:00,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:00,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:00:02,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:00:05,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:00:06,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 11:00:06,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:00:11,070 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.849e+02 2.026e+02 2.312e+02 3.706e+02, threshold=4.052e+02, percent-clipped=0.0 2023-10-03 11:00:11,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:11,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:14,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:00:14,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:18,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 11:00:19,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 11:00:19,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:00:23,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1243300.0, ans=0.0 2023-10-03 11:00:24,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:00:24,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:00:26,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:00:28,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:28,762 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 11:00:30,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:00:31,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:00:34,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:00:34,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:00:34,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:00:34,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:36,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 11:00:38,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 11:00:40,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:00:40,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:00:42,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:00:42,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:00:45,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:00:45,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:00:48,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:00:48,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:49,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 11:00:51,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:00:54,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:00:54,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1243433.3333333333, ans=10.0 2023-10-03 11:00:55,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:00:56,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:00:58,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:00:58,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 11:01:05,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 11:01:06,993 INFO [train.py:1046] (3/4) Epoch 36, batch 600, loss[loss=0.1388, simple_loss=0.2225, pruned_loss=0.02755, over 24309.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2405, pruned_loss=0.04051, over 4477622.63 frames. ], batch size: 61, lr: 2.85e-03, grad_scale: 16.0 2023-10-03 11:01:08,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 11:01:09,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:01:09,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:01:09,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:01:11,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1243500.0, ans=0.0 2023-10-03 11:01:17,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:01:19,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:01:20,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 11:01:21,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:01:25,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:01:25,928 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.28 vs. limit=15.0 2023-10-03 11:01:26,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:01:28,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 11:01:28,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:01:32,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 11:01:32,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1243566.6666666667, ans=0.2 2023-10-03 11:01:35,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:01:35,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:01:35,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:01:43,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:01:43,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:01:43,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:01:49,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:01:51,433 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.55 vs. limit=22.5 2023-10-03 11:01:54,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:01:54,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:01:54,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:01:59,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1243700.0, ans=0.1 2023-10-03 11:02:01,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 11:02:06,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:02:07,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:02:08,996 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.67 vs. limit=15.0 2023-10-03 11:02:11,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 11:02:12,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:02:14,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 11:02:14,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:02:14,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1243766.6666666667, ans=0.1 2023-10-03 11:02:15,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:02:20,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1243833.3333333333, ans=0.0 2023-10-03 11:02:21,595 INFO [train.py:1046] (3/4) Epoch 36, batch 650, loss[loss=0.1638, simple_loss=0.2302, pruned_loss=0.0487, over 23829.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2393, pruned_loss=0.04004, over 4525946.99 frames. ], batch size: 195, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:02:23,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 11:02:24,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 11:02:26,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:02:26,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:02:29,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:02:30,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1243833.3333333333, ans=0.0 2023-10-03 11:02:31,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 11:02:31,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:02:38,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:02:38,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:02:39,289 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.874e+02 2.139e+02 2.425e+02 3.850e+02, threshold=4.279e+02, percent-clipped=0.0 2023-10-03 11:02:42,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:02:45,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 11:02:46,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:02:47,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:02:50,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:02:51,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 11:02:52,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:02:53,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1243966.6666666667, ans=0.0 2023-10-03 11:02:54,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:02:54,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:02:56,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:02:57,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:02:59,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:02:59,329 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 11:02:59,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:02:59,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:03:03,465 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.73 vs. limit=6.0 2023-10-03 11:03:03,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:03,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:03:05,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:03:05,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:03:06,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 11:03:06,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:03:08,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:03:09,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:03:09,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:03:10,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:03:10,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1244033.3333333333, ans=0.2 2023-10-03 11:03:12,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 11:03:12,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 11:03:14,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:14,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:03:14,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:03:14,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:03:14,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1244033.3333333333, ans=0.0 2023-10-03 11:03:15,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:03:20,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1244100.0, ans=0.125 2023-10-03 11:03:21,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:22,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:03:24,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:03:26,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:03:26,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:03:26,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:03:30,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1244100.0, ans=0.125 2023-10-03 11:03:33,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:03:33,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:03:34,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:03:34,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:03:34,796 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.50 vs. limit=15.0 2023-10-03 11:03:36,721 INFO [train.py:1046] (3/4) Epoch 36, batch 700, loss[loss=0.1432, simple_loss=0.225, pruned_loss=0.0307, over 24517.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2377, pruned_loss=0.03962, over 4572412.90 frames. ], batch size: 63, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:03:38,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 11:03:39,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 11:03:41,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 11:03:42,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:45,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:03:47,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 11:03:51,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:03:54,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:03:55,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:03:59,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:03:59,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:04:01,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:04:03,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1244233.3333333333, ans=0.0 2023-10-03 11:04:04,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 11:04:04,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:04:07,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 11:04:10,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 11:04:12,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:04:14,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:04:15,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:04:19,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:04:21,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 11:04:23,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.91 vs. limit=15.0 2023-10-03 11:04:24,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:04:25,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:04:25,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 11:04:28,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:04:30,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:04:31,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:04:36,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:04:36,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 11:04:40,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 11:04:41,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1244433.3333333333, ans=0.125 2023-10-03 11:04:42,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 11:04:45,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:04:46,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:04:46,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:04:48,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1244433.3333333333, ans=0.125 2023-10-03 11:04:50,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:04:50,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 11:04:51,443 INFO [train.py:1046] (3/4) Epoch 36, batch 750, loss[loss=0.1771, simple_loss=0.2586, pruned_loss=0.04777, over 23297.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2377, pruned_loss=0.0396, over 4603133.77 frames. ], batch size: 93, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:04:52,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 11:04:53,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 11:04:53,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 11:04:54,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 11:04:55,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 11:04:55,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:04:57,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 11:04:57,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:04:59,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:05:00,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:05:03,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:05:03,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:05:03,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:05:04,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:05:06,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:05:09,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:05:10,660 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.795e+02 1.933e+02 2.137e+02 2.929e+02, threshold=3.865e+02, percent-clipped=0.0 2023-10-03 11:05:10,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:05:11,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1244566.6666666667, ans=0.07 2023-10-03 11:05:12,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:05:12,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 11:05:12,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1244566.6666666667, ans=0.125 2023-10-03 11:05:13,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:05:14,180 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.06 vs. limit=15.0 2023-10-03 11:05:15,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:05:17,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:05:18,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:05:19,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 11:05:19,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:05:19,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 11:05:20,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1244633.3333333333, ans=0.125 2023-10-03 11:05:20,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1244633.3333333333, ans=0.04949747468305833 2023-10-03 11:05:21,276 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 11:05:21,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 11:05:21,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:05:23,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:05:23,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1244633.3333333333, ans=0.125 2023-10-03 11:05:26,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:05:26,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1244633.3333333333, ans=0.1 2023-10-03 11:05:30,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1244633.3333333333, ans=0.125 2023-10-03 11:05:32,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:05:32,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:05:32,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:05:34,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:05:36,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:05:37,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 11:05:37,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:05:39,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 11:05:39,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:05:39,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1244700.0, ans=0.0 2023-10-03 11:05:40,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:05:42,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 11:05:42,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:05:46,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:05:48,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:05:49,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:05:50,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1244766.6666666667, ans=0.07 2023-10-03 11:05:51,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:05:54,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 11:05:55,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:05:56,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:06:00,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:06:00,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:06:04,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:06:04,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:06:05,754 INFO [train.py:1046] (3/4) Epoch 36, batch 800, loss[loss=0.1556, simple_loss=0.2356, pruned_loss=0.03783, over 24322.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2387, pruned_loss=0.04009, over 4634536.02 frames. ], batch size: 61, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:06:07,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1244833.3333333333, ans=0.09899494936611666 2023-10-03 11:06:12,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1244833.3333333333, ans=0.025 2023-10-03 11:06:13,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:06:13,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:14,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1244833.3333333333, ans=0.1 2023-10-03 11:06:15,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:06:16,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:06:16,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:17,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:18,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:22,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:06:23,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:06:25,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 11:06:26,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:28,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:06:28,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:06:28,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:06:28,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 11:06:28,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:06:30,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 11:06:32,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:34,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:06:36,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:06:36,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:06:38,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:38,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:06:42,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:06:44,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:06:44,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 11:06:46,891 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 11:06:46,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 11:06:46,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:06:46,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:06:48,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:06:48,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:06:54,431 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 11:06:54,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 11:06:55,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:06:57,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:07:00,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:07:01,692 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.53 vs. limit=5.0 2023-10-03 11:07:03,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:07:04,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 11:07:04,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:07:07,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 11:07:12,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:07:16,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:07:16,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 11:07:16,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:07:17,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:07:18,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1245166.6666666667, ans=0.0 2023-10-03 11:07:19,159 INFO [train.py:1046] (3/4) Epoch 36, batch 850, loss[loss=0.1487, simple_loss=0.2246, pruned_loss=0.03641, over 23389.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2392, pruned_loss=0.03993, over 4653832.56 frames. ], batch size: 134, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:07:19,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 11:07:19,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:07:20,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:07:20,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:23,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:07:23,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:07:25,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 11:07:25,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 11:07:25,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 11:07:28,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:07:28,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:07:29,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:29,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:07:29,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:07:29,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1245166.6666666667, ans=0.125 2023-10-03 11:07:33,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:07:35,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:07:35,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 11:07:35,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1245233.3333333333, ans=0.0 2023-10-03 11:07:39,064 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.931e+02 2.185e+02 2.563e+02 4.001e+02, threshold=4.370e+02, percent-clipped=1.0 2023-10-03 11:07:40,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 11:07:43,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:07:44,006 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:07:44,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1245233.3333333333, ans=0.125 2023-10-03 11:07:45,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 11:07:49,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 11:07:51,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 11:07:53,982 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 11:07:53,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:07:54,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:07:54,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:07:56,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:58,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:07:58,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 11:07:59,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:08:01,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:08:01,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:08:03,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:08:05,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:08:08,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:08:09,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 11:08:10,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:08:12,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:08:13,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:08:13,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:08:13,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:08:15,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:08:18,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:08:18,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:08:19,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:08:21,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:08:27,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:08:28,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:08:29,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 11:08:29,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:08:31,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:08:32,860 INFO [train.py:1046] (3/4) Epoch 36, batch 900, loss[loss=0.1448, simple_loss=0.2296, pruned_loss=0.03004, over 24472.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2397, pruned_loss=0.03989, over 4686433.47 frames. ], batch size: 63, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:08:32,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 11:08:39,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:08:41,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:08:41,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 11:08:44,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:08:44,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 11:08:47,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 11:08:47,866 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.94 vs. limit=15.0 2023-10-03 11:08:49,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:08:49,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:08:49,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:08:49,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:08:59,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:08:59,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:08:59,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:09:02,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:09:08,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 11:09:10,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:09:12,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1245633.3333333333, ans=0.2 2023-10-03 11:09:13,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1245633.3333333333, ans=0.125 2023-10-03 11:09:14,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:09:15,173 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.74 vs. limit=15.0 2023-10-03 11:09:15,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:09:16,000 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 11:09:17,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 11:09:18,274 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.92 vs. limit=15.0 2023-10-03 11:09:23,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:09:23,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:09:23,721 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:09:24,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:09:31,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:09:31,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:09:33,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 11:09:33,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:09:35,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 11:09:36,430 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.64 vs. limit=6.0 2023-10-03 11:09:37,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:09:37,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:09:40,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:09:40,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:09:44,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 11:09:44,291 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 11:09:45,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 11:09:45,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 11:09:46,951 INFO [train.py:1046] (3/4) Epoch 36, batch 950, loss[loss=0.2207, simple_loss=0.2878, pruned_loss=0.07681, over 19400.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2405, pruned_loss=0.04046, over 4684697.98 frames. ], batch size: 388, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:09:49,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:09:49,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1245833.3333333333, ans=0.125 2023-10-03 11:09:51,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 11:09:54,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:09:55,193 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.28 vs. limit=22.5 2023-10-03 11:09:57,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:09:57,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:09:57,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 11:10:01,463 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 11:10:04,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:10:05,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:10:07,345 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.892e+02 2.032e+02 2.261e+02 4.305e+02, threshold=4.064e+02, percent-clipped=0.0 2023-10-03 11:10:07,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:10:07,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:10:07,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 11:10:08,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 11:10:10,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:10,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 11:10:11,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.14 vs. limit=15.0 2023-10-03 11:10:11,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:10:14,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:15,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:10:15,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:10:16,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 11:10:18,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 11:10:22,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:10:24,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:10:27,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:10:28,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:10:31,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 11:10:32,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 11:10:32,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:10:33,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:10:33,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:33,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:10:37,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 11:10:38,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:10:41,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:10:41,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:10:42,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 11:10:42,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:10:42,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:10:42,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 11:10:46,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:10:49,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:10:51,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1246100.0, ans=0.5 2023-10-03 11:10:55,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:10:55,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1246100.0, ans=0.2 2023-10-03 11:10:56,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 11:10:56,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 11:10:59,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:11:00,877 INFO [train.py:1046] (3/4) Epoch 36, batch 1000, loss[loss=0.1577, simple_loss=0.2185, pruned_loss=0.04845, over 22698.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.24, pruned_loss=0.04067, over 4683961.75 frames. ], batch size: 322, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:11:03,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 11:11:05,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:09,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:11:09,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 11:11:10,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 11:11:14,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:11:15,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:11:15,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:11:19,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 11:11:20,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 11:11:22,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 11:11:23,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:11:25,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 11:11:26,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 11:11:26,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 11:11:27,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:11:27,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:29,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1246300.0, ans=0.125 2023-10-03 11:11:35,517 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:11:36,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:11:36,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:11:37,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:37,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:11:37,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 11:11:39,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:11:39,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:11:40,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:11:40,765 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 11:11:44,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 11:11:45,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 11:11:47,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 11:11:48,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:11:56,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:56,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:11:56,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:11:58,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:12:00,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 11:12:02,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:12:02,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 11:12:03,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 11:12:05,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:12:05,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:12:06,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:12:06,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1246433.3333333333, ans=0.125 2023-10-03 11:12:09,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:12:10,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:12:10,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1246433.3333333333, ans=0.1 2023-10-03 11:12:13,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:12:14,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1246500.0, ans=0.05 2023-10-03 11:12:15,121 INFO [train.py:1046] (3/4) Epoch 36, batch 1050, loss[loss=0.1656, simple_loss=0.2337, pruned_loss=0.04869, over 22915.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2393, pruned_loss=0.04026, over 4693631.50 frames. ], batch size: 322, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:12:15,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:12:17,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 11:12:18,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:12:19,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:12:24,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:12:25,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:12:28,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:12:28,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:12:28,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:12:30,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:12:31,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 11:12:32,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:12:32,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 11:12:36,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:12:36,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 11:12:36,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:12:36,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1246566.6666666667, ans=0.0 2023-10-03 11:12:37,303 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.909e+02 2.045e+02 2.275e+02 3.142e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 11:12:42,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:12:42,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:12:42,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:12:45,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 11:12:45,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 11:12:47,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:12:49,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 11:12:52,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 11:12:54,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:12:58,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 11:12:59,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:12:59,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:13:01,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:13:04,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:13:04,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1246700.0, ans=0.1 2023-10-03 11:13:07,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 11:13:09,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 11:13:10,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 11:13:11,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:13:11,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:13:13,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 11:13:16,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:13:17,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:13:17,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:13:19,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:13:19,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:13:22,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:13:22,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 11:13:25,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:13:25,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 11:13:25,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 11:13:25,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:13:29,568 INFO [train.py:1046] (3/4) Epoch 36, batch 1100, loss[loss=0.1471, simple_loss=0.2276, pruned_loss=0.03335, over 24599.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2387, pruned_loss=0.03985, over 4688630.67 frames. ], batch size: 60, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:13:29,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:13:33,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:13:38,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:13:41,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:13:41,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:13:41,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 11:13:42,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:13:42,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1246900.0, ans=0.125 2023-10-03 11:13:43,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:13:47,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:13:48,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:13:48,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 11:13:50,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 11:13:51,287 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.19 vs. limit=15.0 2023-10-03 11:13:52,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:13:52,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:13:55,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:13:56,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:14:02,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:14:02,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1246966.6666666667, ans=0.1 2023-10-03 11:14:03,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 11:14:05,386 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 11:14:06,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:09,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:11,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:14:11,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:14:13,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 11:14:15,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:14:15,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:14:15,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:14:15,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:15,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 11:14:18,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1247033.3333333333, ans=0.125 2023-10-03 11:14:23,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:14:23,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 11:14:24,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:14:26,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1247033.3333333333, ans=0.125 2023-10-03 11:14:30,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:14:32,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 11:14:32,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 11:14:33,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:14:36,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:14:36,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:14:38,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1247100.0, ans=0.0 2023-10-03 11:14:39,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 11:14:39,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:14:39,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:14:40,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 11:14:40,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:14:42,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 11:14:43,468 INFO [train.py:1046] (3/4) Epoch 36, batch 1150, loss[loss=0.1352, simple_loss=0.2113, pruned_loss=0.02954, over 21158.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2393, pruned_loss=0.04016, over 4694726.93 frames. ], batch size: 46, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:14:43,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:14:43,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:14:44,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:14:50,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:14:51,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:14:52,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:14:54,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:14:54,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 11:14:54,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:14:57,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 11:14:59,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:14:59,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:15:04,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 11:15:05,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1247233.3333333333, ans=0.1 2023-10-03 11:15:06,092 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.838e+02 2.022e+02 2.263e+02 3.460e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 11:15:07,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:15:10,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:15:10,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:15:12,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 11:15:12,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:15:12,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:15:15,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1247300.0, ans=0.0 2023-10-03 11:15:16,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1247300.0, ans=0.0 2023-10-03 11:15:17,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 11:15:19,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:15:20,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:15:27,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1247366.6666666667, ans=0.0 2023-10-03 11:15:28,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:15:34,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:15:34,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 11:15:35,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:15:36,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:15:36,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1247366.6666666667, ans=0.125 2023-10-03 11:15:40,453 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 11:15:43,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:15:43,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1247433.3333333333, ans=0.2 2023-10-03 11:15:50,644 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 11:15:53,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:15:55,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:15:56,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:15:56,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:15:57,880 INFO [train.py:1046] (3/4) Epoch 36, batch 1200, loss[loss=0.1547, simple_loss=0.2334, pruned_loss=0.03802, over 24453.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2398, pruned_loss=0.04054, over 4697853.89 frames. ], batch size: 58, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:15:59,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:16:04,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:16:04,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:16:06,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:16:06,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:16:06,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:16:08,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:16:11,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:16:11,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:16:11,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:16:11,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1247566.6666666667, ans=0.125 2023-10-03 11:16:13,860 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 11:16:14,620 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.27 vs. limit=22.5 2023-10-03 11:16:16,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 11:16:19,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:16:24,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:16:26,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:16:28,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:16:28,718 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 11:16:30,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:16:30,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1247633.3333333333, ans=0.125 2023-10-03 11:16:36,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:16:36,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:16:37,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 11:16:38,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:16:41,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 11:16:43,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 11:16:43,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:16:45,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:16:46,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:16:47,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:16:48,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:16:48,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:16:50,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:16:50,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 11:16:50,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:16:51,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:16:51,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:16:53,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:16:53,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:16:58,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 11:16:59,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:17:04,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 11:17:08,148 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 11:17:09,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:17:10,169 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.89 vs. limit=22.5 2023-10-03 11:17:10,804 INFO [train.py:1046] (3/4) Epoch 36, batch 1250, loss[loss=0.1974, simple_loss=0.2648, pruned_loss=0.06505, over 19138.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2407, pruned_loss=0.04076, over 4706951.21 frames. ], batch size: 388, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:17:12,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:17:13,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:17:14,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:17:16,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 11:17:20,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:17:21,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:17:23,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 11:17:25,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:17:26,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:17:30,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:17:31,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:17:31,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:17:32,978 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.893e+02 2.096e+02 2.315e+02 3.131e+02, threshold=4.193e+02, percent-clipped=0.0 2023-10-03 11:17:33,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:17:36,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:17:39,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 11:17:39,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:17:39,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:17:41,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:17:41,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:17:44,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:17:44,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:17:49,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 11:17:49,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:17:50,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:17:52,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 11:17:54,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:17:54,602 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 11:17:54,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:17:54,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:17:58,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:17:59,308 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.84 vs. limit=12.0 2023-10-03 11:18:01,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:18:01,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:18:05,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 11:18:05,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 11:18:05,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 11:18:05,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1248033.3333333333, ans=0.125 2023-10-03 11:18:08,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:18:09,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 11:18:09,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:18:12,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 11:18:12,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:18:13,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 11:18:13,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 11:18:13,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:18:14,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:18:14,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:18:15,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1248100.0, ans=0.125 2023-10-03 11:18:16,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 11:18:19,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:18:20,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:18:22,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:18:24,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:18:25,574 INFO [train.py:1046] (3/4) Epoch 36, batch 1300, loss[loss=0.1516, simple_loss=0.242, pruned_loss=0.0306, over 24657.00 frames. ], tot_loss[loss=0.1617, simple_loss=0.2413, pruned_loss=0.04108, over 4706435.13 frames. ], batch size: 68, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:18:27,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:18:27,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 11:18:27,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1248166.6666666667, ans=0.125 2023-10-03 11:18:31,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:18:32,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1248166.6666666667, ans=0.125 2023-10-03 11:18:33,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:18:34,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:18:34,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:18:37,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:18:37,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 11:18:42,414 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.22 vs. limit=22.5 2023-10-03 11:18:44,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:18:44,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:18:47,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 11:18:50,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:18:52,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:18:54,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:18:55,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:18:56,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:18:58,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:18:58,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 11:18:58,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 11:19:04,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:19:04,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:19:06,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 11:19:06,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 11:19:07,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:19:10,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:19:11,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 11:19:11,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:19:11,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 11:19:12,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:19:15,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:19:15,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:19:21,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 11:19:22,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 11:19:23,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 11:19:29,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:19:31,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 11:19:33,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:19:34,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1248433.3333333333, ans=0.125 2023-10-03 11:19:38,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1248500.0, ans=0.125 2023-10-03 11:19:40,020 INFO [train.py:1046] (3/4) Epoch 36, batch 1350, loss[loss=0.1531, simple_loss=0.2212, pruned_loss=0.04246, over 23896.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2406, pruned_loss=0.04066, over 4716016.80 frames. ], batch size: 195, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:19:41,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 11:19:43,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1248500.0, ans=0.0 2023-10-03 11:19:44,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:19:45,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:19:48,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:19:48,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:19:50,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:19:52,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:19:55,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:19:57,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 11:19:57,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1248566.6666666667, ans=0.125 2023-10-03 11:19:59,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:20:00,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:20:01,941 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.023e+02 2.236e+02 2.575e+02 4.914e+02, threshold=4.472e+02, percent-clipped=3.0 2023-10-03 11:20:02,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 11:20:03,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:20:03,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:20:03,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 11:20:06,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 11:20:09,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 11:20:09,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1248633.3333333333, ans=0.2 2023-10-03 11:20:11,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:20:11,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 11:20:19,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1248633.3333333333, ans=0.04949747468305833 2023-10-03 11:20:19,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1248633.3333333333, ans=0.125 2023-10-03 11:20:21,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:20:29,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:20:29,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:20:31,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 11:20:34,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:20:34,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 11:20:34,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:20:35,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:20:37,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:20:37,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1248766.6666666667, ans=0.1 2023-10-03 11:20:39,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 11:20:40,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:20:45,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 11:20:48,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 11:20:52,870 INFO [train.py:1046] (3/4) Epoch 36, batch 1400, loss[loss=0.1565, simple_loss=0.2423, pruned_loss=0.03534, over 24675.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.239, pruned_loss=0.04001, over 4713493.91 frames. ], batch size: 65, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:20:54,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 11:20:54,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:20:58,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:21:00,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:21:02,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1248833.3333333333, ans=0.05 2023-10-03 11:21:04,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 11:21:05,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 11:21:15,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:21:17,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:21:18,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:21:19,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:21:23,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:21:23,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 11:21:33,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:21:35,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:21:35,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1248966.6666666667, ans=0.125 2023-10-03 11:21:40,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 11:21:40,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:21:41,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:21:41,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:21:41,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:21:44,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:21:44,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:21:44,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:21:45,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 11:21:45,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1249033.3333333333, ans=0.125 2023-10-03 11:21:47,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:21:50,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:21:51,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1249100.0, ans=0.125 2023-10-03 11:21:52,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:21:58,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 11:22:00,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 11:22:01,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:22:04,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 11:22:04,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:22:06,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:22:08,014 INFO [train.py:1046] (3/4) Epoch 36, batch 1450, loss[loss=0.1642, simple_loss=0.24, pruned_loss=0.04417, over 23683.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2384, pruned_loss=0.03934, over 4711536.22 frames. ], batch size: 179, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:22:08,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:22:10,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:22:10,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:10,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 11:22:15,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:22:15,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:22:16,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:22:16,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 11:22:17,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:22:19,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 11:22:20,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:23,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:23,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 11:22:23,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:22:25,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:22:25,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 11:22:25,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:25,590 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:22:26,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:22:28,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:29,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.18 vs. limit=15.0 2023-10-03 11:22:29,487 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.817e+02 1.972e+02 2.237e+02 2.946e+02, threshold=3.944e+02, percent-clipped=0.0 2023-10-03 11:22:30,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:34,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:22:34,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:22:36,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1249300.0, ans=0.0 2023-10-03 11:22:37,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:22:37,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:37,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1249300.0, ans=0.125 2023-10-03 11:22:39,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:22:40,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:22:40,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:22:40,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:22:44,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 11:22:46,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1249300.0, ans=0.125 2023-10-03 11:22:47,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:22:51,561 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 11:22:51,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:22:53,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:22:55,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:22:55,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 11:23:00,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:23:00,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 11:23:01,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 11:23:05,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:23:08,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:23:08,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:23:10,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 11:23:11,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 11:23:13,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 11:23:14,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:23:15,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:23:18,270 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.31 vs. limit=15.0 2023-10-03 11:23:21,485 INFO [train.py:1046] (3/4) Epoch 36, batch 1500, loss[loss=0.1635, simple_loss=0.2502, pruned_loss=0.03842, over 23968.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2384, pruned_loss=0.03973, over 4698626.89 frames. ], batch size: 80, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:23:23,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1249500.0, ans=0.125 2023-10-03 11:23:25,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 11:23:25,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:23:25,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:23:27,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:23:27,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:23:29,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:23:30,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 11:23:31,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:23:31,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:23:33,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:23:33,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:23:35,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:23:35,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:23:37,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1249566.6666666667, ans=0.1 2023-10-03 11:23:40,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:23:40,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 11:23:41,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:23:41,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:23:43,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:23:45,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 11:23:50,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1249633.3333333333, ans=0.125 2023-10-03 11:23:51,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 11:23:52,016 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.48 vs. limit=15.0 2023-10-03 11:23:52,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:23:54,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 11:23:56,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 11:23:59,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:24:00,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:24:00,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:24:01,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1249633.3333333333, ans=0.125 2023-10-03 11:24:02,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 11:24:02,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:24:02,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:24:02,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 11:24:04,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:24:09,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:24:09,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 11:24:09,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1249700.0, ans=0.125 2023-10-03 11:24:12,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1249700.0, ans=0.1 2023-10-03 11:24:15,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:24:16,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:24:19,385 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 11:24:19,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:19,436 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 11:24:19,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:24:19,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1249766.6666666667, ans=0.0 2023-10-03 11:24:20,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:24:22,243 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 11:24:23,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:24:26,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 11:24:27,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:30,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:24:30,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:31,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:24:31,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:24:33,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:24:35,217 INFO [train.py:1046] (3/4) Epoch 36, batch 1550, loss[loss=0.1505, simple_loss=0.2339, pruned_loss=0.03355, over 23284.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2391, pruned_loss=0.03968, over 4704423.12 frames. ], batch size: 93, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:24:35,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 11:24:35,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 11:24:37,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:24:37,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 11:24:38,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 11:24:40,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:24:41,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:24:42,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:24:43,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:24:44,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:24:44,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:24:47,217 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 11:24:47,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:24:47,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:24:48,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:24:48,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1249900.0, ans=0.1 2023-10-03 11:24:50,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:24:50,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 11:24:51,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:24:51,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 11:24:52,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 11:24:52,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 11:24:54,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:24:55,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:24:56,773 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.814e+02 1.950e+02 2.198e+02 3.185e+02, threshold=3.899e+02, percent-clipped=0.0 2023-10-03 11:24:59,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:25:03,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 11:25:03,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 11:25:11,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:25:14,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:25:15,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:25:15,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:25:17,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 11:25:18,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1250033.3333333333, ans=0.0 2023-10-03 11:25:21,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:25:23,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:25:23,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1250033.3333333333, ans=0.2 2023-10-03 11:25:25,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:25:27,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:25:27,992 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.10 vs. limit=15.0 2023-10-03 11:25:28,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:25:28,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 11:25:28,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:25:30,785 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.10 vs. limit=15.0 2023-10-03 11:25:31,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:25:31,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:25:32,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 11:25:32,697 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 11:25:35,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:25:39,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 11:25:45,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:25:46,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:25:46,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 11:25:49,039 INFO [train.py:1046] (3/4) Epoch 36, batch 1600, loss[loss=0.1581, simple_loss=0.2284, pruned_loss=0.04389, over 23549.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.24, pruned_loss=0.04056, over 4691719.14 frames. ], batch size: 149, lr: 2.84e-03, grad_scale: 32.0 2023-10-03 11:25:49,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:25:50,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:25:50,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:25:50,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:25:50,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:25:53,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:25:53,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 11:25:54,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 11:25:56,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 11:25:57,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.49 vs. limit=6.0 2023-10-03 11:25:57,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:25:59,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1250166.6666666667, ans=0.2 2023-10-03 11:26:00,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 11:26:00,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:26:02,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1250233.3333333333, ans=0.0 2023-10-03 11:26:03,006 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.62 vs. limit=6.0 2023-10-03 11:26:03,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:26:07,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.36 vs. limit=15.0 2023-10-03 11:26:08,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:26:09,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1250233.3333333333, ans=0.0 2023-10-03 11:26:12,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 11:26:15,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:26:15,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 11:26:15,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:17,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 11:26:22,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 11:26:25,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1250300.0, ans=0.0 2023-10-03 11:26:29,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:26:29,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 11:26:31,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:26:31,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:26:31,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:26:33,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 11:26:38,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 11:26:38,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1250366.6666666667, ans=0.125 2023-10-03 11:26:40,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:26:41,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:42,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:43,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:26:44,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:26:45,726 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.00 vs. limit=22.5 2023-10-03 11:26:46,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:26:48,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:26:50,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1250433.3333333333, ans=0.125 2023-10-03 11:26:54,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:26:54,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:26:56,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 11:26:56,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:26:58,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 11:27:02,345 INFO [train.py:1046] (3/4) Epoch 36, batch 1650, loss[loss=0.1582, simple_loss=0.2438, pruned_loss=0.03626, over 24463.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2405, pruned_loss=0.04038, over 4704290.20 frames. ], batch size: 66, lr: 2.84e-03, grad_scale: 32.0 2023-10-03 11:27:02,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:27:03,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:27:05,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:27:05,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 11:27:05,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 11:27:05,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 11:27:07,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 11:27:11,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:27:11,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:27:12,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:27:12,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:27:14,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:27:14,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1250500.0, ans=0.0 2023-10-03 11:27:16,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 11:27:17,279 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.82 vs. limit=15.0 2023-10-03 11:27:17,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:27:17,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:27:17,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:27:17,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:27:19,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 11:27:20,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 11:27:24,862 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.852e+02 2.089e+02 2.353e+02 3.371e+02, threshold=4.177e+02, percent-clipped=0.0 2023-10-03 11:27:24,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:27:26,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:27:33,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 11:27:35,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:27:36,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 11:27:39,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:27:43,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:27:43,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:27:43,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:27:47,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:27:47,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:27:49,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:27:49,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:27:49,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:27:51,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:27:52,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:27:52,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:27:58,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:27:58,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 11:27:58,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:27:59,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 11:27:59,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 11:27:59,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 11:28:01,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:28:01,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:28:01,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:28:01,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:28:01,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 11:28:05,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:28:07,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:28:08,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:28:10,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 11:28:16,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:28:16,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:28:16,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 11:28:16,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:28:17,848 INFO [train.py:1046] (3/4) Epoch 36, batch 1700, loss[loss=0.139, simple_loss=0.2199, pruned_loss=0.02904, over 24447.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2395, pruned_loss=0.04032, over 4694967.06 frames. ], batch size: 58, lr: 2.84e-03, grad_scale: 32.0 2023-10-03 11:28:17,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:28:17,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:28:19,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:28:19,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1250833.3333333333, ans=0.125 2023-10-03 11:28:20,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:28:20,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 11:28:23,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:28:24,168 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.03 vs. limit=15.0 2023-10-03 11:28:31,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:28:34,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:28:39,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1250900.0, ans=0.2 2023-10-03 11:28:41,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:28:41,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:28:42,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:28:42,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:28:44,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 11:28:47,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:28:47,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:28:49,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:28:51,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:28:53,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1250966.6666666667, ans=0.035 2023-10-03 11:28:54,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 11:28:54,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 11:28:55,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:28:57,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 11:28:58,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:29:03,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=1251033.3333333333, ans=10.0 2023-10-03 11:29:05,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:29:05,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:07,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:29:09,911 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.22 vs. limit=22.5 2023-10-03 11:29:10,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 11:29:10,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 11:29:10,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:29:11,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:29:11,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 11:29:13,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:29:13,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:29:13,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:29:13,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:29:15,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:29:15,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:29:15,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:16,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:29:16,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:29:20,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1251100.0, ans=0.0 2023-10-03 11:29:23,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:29:24,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 11:29:26,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:29:27,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:29:28,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 11:29:31,606 INFO [train.py:1046] (3/4) Epoch 36, batch 1750, loss[loss=0.1652, simple_loss=0.241, pruned_loss=0.04472, over 23833.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2387, pruned_loss=0.03991, over 4703912.84 frames. ], batch size: 195, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:29:34,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:37,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:29:37,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:29:37,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1251166.6666666667, ans=0.125 2023-10-03 11:29:39,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 11:29:39,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:29:41,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:29:41,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:29:45,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1251233.3333333333, ans=0.0 2023-10-03 11:29:46,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 11:29:48,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:29:50,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 11:29:50,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:29:52,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:29:55,391 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.875e+02 2.027e+02 2.400e+02 3.332e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-03 11:29:55,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 11:29:55,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1251233.3333333333, ans=0.125 2023-10-03 11:29:56,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 11:29:58,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:29:58,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 11:30:06,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:30:08,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:30:08,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:30:12,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:12,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:30:14,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:30:15,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:17,074 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.97 vs. limit=12.0 2023-10-03 11:30:18,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:30:20,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:30:20,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 11:30:24,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:30:26,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 11:30:26,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:30:28,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:30:29,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:30:33,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:30:34,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 11:30:34,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:36,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:30:39,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1251433.3333333333, ans=0.125 2023-10-03 11:30:41,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:30:43,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:30:45,157 INFO [train.py:1046] (3/4) Epoch 36, batch 1800, loss[loss=0.1672, simple_loss=0.2395, pruned_loss=0.04744, over 23835.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2384, pruned_loss=0.03962, over 4711438.65 frames. ], batch size: 164, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:30:45,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:30:45,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 11:30:46,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:30:46,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:30:46,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:30:46,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:30:46,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:30:46,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1251500.0, ans=0.125 2023-10-03 11:30:48,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:30:48,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1251500.0, ans=0.2 2023-10-03 11:30:52,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:30:52,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:30:54,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 11:30:55,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:30:56,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1251500.0, ans=0.0 2023-10-03 11:31:00,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:31:01,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:31:02,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1251566.6666666667, ans=0.0 2023-10-03 11:31:03,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:31:05,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:31:06,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:31:07,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:31:10,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:31:10,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 11:31:12,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:31:14,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:31:14,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1251633.3333333333, ans=0.1 2023-10-03 11:31:17,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1251633.3333333333, ans=0.125 2023-10-03 11:31:18,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 11:31:20,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 11:31:20,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 11:31:22,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:31:23,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:31:23,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:31:25,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:31:31,588 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 11:31:31,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:31:33,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:31:34,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 11:31:34,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 11:31:35,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:31:37,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:31:37,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:31:41,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 11:31:47,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:31:47,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 11:31:47,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1251766.6666666667, ans=0.2 2023-10-03 11:31:48,083 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.86 vs. limit=15.0 2023-10-03 11:31:48,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:31:48,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:31:50,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:31:50,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 11:31:53,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:31:53,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:31:57,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 11:31:57,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:31:59,434 INFO [train.py:1046] (3/4) Epoch 36, batch 1850, loss[loss=0.1727, simple_loss=0.2454, pruned_loss=0.04994, over 22847.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2384, pruned_loss=0.03953, over 4719708.24 frames. ], batch size: 322, lr: 2.84e-03, grad_scale: 16.0 2023-10-03 11:31:59,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:31:59,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:31:59,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:32:01,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:32:01,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:32:03,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:32:03,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:32:05,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:32:06,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:32:12,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:32:12,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 11:32:15,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 11:32:18,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 11:32:18,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1251900.0, ans=0.2 2023-10-03 11:32:21,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:32:22,888 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.882e+02 2.066e+02 2.277e+02 3.527e+02, threshold=4.131e+02, percent-clipped=0.0 2023-10-03 11:32:22,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 11:32:22,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 11:32:23,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1251900.0, ans=0.125 2023-10-03 11:32:32,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1251966.6666666667, ans=0.0 2023-10-03 11:32:33,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:32:34,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 11:32:37,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:32:37,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:32:43,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 11:32:43,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:32:43,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:32:44,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:32:46,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:32:46,547 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:32:49,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:32:51,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1252033.3333333333, ans=0.125 2023-10-03 11:32:53,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:32:53,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:32:54,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 11:32:54,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:32:56,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:32:58,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:32:59,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 11:33:01,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:33:06,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:33:06,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:33:06,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 11:33:06,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 11:33:09,497 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 11:33:09,573 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 11:33:12,678 INFO [train.py:1046] (3/4) Epoch 36, batch 1900, loss[loss=0.1544, simple_loss=0.2373, pruned_loss=0.0357, over 17814.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2387, pruned_loss=0.03923, over 4728596.51 frames. ], batch size: 38, lr: 2.84e-03, grad_scale: 8.0 2023-10-03 11:33:12,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:33:12,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:33:12,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:33:12,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:12,840 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 11:33:12,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:33:12,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:14,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:33:15,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:33:16,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:33:17,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 11:33:20,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:20,914 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 11:33:20,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:33:21,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:33:22,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1252166.6666666667, ans=0.2 2023-10-03 11:33:26,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:33:26,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1252233.3333333333, ans=0.125 2023-10-03 11:33:28,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:33:28,493 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 11:33:30,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 11:33:31,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:33:33,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:33:33,419 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 11:33:33,458 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 11:33:36,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 11:33:37,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:33:40,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 11:33:41,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 11:33:52,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1252300.0, ans=0.125 2023-10-03 11:33:53,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 11:33:57,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 11:33:57,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:33:57,401 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 11:33:58,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 11:33:58,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 11:33:58,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 11:33:58,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:03,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 11:34:05,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:34:06,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:34:06,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 11:34:09,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:34:11,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 11:34:12,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:34:17,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:34:17,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:34:17,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:34:19,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:34:21,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:34:21,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:34:22,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:34:24,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:34:24,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:34:25,336 INFO [train.py:1046] (3/4) Epoch 36, batch 1950, loss[loss=0.1407, simple_loss=0.2228, pruned_loss=0.02929, over 24335.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2395, pruned_loss=0.04013, over 4708032.91 frames. ], batch size: 61, lr: 2.84e-03, grad_scale: 4.0 2023-10-03 11:34:27,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:34:27,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:34:28,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:34:28,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:34:30,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1252500.0, ans=0.0 2023-10-03 11:34:31,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:34:34,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:34:35,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:35,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:34:37,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 11:34:37,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:34:38,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:38,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:42,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:34:42,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:34:43,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:44,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:34:47,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:34:47,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:34:47,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:34:47,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:50,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:34:51,698 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.918e+02 2.114e+02 2.421e+02 3.439e+02, threshold=4.228e+02, percent-clipped=0.0 2023-10-03 11:34:53,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:34:53,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:34:53,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 11:34:53,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 11:34:54,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:34:55,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:34:55,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:34:58,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:35:01,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:35:04,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:35:06,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1252633.3333333333, ans=0.125 2023-10-03 11:35:07,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:35:07,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:35:07,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 11:35:07,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:35:11,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:35:13,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:35:13,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:35:20,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:22,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:25,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:26,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:35:30,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:35:30,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:35:30,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 11:35:31,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:35:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:35:31,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 11:35:34,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:35:34,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1252766.6666666667, ans=0.0 2023-10-03 11:35:38,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:35:39,759 INFO [train.py:1046] (3/4) Epoch 36, batch 2000, loss[loss=0.1723, simple_loss=0.2625, pruned_loss=0.04101, over 24352.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2409, pruned_loss=0.04103, over 4691241.79 frames. ], batch size: 77, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:35:39,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:35:41,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:35:42,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:35:43,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:35:45,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1252833.3333333333, ans=0.125 2023-10-03 11:35:49,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 11:35:49,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:35:50,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1252833.3333333333, ans=0.2 2023-10-03 11:35:52,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:35:54,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 11:35:56,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:35:56,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:35:56,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1252900.0, ans=0.0 2023-10-03 11:35:58,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:36:00,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 11:36:02,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1252900.0, ans=0.05 2023-10-03 11:36:03,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:03,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1252900.0, ans=0.125 2023-10-03 11:36:04,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:04,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:06,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 11:36:06,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:36:08,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 11:36:08,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:36:11,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:36:12,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 11:36:12,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:12,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:36:13,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:36:13,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 11:36:16,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 11:36:16,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:36:18,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:20,533 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.30 vs. limit=15.0 2023-10-03 11:36:23,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:36:24,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:36:24,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:36:25,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:36:28,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:36:29,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:36:29,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:36:29,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:36:31,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:34,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:36:34,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 11:36:39,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:36:39,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:42,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:43,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:36:46,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:49,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:36:49,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:50,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:36:50,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:36:53,338 INFO [train.py:1046] (3/4) Epoch 36, batch 2050, loss[loss=0.1357, simple_loss=0.212, pruned_loss=0.02967, over 24378.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2393, pruned_loss=0.04095, over 4690373.48 frames. ], batch size: 56, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:36:54,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:36:55,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:36:58,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:36:59,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:37:04,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:37:06,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:37:07,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:37:07,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1253233.3333333333, ans=0.1 2023-10-03 11:37:08,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:37:08,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1253233.3333333333, ans=0.125 2023-10-03 11:37:09,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 11:37:09,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:37:11,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:37:11,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:37:11,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1253233.3333333333, ans=0.0 2023-10-03 11:37:20,198 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.868e+02 2.079e+02 2.374e+02 3.501e+02, threshold=4.157e+02, percent-clipped=0.0 2023-10-03 11:37:20,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:37:20,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:37:21,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 11:37:24,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:37:26,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 11:37:26,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:37:29,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:37:34,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:37:35,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:37:36,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:37:37,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:37:37,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:37:37,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:37:40,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:37:43,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:37:45,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:37:45,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1253366.6666666667, ans=0.125 2023-10-03 11:37:46,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:37:50,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:37:55,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:37:56,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.21 vs. limit=22.5 2023-10-03 11:37:58,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 11:38:03,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:38:03,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1253433.3333333333, ans=0.0 2023-10-03 11:38:04,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:38:06,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:38:07,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 11:38:10,446 INFO [train.py:1046] (3/4) Epoch 36, batch 2100, loss[loss=0.153, simple_loss=0.236, pruned_loss=0.03501, over 24463.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2382, pruned_loss=0.04037, over 4697627.66 frames. ], batch size: 63, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:38:10,550 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 11:38:10,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:10,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:38:11,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:38:14,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:38:14,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 11:38:14,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 11:38:16,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:38:19,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:38:20,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:38:24,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:24,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:38:24,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 11:38:25,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:38:26,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 11:38:26,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 11:38:28,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:38:28,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:38:28,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 11:38:28,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 11:38:33,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 11:38:33,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:38:36,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:38:37,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:38:40,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:38:40,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 11:38:40,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:38:41,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 11:38:43,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 11:38:43,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:44,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 11:38:44,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 11:38:44,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 11:38:47,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:38:48,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1253633.3333333333, ans=0.0 2023-10-03 11:38:49,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:38:49,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1253633.3333333333, ans=0.125 2023-10-03 11:38:52,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:38:53,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 11:38:55,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:38:55,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:38:55,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 11:38:55,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:38:56,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:38:57,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:38:58,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 11:39:01,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 11:39:01,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 11:39:03,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1253700.0, ans=0.1 2023-10-03 11:39:04,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:39:07,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:39:08,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 11:39:11,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:39:14,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:39:14,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:39:14,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:39:14,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 11:39:14,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:39:14,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1253766.6666666667, ans=0.0 2023-10-03 11:39:17,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:39:17,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:39:17,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:39:19,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:39:20,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 11:39:20,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1253766.6666666667, ans=0.09899494936611666 2023-10-03 11:39:22,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 11:39:22,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:39:24,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:39:24,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:39:24,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1253833.3333333333, ans=0.0 2023-10-03 11:39:25,341 INFO [train.py:1046] (3/4) Epoch 36, batch 2150, loss[loss=0.1544, simple_loss=0.243, pruned_loss=0.03284, over 24685.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2378, pruned_loss=0.04009, over 4699496.27 frames. ], batch size: 73, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:39:25,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:39:25,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:39:31,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 11:39:33,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:39:34,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:39:37,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:39:37,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:37,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:39:39,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.30 vs. limit=22.5 2023-10-03 11:39:40,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:39:41,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:39:41,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:39:41,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1253900.0, ans=0.0 2023-10-03 11:39:41,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1253900.0, ans=0.2 2023-10-03 11:39:41,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1253900.0, ans=0.125 2023-10-03 11:39:44,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:45,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 11:39:46,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1253900.0, ans=0.1 2023-10-03 11:39:50,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:39:51,383 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.841e+02 2.012e+02 2.249e+02 3.397e+02, threshold=4.024e+02, percent-clipped=0.0 2023-10-03 11:39:51,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:39:52,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:52,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:39:54,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:39:55,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:39:55,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:39:55,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:39:57,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:39:58,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 11:40:00,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:40:01,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:40:03,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:40:04,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:40:04,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:40:06,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:40:06,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:40:09,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:40:09,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 11:40:09,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 11:40:12,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:40:13,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:13,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:40:14,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:40:16,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:16,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:16,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1254033.3333333333, ans=0.125 2023-10-03 11:40:17,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 11:40:19,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 11:40:19,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:40:20,364 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 11:40:21,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:21,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:40:23,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 11:40:23,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:40:23,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 11:40:23,507 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 11:40:23,508 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 11:40:23,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 11:40:25,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:26,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:40:26,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:40:28,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:28,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 11:40:29,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:29,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1254100.0, ans=0.2 2023-10-03 11:40:37,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:40:37,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 11:40:38,511 INFO [train.py:1046] (3/4) Epoch 36, batch 2200, loss[loss=0.1844, simple_loss=0.2674, pruned_loss=0.05067, over 24422.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2377, pruned_loss=0.04013, over 4704857.53 frames. ], batch size: 77, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:40:40,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:40:44,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:40:45,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:40:45,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:40:46,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:40:47,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.74 vs. limit=8.0 2023-10-03 11:40:48,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:40:49,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:40:49,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 11:40:51,533 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:40:55,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 11:40:58,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:41:04,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 11:41:05,104 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.29 vs. limit=15.0 2023-10-03 11:41:06,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:08,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:41:08,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:41:10,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:41:10,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 11:41:14,232 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.00 vs. limit=15.0 2023-10-03 11:41:15,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:41:15,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:16,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 11:41:19,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:41:22,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:41:22,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:41:24,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:41:26,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 11:41:26,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:41:27,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 11:41:27,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1254366.6666666667, ans=0.2 2023-10-03 11:41:28,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:41:28,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 11:41:28,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:41:31,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:41:32,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:41:32,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:41:32,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:41:33,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:41:33,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:41:36,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:41:40,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 11:41:41,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:41:44,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:41:44,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1254433.3333333333, ans=0.2 2023-10-03 11:41:45,490 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 11:41:47,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1254433.3333333333, ans=0.125 2023-10-03 11:41:48,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:41:48,263 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 11:41:49,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:41:49,633 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 11:41:51,359 INFO [train.py:1046] (3/4) Epoch 36, batch 2250, loss[loss=0.1702, simple_loss=0.2443, pruned_loss=0.04802, over 23277.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2391, pruned_loss=0.04056, over 4712214.40 frames. ], batch size: 119, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:41:52,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:52,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:41:54,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:41:56,771 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 11:41:56,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:42:00,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:42:05,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:42:05,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:42:09,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:42:11,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:42:11,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:42:12,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 11:42:13,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:42:13,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:42:15,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1254566.6666666667, ans=0.125 2023-10-03 11:42:16,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 11:42:16,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:42:16,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:42:17,886 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.895e+02 2.080e+02 2.389e+02 3.595e+02, threshold=4.160e+02, percent-clipped=0.0 2023-10-03 11:42:19,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:42:23,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:42:25,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 11:42:25,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:42:26,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 11:42:28,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:42:31,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:42:35,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:42:36,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:42:38,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:42:38,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:42:38,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1254700.0, ans=0.07 2023-10-03 11:42:40,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:42:42,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:42:46,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:42:47,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 11:42:51,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:42:51,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:42:53,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:42:56,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1254766.6666666667, ans=0.125 2023-10-03 11:42:57,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 11:42:59,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:43:01,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 11:43:01,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:01,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:43:04,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 11:43:06,075 INFO [train.py:1046] (3/4) Epoch 36, batch 2300, loss[loss=0.1625, simple_loss=0.2505, pruned_loss=0.03724, over 24634.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2401, pruned_loss=0.04068, over 4719112.92 frames. ], batch size: 73, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:43:07,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:43:07,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:14,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:14,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:43:18,527 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 11:43:19,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:43:25,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:43:25,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:43:26,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:43:26,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:43:26,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 11:43:26,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:43:29,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:43:30,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:43:35,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:43:37,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:43:40,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:43:43,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:43:44,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:43:47,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:43:47,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1255033.3333333333, ans=0.1 2023-10-03 11:43:50,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:43:51,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:43:54,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:43:54,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:43:54,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 11:43:58,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.36 vs. limit=22.5 2023-10-03 11:43:59,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 11:43:59,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:44:00,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:44:00,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:44:00,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:44:02,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 11:44:02,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 11:44:02,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 11:44:02,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:44:02,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:44:02,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 11:44:03,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.54 vs. limit=15.0 2023-10-03 11:44:08,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:44:11,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:44:14,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:44:15,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:44:15,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:44:16,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1255100.0, ans=0.125 2023-10-03 11:44:17,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:44:17,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:44:17,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:44:18,613 INFO [train.py:1046] (3/4) Epoch 36, batch 2350, loss[loss=0.148, simple_loss=0.2316, pruned_loss=0.03217, over 24546.00 frames. ], tot_loss[loss=0.1613, simple_loss=0.2406, pruned_loss=0.04095, over 4709617.16 frames. ], batch size: 71, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:44:18,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 11:44:22,035 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.74 vs. limit=22.5 2023-10-03 11:44:26,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:44:26,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 11:44:31,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 11:44:31,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1255233.3333333333, ans=0.0 2023-10-03 11:44:35,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:44:38,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:44:39,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:44:39,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:44:39,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:44:40,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 11:44:44,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:44:44,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1255233.3333333333, ans=0.0 2023-10-03 11:44:44,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1255233.3333333333, ans=0.0 2023-10-03 11:44:45,734 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.891e+02 2.082e+02 2.272e+02 3.750e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-03 11:44:48,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 11:44:49,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:44:53,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:44:53,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:44:55,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:44:56,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 11:44:56,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:44:59,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:44:59,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:45:00,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:45:02,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:45:07,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 11:45:07,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:45:10,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:45:10,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:45:11,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 11:45:12,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:45:15,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 11:45:15,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:45:19,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 11:45:22,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 11:45:23,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:45:23,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 11:45:23,514 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 11:45:23,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 11:45:24,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 11:45:28,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:45:28,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1255433.3333333333, ans=0.0 2023-10-03 11:45:32,233 INFO [train.py:1046] (3/4) Epoch 36, batch 2400, loss[loss=0.16, simple_loss=0.2537, pruned_loss=0.03315, over 24277.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2397, pruned_loss=0.04081, over 4710812.71 frames. ], batch size: 74, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:45:32,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:45:37,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:45:39,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:45:39,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 11:45:40,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 11:45:46,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1255566.6666666667, ans=0.125 2023-10-03 11:45:47,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 11:45:47,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:45:50,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 11:45:50,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:45:50,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:45:52,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 11:45:57,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:45:59,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 11:46:02,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:46:07,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 11:46:09,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:46:09,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1255633.3333333333, ans=0.1 2023-10-03 11:46:11,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:46:15,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:46:15,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 11:46:16,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 11:46:22,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:46:23,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:46:28,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:46:29,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:46:29,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 11:46:29,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:46:29,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:46:29,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:46:29,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 11:46:33,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:46:33,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 11:46:33,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1255766.6666666667, ans=0.2 2023-10-03 11:46:34,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 11:46:36,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 11:46:37,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:46:37,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:46:39,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 11:46:39,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1255766.6666666667, ans=0.125 2023-10-03 11:46:40,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 11:46:40,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 11:46:42,257 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 11:46:42,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 11:46:43,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:46:43,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:46:45,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:46:45,215 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 11:46:46,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:46:47,880 INFO [train.py:1046] (3/4) Epoch 36, batch 2450, loss[loss=0.1515, simple_loss=0.2016, pruned_loss=0.05065, over 19016.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2368, pruned_loss=0.04041, over 4687384.01 frames. ], batch size: 389, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:46:47,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 11:46:50,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:46:50,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:46:53,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:46:55,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:46:55,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 11:46:55,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1255833.3333333333, ans=0.1 2023-10-03 11:46:59,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:46:59,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:47:03,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:47:03,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:47:03,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:47:04,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 11:47:07,264 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.60 vs. limit=22.5 2023-10-03 11:47:11,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:47:12,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:47:12,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:47:15,052 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.962e+02 2.115e+02 2.391e+02 5.447e+02, threshold=4.229e+02, percent-clipped=1.0 2023-10-03 11:47:16,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 11:47:16,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:47:17,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:47:17,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:47:20,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 11:47:21,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:47:30,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:47:30,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:47:30,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:47:31,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:47:31,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:47:31,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:47:33,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 11:47:36,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 11:47:38,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:47:41,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:47:41,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:47:45,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:47:45,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 11:47:46,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:47:48,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:47:48,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 11:47:49,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:47:50,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:47:54,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:47:57,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:47:57,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:48:00,573 INFO [train.py:1046] (3/4) Epoch 36, batch 2500, loss[loss=0.16, simple_loss=0.2478, pruned_loss=0.03608, over 24662.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2366, pruned_loss=0.04046, over 4684510.96 frames. ], batch size: 73, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:48:00,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 11:48:02,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 11:48:08,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:48:09,016 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:48:13,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1256166.6666666667, ans=0.0 2023-10-03 11:48:18,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:48:18,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:48:20,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:48:20,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 11:48:25,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:48:25,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:48:27,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:48:27,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 11:48:28,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 11:48:29,253 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.76 vs. limit=15.0 2023-10-03 11:48:29,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:48:29,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:48:29,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 11:48:29,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:48:31,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 11:48:31,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:48:36,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:48:37,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:48:40,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:48:40,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 11:48:42,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:48:44,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:48:48,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:48:51,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:48:54,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:48:59,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 11:49:02,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 11:49:02,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:49:02,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:49:05,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:49:05,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:49:05,862 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 11:49:05,862 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 11:49:05,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 11:49:09,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:49:09,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1256433.3333333333, ans=0.125 2023-10-03 11:49:10,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 11:49:10,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 11:49:10,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:49:10,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 11:49:15,080 INFO [train.py:1046] (3/4) Epoch 36, batch 2550, loss[loss=0.1384, simple_loss=0.2192, pruned_loss=0.02875, over 24445.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2374, pruned_loss=0.04028, over 4702311.60 frames. ], batch size: 58, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 11:49:15,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 11:49:15,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1256500.0, ans=0.125 2023-10-03 11:49:17,872 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.31 vs. limit=6.0 2023-10-03 11:49:18,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:49:19,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:49:21,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:49:22,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:49:24,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 11:49:24,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:49:26,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 11:49:28,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:49:30,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:49:34,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:49:34,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 11:49:34,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:49:34,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:49:34,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:49:35,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1256566.6666666667, ans=0.125 2023-10-03 11:49:37,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:49:38,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 11:49:38,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 11:49:38,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:49:38,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 11:49:43,473 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.881e+02 2.105e+02 2.315e+02 3.420e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-03 11:49:43,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1256633.3333333333, ans=0.0 2023-10-03 11:49:49,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:49:56,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:49:56,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:49:56,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:49:57,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 11:50:03,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:50:03,941 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-03 11:50:06,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 11:50:06,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:50:06,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:50:06,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=1256700.0, ans=0.025 2023-10-03 11:50:07,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 11:50:07,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:50:13,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:50:13,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:50:17,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:50:17,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 11:50:17,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:50:17,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1256766.6666666667, ans=0.0 2023-10-03 11:50:19,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:50:19,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 11:50:20,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:50:22,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:50:27,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:50:29,095 INFO [train.py:1046] (3/4) Epoch 36, batch 2600, loss[loss=0.1775, simple_loss=0.2594, pruned_loss=0.04784, over 24404.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2389, pruned_loss=0.04026, over 4719503.75 frames. ], batch size: 77, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:50:30,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:50:31,929 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 11:50:32,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1256833.3333333333, ans=0.025 2023-10-03 11:50:36,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 11:50:36,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:50:36,095 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 11:50:37,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 11:50:37,934 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 11:50:38,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1256833.3333333333, ans=0.1 2023-10-03 11:50:39,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:50:39,575 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 11:50:40,299 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.24 vs. limit=15.0 2023-10-03 11:50:40,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 11:50:42,867 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 11:50:43,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:50:44,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 11:50:45,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1256900.0, ans=0.125 2023-10-03 11:50:46,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 11:50:48,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 11:50:48,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 11:50:51,241 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 11:50:52,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 11:50:58,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:50:58,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:50:58,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:50:58,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 11:50:59,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 11:51:03,786 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 11:51:06,360 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.10 vs. limit=15.0 2023-10-03 11:51:08,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1256966.6666666667, ans=0.125 2023-10-03 11:51:09,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:51:10,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:11,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 11:51:11,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:51:11,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:51:11,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1256966.6666666667, ans=0.125 2023-10-03 11:51:12,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 11:51:14,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1257033.3333333333, ans=0.0 2023-10-03 11:51:16,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:51:17,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:51:18,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:51:21,651 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 11:51:21,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:51:22,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 11:51:25,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:51:27,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 11:51:27,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 11:51:27,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:51:28,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:51:30,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:51:34,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 11:51:35,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:37,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:51:40,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 11:51:42,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:43,355 INFO [train.py:1046] (3/4) Epoch 36, batch 2650, loss[loss=0.1423, simple_loss=0.2238, pruned_loss=0.03036, over 23712.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2395, pruned_loss=0.04053, over 4723226.00 frames. ], batch size: 149, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:51:43,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 11:51:43,462 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 11:51:43,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:51:46,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:51:48,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 11:51:48,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1257166.6666666667, ans=0.125 2023-10-03 11:51:49,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:51:51,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:51:52,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 11:51:53,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:51:53,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:51:57,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 11:51:57,849 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 11:52:00,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:52:02,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 11:52:02,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:03,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 11:52:08,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:52:08,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 11:52:09,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:52:09,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:11,329 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.883e+02 2.080e+02 2.401e+02 3.232e+02, threshold=4.161e+02, percent-clipped=0.0 2023-10-03 11:52:14,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 11:52:14,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 11:52:15,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1257300.0, ans=0.125 2023-10-03 11:52:16,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:52:22,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 11:52:22,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:52:23,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:23,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:52:23,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:52:24,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:52:26,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:52:27,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:52:28,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:52:30,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:52:31,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:52:31,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1257366.6666666667, ans=0.1 2023-10-03 11:52:33,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:35,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:52:36,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:38,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:52:39,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 11:52:42,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:43,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:52:43,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:52:43,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 11:52:47,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1257433.3333333333, ans=0.125 2023-10-03 11:52:47,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1257433.3333333333, ans=0.125 2023-10-03 11:52:48,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:52:49,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:49,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:52:51,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:52:53,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 11:52:53,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:52:55,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:52:55,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 11:52:55,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1257500.0, ans=0.2 2023-10-03 11:52:56,982 INFO [train.py:1046] (3/4) Epoch 36, batch 2700, loss[loss=0.1564, simple_loss=0.2404, pruned_loss=0.03627, over 24482.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2398, pruned_loss=0.04048, over 4727493.86 frames. ], batch size: 63, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:52:57,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1257500.0, ans=0.05 2023-10-03 11:52:58,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:53:01,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 11:53:02,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:53:02,881 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:53:03,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:03,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:05,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:53:05,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:53:05,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 11:53:05,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 11:53:05,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 11:53:07,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:53:10,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:53:10,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:53:10,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:53:12,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 11:53:13,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 11:53:14,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:53:16,742 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.03 vs. limit=15.0 2023-10-03 11:53:20,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 11:53:20,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:53:26,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:53:26,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:53:28,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:53:28,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:53:30,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:53:32,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:53:32,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:53:32,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:53:36,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:36,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 11:53:37,328 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:53:45,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1257700.0, ans=10.0 2023-10-03 11:53:46,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:53:46,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:53:46,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1257700.0, ans=0.125 2023-10-03 11:53:52,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:53:52,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:53:53,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1257700.0, ans=0.09899494936611666 2023-10-03 11:53:55,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:56,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:53:57,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:53:58,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1257766.6666666667, ans=0.125 2023-10-03 11:53:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:53:59,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:53:59,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:54:01,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:54:03,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:54:03,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:54:06,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 11:54:08,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:54:08,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1257766.6666666667, ans=0.125 2023-10-03 11:54:10,713 INFO [train.py:1046] (3/4) Epoch 36, batch 2750, loss[loss=0.1753, simple_loss=0.2606, pruned_loss=0.04502, over 24410.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2399, pruned_loss=0.04049, over 4725773.27 frames. ], batch size: 77, lr: 2.83e-03, grad_scale: 4.0 2023-10-03 11:54:10,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:54:10,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 11:54:12,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1257833.3333333333, ans=0.125 2023-10-03 11:54:13,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 11:54:13,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:54:15,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:16,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:54:16,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1257833.3333333333, ans=0.0 2023-10-03 11:54:17,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:17,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 11:54:18,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:23,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:54:23,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 11:54:23,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:54:23,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:23,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 11:54:23,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:54:23,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:54:30,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 11:54:30,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:54:31,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:31,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:54:31,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 11:54:33,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:54:35,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:54:35,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:36,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:39,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 11:54:39,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 11:54:40,468 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.983e+02 2.239e+02 2.641e+02 5.389e+02, threshold=4.478e+02, percent-clipped=1.0 2023-10-03 11:54:40,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 11:54:41,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:42,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:54:47,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:54:51,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 11:54:51,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:54:54,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:54:54,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:54:56,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 11:55:00,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 11:55:00,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 11:55:00,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 11:55:05,989 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.15 vs. limit=10.0 2023-10-03 11:55:06,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:55:08,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 11:55:14,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 11:55:16,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:55:16,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 11:55:17,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:55:18,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 11:55:18,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 11:55:18,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:55:19,472 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.12 vs. limit=22.5 2023-10-03 11:55:22,470 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:55:24,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 11:55:25,406 INFO [train.py:1046] (3/4) Epoch 36, batch 2800, loss[loss=0.1688, simple_loss=0.2443, pruned_loss=0.04667, over 23232.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.239, pruned_loss=0.04006, over 4732205.16 frames. ], batch size: 105, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:55:25,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:55:25,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:55:25,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 11:55:25,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:55:27,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:55:28,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:55:28,805 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 11:55:28,806 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 11:55:31,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:55:34,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:55:35,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:55:38,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:55:41,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 11:55:43,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 11:55:44,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 11:55:46,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:55:46,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:55:46,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:55:50,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:55:51,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:55:51,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:55:51,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:56:01,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:56:02,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:56:05,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:05,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:56:05,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:56:09,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1258366.6666666667, ans=0.125 2023-10-03 11:56:11,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:56:11,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 11:56:11,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:56:11,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1258366.6666666667, ans=0.2 2023-10-03 11:56:12,475 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.26 vs. limit=12.0 2023-10-03 11:56:12,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:56:12,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 11:56:15,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:56:16,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.29 vs. limit=6.0 2023-10-03 11:56:17,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:20,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1258366.6666666667, ans=0.0 2023-10-03 11:56:21,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:56:22,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:56:24,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:24,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 11:56:24,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 11:56:24,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 11:56:24,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:56:24,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 11:56:26,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:56:28,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:56:28,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:56:28,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 11:56:29,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:56:29,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:56:29,777 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:56:30,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:56:32,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 11:56:36,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:56:36,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 11:56:37,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:56:39,667 INFO [train.py:1046] (3/4) Epoch 36, batch 2850, loss[loss=0.1505, simple_loss=0.2196, pruned_loss=0.04074, over 23476.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2384, pruned_loss=0.03981, over 4733833.91 frames. ], batch size: 285, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:56:41,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:56:45,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:56:46,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:56:46,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:56:48,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:56:49,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:56:50,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:56:52,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 11:56:55,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1258566.6666666667, ans=0.125 2023-10-03 11:56:55,802 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:56:58,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 11:56:58,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:01,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 11:57:02,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:04,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 11:57:04,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 11:57:05,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:08,331 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.821e+02 1.991e+02 2.148e+02 3.259e+02, threshold=3.982e+02, percent-clipped=0.0 2023-10-03 11:57:17,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:57:17,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:57:17,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 11:57:19,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 11:57:19,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 11:57:19,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 11:57:21,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 11:57:21,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 11:57:24,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 11:57:24,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:57:25,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:57:25,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:27,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1258700.0, ans=0.1 2023-10-03 11:57:29,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:57:29,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:57:31,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:32,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 11:57:34,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:57:35,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:35,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:38,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 11:57:39,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1258766.6666666667, ans=0.0 2023-10-03 11:57:42,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1258766.6666666667, ans=0.1 2023-10-03 11:57:44,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:57:45,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 11:57:45,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 11:57:46,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 11:57:47,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:57:47,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 11:57:48,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:57:49,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:57:49,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:57:49,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 11:57:49,740 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 11:57:51,064 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 11:57:51,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:57:51,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:57:52,536 INFO [train.py:1046] (3/4) Epoch 36, batch 2900, loss[loss=0.1688, simple_loss=0.2401, pruned_loss=0.04873, over 22769.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2382, pruned_loss=0.03968, over 4729496.98 frames. ], batch size: 322, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:57:53,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:57:53,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:57:54,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:57:55,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 11:57:59,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:57:59,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 11:58:01,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 11:58:02,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 11:58:02,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:58:02,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1258833.3333333333, ans=0.07 2023-10-03 11:58:03,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:58:05,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 11:58:09,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 11:58:09,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:58:11,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1258900.0, ans=0.2 2023-10-03 11:58:12,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 11:58:12,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 11:58:12,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 11:58:14,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:58:15,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 11:58:15,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 11:58:19,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:58:19,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 11:58:19,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:58:22,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:58:22,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 11:58:25,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:58:25,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:58:27,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1258966.6666666667, ans=0.125 2023-10-03 11:58:30,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:58:31,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:58:33,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 11:58:33,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 11:58:33,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 11:58:36,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 11:58:39,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 11:58:40,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 11:58:46,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 11:58:48,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1259033.3333333333, ans=0.125 2023-10-03 11:58:51,504 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.15 vs. limit=10.0 2023-10-03 11:58:52,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 11:58:53,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 11:58:53,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 11:58:58,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:58:58,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 11:58:58,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:59:00,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 11:59:04,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 11:59:05,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 11:59:07,056 INFO [train.py:1046] (3/4) Epoch 36, batch 2950, loss[loss=0.167, simple_loss=0.241, pruned_loss=0.04649, over 23851.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2393, pruned_loss=0.03991, over 4710049.08 frames. ], batch size: 179, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 11:59:07,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:59:07,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:59:08,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:59:10,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 11:59:13,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 11:59:13,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 11:59:14,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 11:59:14,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 11:59:14,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1259166.6666666667, ans=0.1 2023-10-03 11:59:16,397 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 11:59:20,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:59:21,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:59:22,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1259233.3333333333, ans=0.05 2023-10-03 11:59:23,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 11:59:23,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:59:26,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 11:59:26,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 11:59:29,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:59:29,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 11:59:29,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 11:59:32,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 11:59:33,530 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.67 vs. limit=15.0 2023-10-03 11:59:36,562 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.912e+02 2.122e+02 2.362e+02 3.535e+02, threshold=4.243e+02, percent-clipped=0.0 2023-10-03 11:59:36,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 11:59:36,698 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 11:59:37,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 11:59:39,385 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 11:59:41,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 11:59:41,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 11:59:41,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1259300.0, ans=0.1 2023-10-03 11:59:42,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 11:59:42,825 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 11:59:42,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 11:59:45,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 11:59:45,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 11:59:46,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 11:59:48,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1259300.0, ans=0.05 2023-10-03 11:59:49,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:59:51,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 11:59:52,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 11:59:52,529 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 11:59:53,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 11:59:53,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 11:59:59,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:00:01,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:00:03,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 12:00:03,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:00:04,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 12:00:07,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:00:08,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:00:10,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:00:11,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:00:12,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 12:00:14,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:00:15,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:00:15,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:00:15,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:00:17,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:00:17,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:00:18,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:00:18,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 12:00:20,329 INFO [train.py:1046] (3/4) Epoch 36, batch 3000, loss[loss=0.1372, simple_loss=0.2212, pruned_loss=0.02658, over 24336.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2394, pruned_loss=0.0398, over 4722861.87 frames. ], batch size: 61, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:00:20,330 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 12:00:31,799 INFO [train.py:1078] (3/4) Epoch 36, validation: loss=0.3578, simple_loss=0.2691, pruned_loss=0.2232, over 1125622.00 frames. 2023-10-03 12:00:31,800 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 12:00:31,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:00:33,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:00:34,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:00:38,528 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 12:00:38,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 12:00:41,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:00:42,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:00:42,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 12:00:42,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:00:49,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:00:58,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:01:03,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.81 vs. limit=15.0 2023-10-03 12:01:06,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 12:01:08,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:01:09,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:01:11,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:01:11,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:01:11,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1259633.3333333333, ans=0.125 2023-10-03 12:01:12,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:01:12,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 12:01:14,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 12:01:15,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:01:17,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:01:18,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:01:18,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:01:20,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:20,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:01:22,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:01:22,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:01:22,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:01:24,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:01:27,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 12:01:29,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:01:29,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:01:29,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:01:33,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:35,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:36,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 12:01:37,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 12:01:37,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:01:37,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 12:01:38,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:01:41,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 12:01:42,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:01:42,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1259766.6666666667, ans=0.125 2023-10-03 12:01:43,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:01:44,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 12:01:45,287 INFO [train.py:1046] (3/4) Epoch 36, batch 3050, loss[loss=0.1483, simple_loss=0.229, pruned_loss=0.03381, over 24599.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2401, pruned_loss=0.04013, over 4723974.16 frames. ], batch size: 60, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:01:45,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 12:01:45,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 12:01:45,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:01:47,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:01:47,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:01:47,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:01:47,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:01:51,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 12:01:52,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:01:54,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:01:55,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:01:56,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.49 vs. limit=15.0 2023-10-03 12:01:58,080 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=12.0 2023-10-03 12:01:58,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:01:59,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 12:02:05,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1259900.0, ans=0.125 2023-10-03 12:02:06,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 12:02:06,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 12:02:06,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:09,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:02:12,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:02:12,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:02:12,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:02:13,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.83 vs. limit=12.0 2023-10-03 12:02:15,670 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.931e+02 2.120e+02 2.392e+02 4.197e+02, threshold=4.241e+02, percent-clipped=0.0 2023-10-03 12:02:15,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:02:15,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:02:17,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:02:17,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:02:17,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:02:19,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:02:21,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:23,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:02:24,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 12:02:24,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:02:24,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:02:28,188 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.67 vs. limit=10.0 2023-10-03 12:02:28,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:02:28,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:02:30,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:02:30,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:35,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:02:35,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:41,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1260033.3333333333, ans=0.2 2023-10-03 12:02:41,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1260033.3333333333, ans=0.125 2023-10-03 12:02:42,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:43,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:02:43,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:02:45,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:02:47,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:02:48,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:02:48,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 12:02:49,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:02:49,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:02:51,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 12:02:52,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:55,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1260100.0, ans=0.2 2023-10-03 12:02:58,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:02:58,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1260166.6666666667, ans=0.2 2023-10-03 12:02:59,599 INFO [train.py:1046] (3/4) Epoch 36, batch 3100, loss[loss=0.1604, simple_loss=0.2402, pruned_loss=0.04026, over 23346.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2406, pruned_loss=0.04047, over 4721709.19 frames. ], batch size: 93, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:02:59,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:03:02,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:03:05,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 12:03:07,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 12:03:08,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 12:03:08,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:03:11,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:03:11,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:13,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 12:03:16,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:22,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 12:03:26,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:03:27,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:29,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:03:29,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:03:30,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 12:03:32,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:03:32,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 12:03:32,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:03:34,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:35,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 12:03:35,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:03:35,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1260300.0, ans=0.125 2023-10-03 12:03:39,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:03:41,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 12:03:43,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 12:03:44,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:45,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:03:47,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:03:47,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:49,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:03:49,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:03:49,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:03:52,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:03:52,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:03:52,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:03:52,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:03:57,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:03:57,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 12:04:00,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:04:01,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 12:04:01,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:01,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:04:01,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 12:04:11,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 12:04:11,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1260433.3333333333, ans=0.125 2023-10-03 12:04:14,235 INFO [train.py:1046] (3/4) Epoch 36, batch 3150, loss[loss=0.1831, simple_loss=0.2659, pruned_loss=0.05011, over 24047.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2389, pruned_loss=0.0402, over 4715978.73 frames. ], batch size: 80, lr: 2.83e-03, grad_scale: 8.0 2023-10-03 12:04:14,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:14,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:04:17,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:04:17,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:04:17,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 12:04:17,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:19,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:04:19,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 12:04:21,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:24,632 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 12:04:27,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 12:04:28,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:04:28,843 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 12:04:30,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 12:04:31,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 12:04:31,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 12:04:31,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 12:04:31,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:31,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:04:34,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:04:35,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 12:04:36,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:37,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:04:37,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:04:41,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:04:42,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 12:04:44,296 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.839e+02 2.057e+02 2.261e+02 3.088e+02, threshold=4.113e+02, percent-clipped=0.0 2023-10-03 12:04:44,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:04:45,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:04:45,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:04:47,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 12:04:49,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 12:04:50,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:04:50,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:04:50,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:04:52,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:04:52,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:04:53,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:04:53,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:04:54,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 12:04:55,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:04:55,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:04:57,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1260700.0, ans=0.125 2023-10-03 12:04:58,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:04:58,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:05:00,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 12:05:01,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:05:02,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 12:05:04,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:04,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 12:05:05,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 12:05:07,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:05:08,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:05:10,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 12:05:12,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 12:05:12,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:05:16,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:05:16,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:16,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:05:21,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:05:22,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:24,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 12:05:28,506 INFO [train.py:1046] (3/4) Epoch 36, batch 3200, loss[loss=0.1734, simple_loss=0.2546, pruned_loss=0.04616, over 24096.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2381, pruned_loss=0.03979, over 4719930.30 frames. ], batch size: 80, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 12:05:28,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:05:28,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 12:05:32,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:34,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:05:34,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 12:05:36,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:05:41,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:05:44,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:05:52,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:05:55,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1260900.0, ans=0.0 2023-10-03 12:06:00,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 12:06:02,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:06:03,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 12:06:03,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:06:05,727 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:06:08,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:06:08,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:06:09,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:06:13,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 12:06:15,903 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-10-03 12:06:16,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 12:06:18,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 12:06:19,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 12:06:23,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:06:28,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:06:28,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:06:28,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1261100.0, ans=0.125 2023-10-03 12:06:29,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:06:29,954 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 12:06:29,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:06:34,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:06:34,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 12:06:36,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 12:06:36,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 12:06:38,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 12:06:39,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:06:41,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1261166.6666666667, ans=0.09899494936611666 2023-10-03 12:06:42,439 INFO [train.py:1046] (3/4) Epoch 36, batch 3250, loss[loss=0.1529, simple_loss=0.2438, pruned_loss=0.03101, over 24662.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2384, pruned_loss=0.03965, over 4725386.72 frames. ], batch size: 68, lr: 2.83e-03, grad_scale: 16.0 2023-10-03 12:06:42,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 12:06:42,562 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 12:06:42,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:06:42,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:06:43,983 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 12:06:49,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:06:51,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:07:00,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:07:00,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1261233.3333333333, ans=0.0 2023-10-03 12:07:01,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 12:07:01,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:07:03,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:07:03,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:07:03,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:07:04,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:07:06,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:06,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:07:06,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:07:06,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:06,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:08,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:07:11,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:07:12,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:07:14,172 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 1.976e+02 2.164e+02 2.550e+02 4.020e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-03 12:07:14,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:07:14,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:07:17,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:07:17,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:07:17,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:07:21,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 12:07:23,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:07:23,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:07:24,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:07:26,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:07:30,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1261366.6666666667, ans=0.2 2023-10-03 12:07:31,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:07:37,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:07:39,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:07:39,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 12:07:39,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:07:39,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 12:07:39,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:07:42,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 12:07:42,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 12:07:43,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:07:43,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:07:44,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:07:45,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 12:07:45,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:07:48,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:07:48,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1261433.3333333333, ans=0.1 2023-10-03 12:07:48,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1261433.3333333333, ans=0.0 2023-10-03 12:07:49,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:07:49,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 12:07:51,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:07:53,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:07:53,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 12:07:57,711 INFO [train.py:1046] (3/4) Epoch 36, batch 3300, loss[loss=0.1754, simple_loss=0.2503, pruned_loss=0.05029, over 23643.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2394, pruned_loss=0.04014, over 4716673.86 frames. ], batch size: 232, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:07:57,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:07:57,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 12:07:59,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1261500.0, ans=0.125 2023-10-03 12:08:00,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 12:08:01,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 12:08:01,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:08:02,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1261500.0, ans=0.125 2023-10-03 12:08:04,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:08:05,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:08:06,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1261500.0, ans=0.125 2023-10-03 12:08:07,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:08,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 12:08:08,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:08:11,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:11,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:08:16,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 12:08:17,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:08:17,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:17,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:19,157 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 12:08:19,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:08:20,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:08:21,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:08:21,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:08:21,825 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 12:08:26,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:08:26,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:08:27,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:27,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 12:08:27,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 12:08:29,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:30,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:08:33,414 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 12:08:34,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 12:08:34,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:08:36,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 12:08:37,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:08:41,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:08:43,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:08:44,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:08:44,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:44,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:08:44,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:08:47,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:08:47,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:48,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:08:48,618 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 12:08:51,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 12:08:52,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 12:08:52,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:08:52,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:08:54,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:08:54,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:08:54,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1261766.6666666667, ans=0.1 2023-10-03 12:08:55,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:08:55,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:08:55,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 12:08:57,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:08:58,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:09:00,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1261766.6666666667, ans=0.0 2023-10-03 12:09:01,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 12:09:01,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:02,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:05,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:09:05,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:09:07,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:09:09,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:09:09,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:10,284 INFO [train.py:1046] (3/4) Epoch 36, batch 3350, loss[loss=0.1684, simple_loss=0.2419, pruned_loss=0.04743, over 23432.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2397, pruned_loss=0.04052, over 4723580.56 frames. ], batch size: 285, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:09:12,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:09:13,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:15,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:09:17,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1261833.3333333333, ans=0.0 2023-10-03 12:09:19,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:20,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:09:23,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:09:23,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:09:24,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 12:09:25,091 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 12:09:25,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:09:28,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 12:09:28,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 12:09:30,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:09:30,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:09:30,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1261900.0, ans=0.1 2023-10-03 12:09:31,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:09:31,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 12:09:33,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:33,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:09:35,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:35,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:35,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:37,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:09:40,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:09:41,973 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.875e+02 2.056e+02 2.286e+02 3.351e+02, threshold=4.113e+02, percent-clipped=0.0 2023-10-03 12:09:42,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:42,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:09:46,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:09:48,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:09:49,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1261966.6666666667, ans=0.125 2023-10-03 12:09:50,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:09:50,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:09:52,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1261966.6666666667, ans=0.125 2023-10-03 12:09:53,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:09:55,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 12:09:55,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 12:09:55,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 12:09:55,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:09:55,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 12:09:58,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:09:58,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:10:05,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:10:07,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 12:10:07,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:10:08,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:10:08,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:10:11,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1262100.0, ans=0.125 2023-10-03 12:10:15,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:10:16,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 12:10:17,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:10:17,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:10:19,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:10:19,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 12:10:19,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1262100.0, ans=0.95 2023-10-03 12:10:20,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:10:20,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 12:10:23,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:10:24,774 INFO [train.py:1046] (3/4) Epoch 36, batch 3400, loss[loss=0.1627, simple_loss=0.2553, pruned_loss=0.03504, over 24634.00 frames. ], tot_loss[loss=0.1615, simple_loss=0.2406, pruned_loss=0.04119, over 4714403.57 frames. ], batch size: 73, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:10:24,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:10:26,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:10:27,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:10:27,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 12:10:33,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 12:10:33,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1262166.6666666667, ans=0.2 2023-10-03 12:10:34,895 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 12:10:34,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:10:38,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:10:38,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:10:39,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:10:39,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:10:42,412 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.74 vs. limit=6.0 2023-10-03 12:10:47,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:10:48,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 12:10:52,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:10:53,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:10:55,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:10:55,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 12:11:00,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:11:02,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1262300.0, ans=0.0 2023-10-03 12:11:04,625 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.47 vs. limit=15.0 2023-10-03 12:11:05,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 12:11:11,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:11:12,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:11:12,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 12:11:14,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:11:14,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:11:15,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:11:16,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:11:18,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1262366.6666666667, ans=0.1 2023-10-03 12:11:19,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:11:22,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:11:22,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:11:25,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1262433.3333333333, ans=0.0 2023-10-03 12:11:26,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:11:29,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 12:11:33,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:11:37,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 12:11:39,317 INFO [train.py:1046] (3/4) Epoch 36, batch 3450, loss[loss=0.1775, simple_loss=0.2572, pruned_loss=0.04895, over 23451.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.2408, pruned_loss=0.04082, over 4722489.59 frames. ], batch size: 93, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:11:40,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 12:11:40,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:11:42,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:11:42,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 12:11:44,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:11:45,021 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.23 vs. limit=6.0 2023-10-03 12:11:46,216 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.26 vs. limit=15.0 2023-10-03 12:11:47,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:11:53,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:11:53,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:11:53,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1262566.6666666667, ans=0.2 2023-10-03 12:11:54,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:11:54,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:11:54,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1262566.6666666667, ans=0.0 2023-10-03 12:11:56,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:12:01,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 12:12:03,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1262566.6666666667, ans=0.1 2023-10-03 12:12:06,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1262566.6666666667, ans=0.1 2023-10-03 12:12:08,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 12:12:08,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:12:09,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:12:10,547 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.861e+02 2.014e+02 2.167e+02 2.671e+02, threshold=4.028e+02, percent-clipped=0.0 2023-10-03 12:12:10,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:12:15,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 12:12:16,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:12:20,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:12:20,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:12:22,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:12:24,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:12:24,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1262700.0, ans=0.0 2023-10-03 12:12:25,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 12:12:25,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:12:26,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:12:29,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:12:31,635 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.81 vs. limit=10.0 2023-10-03 12:12:32,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 12:12:35,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:12:36,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1262700.0, ans=0.125 2023-10-03 12:12:41,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:12:41,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:12:43,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1262766.6666666667, ans=0.125 2023-10-03 12:12:46,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:12:49,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:12:49,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:12:51,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:12:51,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:12:53,925 INFO [train.py:1046] (3/4) Epoch 36, batch 3500, loss[loss=0.1645, simple_loss=0.251, pruned_loss=0.03898, over 24026.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2399, pruned_loss=0.0401, over 4726628.24 frames. ], batch size: 80, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:12:55,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:12:58,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:12:59,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 12:13:02,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:13:03,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1262833.3333333333, ans=0.2 2023-10-03 12:13:04,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:13:08,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:13:08,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 12:13:13,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:13:14,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:13:14,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:13:16,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:13:16,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:13:17,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:17,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:13:17,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 12:13:20,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:20,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:13:23,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:13:26,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:26,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 12:13:26,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:13:29,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:13:30,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:13:31,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:33,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:13:33,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:13:34,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 12:13:36,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 12:13:36,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 12:13:38,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:13:39,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:39,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:13:40,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:13:44,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 12:13:45,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:13:45,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1263033.3333333333, ans=0.125 2023-10-03 12:13:50,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:13:51,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 12:13:51,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 12:13:51,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:13:53,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:13:53,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:13:56,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:13:57,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 12:13:59,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:14:00,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:14:01,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 12:14:03,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 12:14:06,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:14:07,927 INFO [train.py:1046] (3/4) Epoch 36, batch 3550, loss[loss=0.1629, simple_loss=0.2486, pruned_loss=0.03861, over 24316.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2382, pruned_loss=0.03959, over 4709436.64 frames. ], batch size: 77, lr: 2.82e-03, grad_scale: 8.0 2023-10-03 12:14:08,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:14:08,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:08,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:10,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:14:14,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1263166.6666666667, ans=0.125 2023-10-03 12:14:18,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:20,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 12:14:23,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:14:23,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:14:26,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:27,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:14:27,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:14:30,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:14:30,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:14:32,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:32,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 12:14:32,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:14:37,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:14:37,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:14:39,161 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.892e+02 2.093e+02 2.381e+02 3.257e+02, threshold=4.186e+02, percent-clipped=0.0 2023-10-03 12:14:39,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:14:39,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:14:39,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:14:40,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 12:14:40,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:42,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:14:43,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1263300.0, ans=0.125 2023-10-03 12:14:45,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 12:14:48,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:14:49,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:14:51,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:14:52,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 12:14:52,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:14:54,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 12:14:54,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:14:57,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:14:57,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:14:59,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1263366.6666666667, ans=0.125 2023-10-03 12:15:00,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1263366.6666666667, ans=0.125 2023-10-03 12:15:01,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 12:15:03,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:15:05,638 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.21 vs. limit=22.5 2023-10-03 12:15:09,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:15:10,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 12:15:10,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:15:10,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1263433.3333333333, ans=0.95 2023-10-03 12:15:14,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:15:16,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 12:15:18,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.73 vs. limit=15.0 2023-10-03 12:15:22,073 INFO [train.py:1046] (3/4) Epoch 36, batch 3600, loss[loss=0.1574, simple_loss=0.2416, pruned_loss=0.03662, over 24660.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2381, pruned_loss=0.03924, over 4724548.22 frames. ], batch size: 68, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:15:23,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 12:15:23,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:15:23,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:15:23,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1263500.0, ans=0.0 2023-10-03 12:15:25,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:15:26,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:15:27,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:15:28,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1263500.0, ans=0.0 2023-10-03 12:15:29,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:15:31,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:34,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:15:34,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:15:34,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:34,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 12:15:37,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:15:37,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:40,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1263566.6666666667, ans=0.025 2023-10-03 12:15:41,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:15:41,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1263566.6666666667, ans=0.125 2023-10-03 12:15:42,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:15:44,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:15:45,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:15:45,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 12:15:46,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:15:48,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:15:51,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:15:51,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1263633.3333333333, ans=0.125 2023-10-03 12:15:51,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1263633.3333333333, ans=10.0 2023-10-03 12:15:52,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-10-03 12:15:53,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:15:54,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:15:55,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:15:57,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 12:16:02,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1263633.3333333333, ans=0.125 2023-10-03 12:16:03,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:16:04,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:16:05,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 12:16:09,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:16:14,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:16:19,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:16:25,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:16:25,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:16:25,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 12:16:26,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 12:16:28,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 12:16:30,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:16:30,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:16:31,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 12:16:32,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:16:32,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:16:32,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:16:33,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 12:16:33,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1263766.6666666667, ans=0.125 2023-10-03 12:16:34,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 12:16:35,444 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.95 vs. limit=15.0 2023-10-03 12:16:35,675 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.14 vs. limit=22.5 2023-10-03 12:16:36,406 INFO [train.py:1046] (3/4) Epoch 36, batch 3650, loss[loss=0.1554, simple_loss=0.2474, pruned_loss=0.03169, over 24633.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2388, pruned_loss=0.03902, over 4737544.35 frames. ], batch size: 73, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:16:37,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:16:39,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 12:16:43,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 12:16:44,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:16:49,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 12:16:52,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 12:16:54,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:16:54,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:16:56,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:16:58,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 12:16:58,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:16:59,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 12:17:01,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:17:01,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:17:01,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 12:17:04,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 12:17:04,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:17:04,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:17:07,637 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.919e+02 2.181e+02 2.473e+02 3.454e+02, threshold=4.361e+02, percent-clipped=0.0 2023-10-03 12:17:07,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:17:09,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 12:17:10,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 12:17:10,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:17:13,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 12:17:15,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:17:15,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:17:15,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.23 vs. limit=15.0 2023-10-03 12:17:19,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:17:20,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:17:20,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:17:21,036 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:17:22,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:17:25,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:17:28,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:17:29,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:17:31,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:17:31,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:17:31,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:17:33,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:17:34,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:17:34,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1264100.0, ans=0.1 2023-10-03 12:17:40,268 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 12:17:44,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:17:44,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:17:44,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:17:45,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:17:45,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:17:47,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:17:48,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 12:17:48,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:17:50,163 INFO [train.py:1046] (3/4) Epoch 36, batch 3700, loss[loss=0.1927, simple_loss=0.2631, pruned_loss=0.06119, over 19220.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2397, pruned_loss=0.03964, over 4736929.54 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:17:50,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:17:51,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:17:52,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:17:56,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:17:56,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 12:17:56,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:17:58,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:17:58,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:18:04,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:18:06,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:18:08,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:18:09,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:18:09,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:18:09,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 12:18:13,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:18:13,230 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 12:18:15,038 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.35 vs. limit=22.5 2023-10-03 12:18:20,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:18:20,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:18:20,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1264300.0, ans=0.125 2023-10-03 12:18:21,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:18:21,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 12:18:22,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:18:24,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:18:25,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 12:18:27,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:18:27,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:18:31,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:18:31,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:18:34,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:18:34,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1264366.6666666667, ans=0.0 2023-10-03 12:18:37,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1264366.6666666667, ans=0.125 2023-10-03 12:18:38,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:18:38,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 12:18:38,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1264366.6666666667, ans=0.2 2023-10-03 12:18:40,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:18:40,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 12:18:44,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:18:44,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:18:46,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:18:47,077 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.76 vs. limit=15.0 2023-10-03 12:18:47,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 12:18:49,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:18:50,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:18:50,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:18:50,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:18:51,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1264433.3333333333, ans=0.09899494936611666 2023-10-03 12:18:55,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:18:55,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 12:18:57,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 12:18:57,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:18:57,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:19:00,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:19:02,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:19:05,619 INFO [train.py:1046] (3/4) Epoch 36, batch 3750, loss[loss=0.1476, simple_loss=0.2213, pruned_loss=0.03694, over 24453.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2406, pruned_loss=0.04045, over 4725694.22 frames. ], batch size: 58, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:19:05,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:19:07,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:19:07,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:19:08,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 12:19:10,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 12:19:10,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1264500.0, ans=0.125 2023-10-03 12:19:11,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:19:11,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1264500.0, ans=0.125 2023-10-03 12:19:13,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 12:19:13,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:19:14,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:19:16,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:19:16,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1264500.0, ans=15.0 2023-10-03 12:19:17,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:19:20,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:19:23,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1264566.6666666667, ans=0.0 2023-10-03 12:19:24,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:19:24,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:19:27,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:19:28,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1264566.6666666667, ans=0.2 2023-10-03 12:19:29,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1264566.6666666667, ans=0.2 2023-10-03 12:19:30,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:19:30,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 12:19:30,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1264566.6666666667, ans=0.0 2023-10-03 12:19:32,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:19:34,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:19:35,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:19:37,203 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.905e+02 2.182e+02 2.625e+02 6.484e+02, threshold=4.364e+02, percent-clipped=1.0 2023-10-03 12:19:38,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 12:19:41,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 12:19:44,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:19:44,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:19:46,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:19:50,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:19:51,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 12:19:53,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1264700.0, ans=0.0 2023-10-03 12:19:55,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 12:19:58,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:20:02,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:20:02,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:20:02,940 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=13.65 vs. limit=15.0 2023-10-03 12:20:05,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:20:07,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 12:20:08,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:20:11,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:20:11,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1264766.6666666667, ans=0.05 2023-10-03 12:20:12,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:20:12,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1264766.6666666667, ans=0.125 2023-10-03 12:20:13,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1264766.6666666667, ans=0.0 2023-10-03 12:20:15,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:20:19,912 INFO [train.py:1046] (3/4) Epoch 36, batch 3800, loss[loss=0.1418, simple_loss=0.2205, pruned_loss=0.03151, over 14615.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2395, pruned_loss=0.04022, over 4711911.46 frames. ], batch size: 31, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:20:24,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:20:28,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:20:28,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 12:20:29,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 12:20:31,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:20:32,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:20:34,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 12:20:36,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 12:20:36,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:20:37,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:20:39,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:20:39,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:20:40,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:20:41,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 12:20:44,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 12:20:44,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:20:47,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:20:49,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:20:49,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:20:52,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 12:20:52,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:20:54,330 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.17 vs. limit=15.0 2023-10-03 12:20:54,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:20:56,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:21:00,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 12:21:00,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 12:21:00,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1264966.6666666667, ans=0.125 2023-10-03 12:21:03,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:21:07,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:21:07,770 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.03 vs. limit=15.0 2023-10-03 12:21:11,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:21:14,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 12:21:17,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 12:21:18,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:21:20,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:21:20,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:22,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 12:21:24,951 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.53 vs. limit=15.0 2023-10-03 12:21:25,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 12:21:25,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 12:21:25,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:27,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:21:32,800 INFO [train.py:1046] (3/4) Epoch 36, batch 3850, loss[loss=0.1774, simple_loss=0.2575, pruned_loss=0.04864, over 23339.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2388, pruned_loss=0.04014, over 4714879.48 frames. ], batch size: 93, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:21:32,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:21:32,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:21:33,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1265166.6666666667, ans=0.125 2023-10-03 12:21:39,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:21:39,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1265166.6666666667, ans=0.1 2023-10-03 12:21:41,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 12:21:41,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:21:42,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:21:46,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:21:48,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:21:49,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.20 vs. limit=12.0 2023-10-03 12:21:51,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:21:52,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 12:21:58,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:21:59,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:22:01,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:22:02,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:22:04,230 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.990e+02 2.175e+02 2.458e+02 3.348e+02, threshold=4.349e+02, percent-clipped=0.0 2023-10-03 12:22:05,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:07,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:22:07,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:07,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:22:09,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:10,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1265300.0, ans=0.125 2023-10-03 12:22:12,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:12,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:12,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:22:12,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 12:22:12,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 12:22:13,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:22:13,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:16,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:16,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:16,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1265366.6666666667, ans=0.1 2023-10-03 12:22:17,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 12:22:19,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 12:22:21,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:23,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 12:22:24,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 12:22:27,476 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.99 vs. limit=15.0 2023-10-03 12:22:28,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:30,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:22:30,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1265366.6666666667, ans=0.07 2023-10-03 12:22:33,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:33,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 12:22:38,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 12:22:39,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:41,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:43,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:22:43,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:22:45,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:45,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:45,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:22:45,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 12:22:45,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1265433.3333333333, ans=0.0 2023-10-03 12:22:46,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:22:48,163 INFO [train.py:1046] (3/4) Epoch 36, batch 3900, loss[loss=0.1612, simple_loss=0.2455, pruned_loss=0.03847, over 24369.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2377, pruned_loss=0.03983, over 4717808.41 frames. ], batch size: 77, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:22:48,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 12:22:48,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:48,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:49,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:22:49,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:52,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:22:54,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:22:54,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:22:55,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:22:55,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 12:22:55,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:22:56,478 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.87 vs. limit=12.0 2023-10-03 12:22:57,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1265500.0, ans=0.2 2023-10-03 12:22:58,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:22:59,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:22:59,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:23:01,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:23:04,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:23:04,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:23:04,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:23:04,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1265566.6666666667, ans=0.0 2023-10-03 12:23:05,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 12:23:05,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:23:09,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 12:23:09,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:23:10,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 12:23:12,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 12:23:15,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:23:16,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:23:16,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:23:16,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:23:17,367 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.38 vs. limit=15.0 2023-10-03 12:23:19,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:23:22,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:23:24,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1265633.3333333333, ans=0.125 2023-10-03 12:23:25,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:23:25,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:23:25,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:23:28,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1265633.3333333333, ans=0.125 2023-10-03 12:23:29,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1265633.3333333333, ans=0.2 2023-10-03 12:23:31,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:23:31,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:23:31,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1265700.0, ans=0.125 2023-10-03 12:23:33,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1265700.0, ans=0.125 2023-10-03 12:23:39,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:23:40,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:23:40,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1265700.0, ans=0.0 2023-10-03 12:23:49,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:23:50,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:23:50,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 12:23:50,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1265766.6666666667, ans=0.2 2023-10-03 12:23:51,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 12:23:51,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:23:52,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 12:23:54,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:23:54,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 12:23:56,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1265766.6666666667, ans=0.125 2023-10-03 12:24:01,819 INFO [train.py:1046] (3/4) Epoch 36, batch 3950, loss[loss=0.165, simple_loss=0.225, pruned_loss=0.05252, over 19458.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.238, pruned_loss=0.0402, over 4709598.32 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:24:02,557 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.09 vs. limit=10.0 2023-10-03 12:24:03,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:24:03,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 12:24:05,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:24:06,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:24:08,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:24:12,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1265833.3333333333, ans=0.125 2023-10-03 12:24:14,335 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 12:24:15,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:24:16,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 12:24:17,038 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 12:24:17,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:24:19,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:24:19,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:24:19,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:24:23,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 12:24:25,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:24:25,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:24:25,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:24:27,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:24:28,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:24:32,501 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.840e+02 2.075e+02 2.335e+02 2.837e+02, threshold=4.150e+02, percent-clipped=0.0 2023-10-03 12:24:32,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1265966.6666666667, ans=0.0 2023-10-03 12:24:37,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:24:37,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:24:45,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 12:24:50,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 12:24:50,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 12:24:50,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:24:50,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1266033.3333333333, ans=0.1 2023-10-03 12:24:53,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:25:00,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1266100.0, ans=0.0 2023-10-03 12:25:01,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:25:01,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:25:02,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:25:02,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:25:02,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 12:25:08,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:25:08,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:25:13,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 12:25:16,621 INFO [train.py:1046] (3/4) Epoch 36, batch 4000, loss[loss=0.1593, simple_loss=0.2306, pruned_loss=0.04397, over 23678.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2387, pruned_loss=0.04023, over 4678872.29 frames. ], batch size: 232, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:25:22,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:25:27,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1266166.6666666667, ans=0.125 2023-10-03 12:25:29,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:25:33,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:25:34,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:25:35,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:25:35,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 12:25:35,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:25:36,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 12:25:36,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:25:36,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 12:25:38,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:25:41,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:25:41,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:25:41,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:25:41,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:25:41,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:25:44,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:25:46,047 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 12:25:47,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:25:47,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:25:50,670 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 12:25:51,294 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.94 vs. limit=15.0 2023-10-03 12:25:52,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:25:52,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:25:58,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 12:25:58,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:25:59,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:26:01,097 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 12:26:01,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1266366.6666666667, ans=0.125 2023-10-03 12:26:02,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:26:03,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 12:26:03,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:26:05,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:26:06,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:26:06,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:26:07,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:26:09,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:26:11,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 12:26:11,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:26:12,994 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 12:26:18,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:26:21,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 12:26:24,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1266433.3333333333, ans=0.0 2023-10-03 12:26:25,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:26:25,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:26:25,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:26:27,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:26:30,020 INFO [train.py:1046] (3/4) Epoch 36, batch 4050, loss[loss=0.153, simple_loss=0.2338, pruned_loss=0.0361, over 23532.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2389, pruned_loss=0.04033, over 4688580.60 frames. ], batch size: 93, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:26:33,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:26:34,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:26:36,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 12:26:38,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:26:38,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:26:40,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:26:40,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1266500.0, ans=0.04949747468305833 2023-10-03 12:26:41,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:26:42,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:26:43,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1266566.6666666667, ans=0.1 2023-10-03 12:26:47,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:26:49,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:26:49,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 12:26:51,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:26:51,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:26:54,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1266566.6666666667, ans=0.0 2023-10-03 12:26:55,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:26:57,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:27:00,597 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.848e+02 2.001e+02 2.141e+02 3.089e+02, threshold=4.003e+02, percent-clipped=0.0 2023-10-03 12:27:00,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1266633.3333333333, ans=0.1 2023-10-03 12:27:02,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 12:27:03,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 12:27:03,536 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 12:27:04,120 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.83 vs. limit=15.0 2023-10-03 12:27:05,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:27:07,773 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.48 vs. limit=15.0 2023-10-03 12:27:09,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.06 vs. limit=15.0 2023-10-03 12:27:11,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 12:27:12,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:27:16,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:27:19,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:27:19,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:27:19,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:27:22,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:27:25,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 12:27:25,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 12:27:27,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:27:28,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 12:27:32,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:27:38,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 12:27:40,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:27:40,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:27:41,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 12:27:41,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 12:27:41,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:27:41,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1266766.6666666667, ans=0.0 2023-10-03 12:27:43,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:27:44,300 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.82 vs. limit=15.0 2023-10-03 12:27:44,931 INFO [train.py:1046] (3/4) Epoch 36, batch 4100, loss[loss=0.1595, simple_loss=0.2445, pruned_loss=0.03723, over 24052.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2395, pruned_loss=0.04061, over 4685482.70 frames. ], batch size: 80, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:27:45,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:27:45,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:27:52,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 12:27:53,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 12:27:54,500 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.85 vs. limit=15.0 2023-10-03 12:27:56,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 12:27:56,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 12:27:56,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:27:58,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:27:58,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:27:58,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:27:59,874 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 12:28:00,520 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.61 vs. limit=6.0 2023-10-03 12:28:02,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:28:02,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:28:02,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:28:04,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:28:06,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1266900.0, ans=0.0 2023-10-03 12:28:07,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:28:08,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:28:10,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:28:10,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 12:28:10,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:28:10,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:28:11,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:28:11,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:28:11,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 12:28:14,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:28:16,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 12:28:17,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1266966.6666666667, ans=0.125 2023-10-03 12:28:18,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:28:19,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:28:19,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 12:28:21,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:28:21,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:28:22,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:28:23,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 12:28:25,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:28:25,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:28:25,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1266966.6666666667, ans=0.125 2023-10-03 12:28:26,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1266966.6666666667, ans=0.0 2023-10-03 12:28:28,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 12:28:28,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1267033.3333333333, ans=0.0 2023-10-03 12:28:29,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:28:29,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:28:34,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:28:35,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1267033.3333333333, ans=0.0 2023-10-03 12:28:38,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:28:42,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:28:44,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:28:51,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:28:51,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:28:54,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:28:56,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:28:58,824 INFO [train.py:1046] (3/4) Epoch 36, batch 4150, loss[loss=0.1494, simple_loss=0.2263, pruned_loss=0.0363, over 23786.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2403, pruned_loss=0.04072, over 4691008.10 frames. ], batch size: 164, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:28:58,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:29:00,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:29:00,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:29:00,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:29:03,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 12:29:03,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:29:03,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 12:29:05,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 12:29:06,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 12:29:06,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:29:11,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1267166.6666666667, ans=0.125 2023-10-03 12:29:12,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:29:12,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:29:16,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:29:17,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:29:19,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:29:20,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:29:20,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:29:21,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:29:24,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:29:29,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:29:29,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 12:29:32,466 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.928e+02 2.046e+02 2.330e+02 3.418e+02, threshold=4.092e+02, percent-clipped=0.0 2023-10-03 12:29:32,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 12:29:32,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:29:33,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.10 vs. limit=15.0 2023-10-03 12:29:34,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 12:29:34,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:29:35,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:29:36,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1267300.0, ans=0.125 2023-10-03 12:29:39,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:29:39,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:29:43,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 12:29:46,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:29:48,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:29:50,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 12:29:51,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:29:51,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 12:29:52,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:29:54,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:29:55,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:29:57,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 12:29:57,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:29:57,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 12:29:58,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:30:00,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1267433.3333333333, ans=0.125 2023-10-03 12:30:03,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 12:30:03,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:30:03,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:30:03,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:30:05,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 12:30:05,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:30:05,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 12:30:06,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:30:07,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:30:07,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 12:30:07,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 12:30:13,249 INFO [train.py:1046] (3/4) Epoch 36, batch 4200, loss[loss=0.1458, simple_loss=0.2052, pruned_loss=0.04315, over 19458.00 frames. ], tot_loss[loss=0.16, simple_loss=0.239, pruned_loss=0.04048, over 4700736.93 frames. ], batch size: 389, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:30:13,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:30:14,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 12:30:16,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:30:18,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:30:19,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:30:20,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:30:20,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:30:21,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 12:30:25,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 12:30:25,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:26,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:30:28,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:30:32,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:30:33,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:30:33,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:34,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 12:30:34,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:30:34,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:34,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:30:36,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:30:37,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:30:40,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 12:30:40,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:30:40,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1267566.6666666667, ans=0.0 2023-10-03 12:30:42,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:30:46,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:30:47,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:30:49,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:30:50,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:30:50,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 12:30:50,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:30:52,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1267633.3333333333, ans=0.125 2023-10-03 12:30:53,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:30:58,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:31:01,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:31:05,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:31:08,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 12:31:09,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:31:13,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 12:31:15,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:31:16,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 12:31:22,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:31:26,600 INFO [train.py:1046] (3/4) Epoch 36, batch 4250, loss[loss=0.1573, simple_loss=0.2269, pruned_loss=0.0439, over 23833.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2384, pruned_loss=0.04006, over 4704504.73 frames. ], batch size: 212, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:31:28,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:31:28,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 12:31:31,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:31:34,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.98 vs. limit=22.5 2023-10-03 12:31:37,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:31:37,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 12:31:37,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:31:40,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:31:43,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1267900.0, ans=0.125 2023-10-03 12:31:44,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:31:48,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:31:48,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:31:52,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:31:52,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:31:52,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:31:53,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:31:53,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:31:56,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1267966.6666666667, ans=0.1 2023-10-03 12:31:57,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:31:59,355 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.900e+02 2.182e+02 2.555e+02 3.786e+02, threshold=4.364e+02, percent-clipped=0.0 2023-10-03 12:31:59,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:31:59,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 12:32:01,698 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:32:05,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 12:32:05,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:32:06,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:32:06,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:32:07,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:32:07,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:32:07,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:32:07,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1267966.6666666667, ans=0.0 2023-10-03 12:32:09,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1267966.6666666667, ans=0.0 2023-10-03 12:32:10,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 12:32:11,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:32:16,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:32:18,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:32:20,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 12:32:20,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:32:21,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 12:32:22,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:32:24,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:32:26,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:32:26,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:32:28,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 12:32:29,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:32:29,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:32:34,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:32:37,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:32:38,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:32:38,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:32:41,283 INFO [train.py:1046] (3/4) Epoch 36, batch 4300, loss[loss=0.1557, simple_loss=0.2455, pruned_loss=0.03296, over 24559.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2385, pruned_loss=0.03971, over 4720044.08 frames. ], batch size: 71, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:32:41,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:32:42,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:32:42,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:32:42,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 12:32:44,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:32:48,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:32:49,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:32:52,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:33:00,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:33:00,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 12:33:02,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:33:04,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:33:04,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:33:04,232 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 12:33:07,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:33:08,433 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-10-03 12:33:08,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:33:11,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 12:33:11,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:33:11,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 12:33:14,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:33:15,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:33:18,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:33:18,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:33:20,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:33:21,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:33:23,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:33:23,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 12:33:24,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 12:33:26,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:33:29,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:29,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:33:29,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:30,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:33:30,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 12:33:30,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 12:33:30,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 12:33:31,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:33:31,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1268366.6666666667, ans=0.1 2023-10-03 12:33:32,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 12:33:32,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 12:33:37,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:33:38,763 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 12:33:38,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:33:40,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:33:40,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:33:43,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 12:33:43,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.53 vs. limit=15.0 2023-10-03 12:33:44,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:33:44,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:44,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:33:44,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:33:45,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:33:47,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:33:47,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1268433.3333333333, ans=0.125 2023-10-03 12:33:47,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1268433.3333333333, ans=0.0 2023-10-03 12:33:49,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:33:51,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:33:51,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:33:55,774 INFO [train.py:1046] (3/4) Epoch 36, batch 4350, loss[loss=0.1699, simple_loss=0.2445, pruned_loss=0.04768, over 23824.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2391, pruned_loss=0.03965, over 4710799.27 frames. ], batch size: 195, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:33:55,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 12:33:56,481 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.48 vs. limit=15.0 2023-10-03 12:33:57,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:34:01,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:34:04,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:34:05,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.70 vs. limit=22.5 2023-10-03 12:34:07,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:34:07,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:34:10,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:34:16,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:34:17,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:34:17,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:34:21,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:34:22,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1268566.6666666667, ans=0.125 2023-10-03 12:34:23,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:34:24,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:34:27,499 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 1.937e+02 2.130e+02 2.314e+02 3.607e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-03 12:34:30,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 12:34:30,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:34:32,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:32,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1268633.3333333333, ans=0.1 2023-10-03 12:34:36,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:38,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 12:34:41,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:34:42,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:34:45,693 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 12:34:48,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:34:48,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:34:49,724 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 12:34:49,779 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 12:34:49,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:34:49,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:34:51,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:34:51,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:34:51,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=1268700.0, ans=15.0 2023-10-03 12:34:52,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:34:52,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:34:55,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 12:34:55,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:55,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:34:55,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:34:55,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1268766.6666666667, ans=0.0 2023-10-03 12:34:57,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 12:34:58,860 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 12:34:58,872 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 12:34:58,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 12:34:59,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1268766.6666666667, ans=0.125 2023-10-03 12:35:02,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:35:02,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:35:02,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:03,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:35:05,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 12:35:07,680 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 12:35:07,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:09,583 INFO [train.py:1046] (3/4) Epoch 36, batch 4400, loss[loss=0.1679, simple_loss=0.2584, pruned_loss=0.03872, over 24540.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2406, pruned_loss=0.04026, over 4714921.26 frames. ], batch size: 71, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:35:12,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:35:12,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:13,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:35:14,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1268833.3333333333, ans=0.0 2023-10-03 12:35:15,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 12:35:15,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 12:35:15,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 12:35:16,626 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 12:35:16,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:35:16,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:35:19,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 12:35:20,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:22,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:22,323 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 12:35:25,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:25,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 12:35:27,072 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 12:35:29,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 12:35:30,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 12:35:30,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 12:35:31,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:31,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:35:33,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:35:33,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:35:33,868 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.37 vs. limit=15.0 2023-10-03 12:35:36,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 12:35:36,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 12:35:37,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:40,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:35:40,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:35:43,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:43,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:35:43,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 12:35:44,957 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 12:35:49,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:35:49,580 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.83 vs. limit=15.0 2023-10-03 12:35:56,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:35:58,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 12:36:02,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:36:05,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:36:05,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1269033.3333333333, ans=0.125 2023-10-03 12:36:06,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:36:08,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 12:36:08,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:36:08,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:36:08,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:36:09,284 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-10-03 12:36:10,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:36:15,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 12:36:17,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 12:36:20,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 12:36:20,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:36:20,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 12:36:20,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:36:23,135 INFO [train.py:1046] (3/4) Epoch 36, batch 4450, loss[loss=0.1482, simple_loss=0.2402, pruned_loss=0.02806, over 24434.00 frames. ], tot_loss[loss=0.1603, simple_loss=0.2408, pruned_loss=0.03989, over 4728348.93 frames. ], batch size: 69, lr: 2.82e-03, grad_scale: 32.0 2023-10-03 12:36:23,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:36:24,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 12:36:28,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:36:31,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:36:31,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:36:35,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1269166.6666666667, ans=0.0 2023-10-03 12:36:38,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:36:38,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:36:41,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:36:42,567 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.75 vs. limit=15.0 2023-10-03 12:36:43,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:36:44,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:36:44,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:36:45,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 12:36:45,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:36:47,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:36:47,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:36:47,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:36:50,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:36:56,047 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.913e+02 2.111e+02 2.301e+02 3.200e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 12:36:56,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:36:56,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:36:57,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:36:57,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:36:59,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:37:03,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 12:37:05,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 12:37:06,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 12:37:06,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:37:08,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:37:09,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 12:37:13,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:37:16,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:37:16,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 12:37:16,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:37:16,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:37:16,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:37:16,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:37:17,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:37:20,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 12:37:22,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 12:37:24,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:37:26,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:37:27,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:37:28,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:37:28,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:37:29,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:37:32,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 12:37:33,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1269433.3333333333, ans=0.1 2023-10-03 12:37:34,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:37:38,736 INFO [train.py:1046] (3/4) Epoch 36, batch 4500, loss[loss=0.1694, simple_loss=0.2545, pruned_loss=0.04218, over 24444.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2407, pruned_loss=0.03999, over 4732198.03 frames. ], batch size: 69, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:37:40,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:37:41,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 12:37:41,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 12:37:43,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:37:49,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:37:49,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:37:50,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:37:52,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:37:52,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:37:53,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:38:02,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:38:03,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:38:05,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:38:05,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1269566.6666666667, ans=0.125 2023-10-03 12:38:06,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:38:08,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:38:10,935 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:38:13,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:38:15,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1269633.3333333333, ans=0.125 2023-10-03 12:38:18,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:38:23,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:38:25,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:38:25,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 12:38:26,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:38:27,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:38:29,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:38:29,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:38:32,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:38:32,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 12:38:32,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:38:32,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:38:36,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:38:36,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:38:40,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:38:41,913 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.78 vs. limit=10.0 2023-10-03 12:38:42,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:38:42,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:38:44,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 12:38:45,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 12:38:45,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 12:38:49,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 12:38:49,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1269766.6666666667, ans=0.0 2023-10-03 12:38:52,188 INFO [train.py:1046] (3/4) Epoch 36, batch 4550, loss[loss=0.1534, simple_loss=0.2211, pruned_loss=0.0428, over 23691.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2396, pruned_loss=0.04018, over 4728966.61 frames. ], batch size: 232, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:38:52,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 12:38:52,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:38:56,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:38:57,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:39:00,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:39:00,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1269833.3333333333, ans=0.0 2023-10-03 12:39:04,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:39:07,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:39:08,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:39:08,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:39:08,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:13,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:39:13,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:39:16,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:39:17,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 12:39:19,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 12:39:19,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:39:21,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 12:39:24,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 12:39:25,973 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.879e+02 2.066e+02 2.299e+02 3.391e+02, threshold=4.132e+02, percent-clipped=0.0 2023-10-03 12:39:26,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:39:28,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 12:39:28,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:39:31,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:31,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:33,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:39:33,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1269966.6666666667, ans=0.0 2023-10-03 12:39:36,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 12:39:37,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:39:40,407 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.44 vs. limit=15.0 2023-10-03 12:39:40,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:40,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:39:43,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:39:43,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 12:39:44,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 12:39:44,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:39:46,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 12:39:49,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 12:39:49,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:39:50,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:39:50,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:39:50,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:39:50,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:39:52,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:39:53,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 12:39:54,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:39:54,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 12:39:54,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 12:39:54,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:39:56,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 12:39:59,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:39:59,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:40:00,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:40:01,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:40:02,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:40:02,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:40:05,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:40:06,948 INFO [train.py:1046] (3/4) Epoch 36, batch 4600, loss[loss=0.1269, simple_loss=0.1837, pruned_loss=0.03508, over 19283.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2378, pruned_loss=0.03986, over 4718604.95 frames. ], batch size: 388, lr: 2.82e-03, grad_scale: 16.0 2023-10-03 12:40:07,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:08,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:40:13,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:40:13,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:40:13,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:40:15,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 12:40:15,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:40:18,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1270166.6666666667, ans=0.125 2023-10-03 12:40:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:40:20,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:40:22,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:30,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 12:40:31,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:33,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:36,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:40:36,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:40:41,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 12:40:41,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:40:43,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:40:44,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1270300.0, ans=0.2 2023-10-03 12:40:48,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:40:48,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:40:49,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:40:51,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1270366.6666666667, ans=0.0 2023-10-03 12:40:52,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 12:40:52,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1270366.6666666667, ans=0.2 2023-10-03 12:40:54,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:40:58,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:00,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:41:02,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:02,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 12:41:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:41:04,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 12:41:06,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:06,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:41:07,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:08,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1270433.3333333333, ans=0.0 2023-10-03 12:41:09,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:41:09,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:41:09,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 12:41:10,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 12:41:10,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 12:41:10,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:41:13,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:41:13,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:41:13,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:41:20,680 INFO [train.py:1046] (3/4) Epoch 36, batch 4650, loss[loss=0.1638, simple_loss=0.2409, pruned_loss=0.04334, over 23421.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2375, pruned_loss=0.0395, over 4719674.68 frames. ], batch size: 93, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:41:24,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:41:26,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:41:27,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:41:27,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:41:27,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:41:27,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:41:30,031 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.07 vs. limit=22.5 2023-10-03 12:41:30,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:41:33,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 12:41:37,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1270566.6666666667, ans=0.0 2023-10-03 12:41:38,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:41:39,134 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.14 vs. limit=15.0 2023-10-03 12:41:39,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 12:41:39,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:41:41,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 12:41:41,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:41:41,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 12:41:42,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 12:41:42,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:42,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:41:46,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:41:47,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:41:47,463 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 12:41:50,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1270633.3333333333, ans=0.125 2023-10-03 12:41:51,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:41:52,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 12:41:54,102 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.890e+02 2.118e+02 2.487e+02 4.002e+02, threshold=4.237e+02, percent-clipped=0.0 2023-10-03 12:41:55,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:41:55,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:41:55,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1270633.3333333333, ans=0.2 2023-10-03 12:41:57,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 12:41:57,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:41:59,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:42:02,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:07,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:42:09,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:42:10,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:42:10,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:42:13,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 12:42:14,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 12:42:16,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 12:42:16,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 12:42:17,341 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.84 vs. limit=15.0 2023-10-03 12:42:17,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:42:24,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:42:24,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:42:24,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 12:42:24,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:26,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:42:26,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:42:28,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:42:30,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:42:30,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:42:31,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:42:31,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1270833.3333333333, ans=0.125 2023-10-03 12:42:33,005 INFO [train.py:1046] (3/4) Epoch 36, batch 4700, loss[loss=0.1564, simple_loss=0.2401, pruned_loss=0.03635, over 24694.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2371, pruned_loss=0.03935, over 4721358.33 frames. ], batch size: 65, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:42:35,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:42:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:42:35,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:42:37,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 12:42:39,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 12:42:39,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 12:42:43,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1270833.3333333333, ans=0.1 2023-10-03 12:42:46,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:48,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:42:48,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:42:49,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:42:51,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 12:42:54,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 12:42:55,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 12:42:56,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:42:58,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:42:58,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:43:02,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:43:05,411 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.08 vs. limit=15.0 2023-10-03 12:43:08,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:43:10,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 12:43:11,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:43:17,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 12:43:17,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1271033.3333333333, ans=0.1 2023-10-03 12:43:19,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:43:20,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:23,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 12:43:24,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:43:28,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:43:30,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 12:43:31,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:31,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:43:34,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:43:34,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1271100.0, ans=0.0 2023-10-03 12:43:35,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:43:35,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 12:43:37,359 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 12:43:39,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:43:39,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1271100.0, ans=0.2 2023-10-03 12:43:42,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:42,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:42,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 12:43:42,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:43:46,294 INFO [train.py:1046] (3/4) Epoch 36, batch 4750, loss[loss=0.1534, simple_loss=0.2268, pruned_loss=0.03998, over 23583.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2379, pruned_loss=0.03962, over 4725579.95 frames. ], batch size: 106, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:43:47,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 12:43:50,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:43:51,735 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=15.0 2023-10-03 12:43:52,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:43:55,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:43:56,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:43:58,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 12:43:58,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:43:59,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1271233.3333333333, ans=0.0 2023-10-03 12:44:00,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 12:44:02,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:44:02,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:44:03,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:44:06,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1271233.3333333333, ans=0.125 2023-10-03 12:44:08,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 12:44:13,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1271233.3333333333, ans=0.125 2023-10-03 12:44:14,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:44:15,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 12:44:16,429 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.05 vs. limit=15.0 2023-10-03 12:44:16,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:44:20,882 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.859e+02 2.065e+02 2.331e+02 3.483e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-03 12:44:21,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:44:21,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:44:21,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:44:22,895 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 12:44:22,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 12:44:27,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 12:44:28,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:44:31,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:44:32,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:44:32,710 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 12:44:32,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:44:35,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:44:38,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:44:40,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 12:44:40,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 12:44:40,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:44:40,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:44:40,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:44:44,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:44:44,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 12:44:47,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 12:44:48,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:44:52,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:44:52,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 12:44:52,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:44:53,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:44:53,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1271433.3333333333, ans=0.125 2023-10-03 12:44:55,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:44:56,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:44:56,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 12:44:58,870 INFO [train.py:1046] (3/4) Epoch 36, batch 4800, loss[loss=0.1597, simple_loss=0.2506, pruned_loss=0.03436, over 24647.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2392, pruned_loss=0.03999, over 4720142.28 frames. ], batch size: 73, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:45:00,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:45:00,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 12:45:01,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 12:45:01,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 12:45:04,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:45:04,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:45:04,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 12:45:10,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:10,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:15,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:45:17,327 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.12 vs. limit=22.5 2023-10-03 12:45:18,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:45:18,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:18,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 12:45:20,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:45:20,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:45:20,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:45:22,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1271566.6666666667, ans=0.5 2023-10-03 12:45:26,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:45:27,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:45:27,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:45:27,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:45:27,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 12:45:27,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:30,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:45:31,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:45:34,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:36,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:45:36,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:45:37,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 12:45:38,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:40,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 12:45:40,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 12:45:41,711 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.72 vs. limit=10.0 2023-10-03 12:45:42,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:45:42,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:45:42,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:45:42,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:45:42,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:45:44,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:45:45,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:45:50,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:45:51,724 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.74 vs. limit=15.0 2023-10-03 12:45:52,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:45:53,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.55 vs. limit=15.0 2023-10-03 12:45:53,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:45:57,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 12:45:59,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:45:59,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:45:59,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:45:59,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1271766.6666666667, ans=0.1 2023-10-03 12:46:00,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:46:05,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.93 vs. limit=22.5 2023-10-03 12:46:06,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:46:07,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:46:07,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:07,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:46:07,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:46:08,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:46:12,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:46:12,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:12,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:46:12,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 12:46:13,509 INFO [train.py:1046] (3/4) Epoch 36, batch 4850, loss[loss=0.1761, simple_loss=0.2407, pruned_loss=0.05573, over 23782.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2394, pruned_loss=0.04004, over 4719675.06 frames. ], batch size: 212, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:46:13,933 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:46:15,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 12:46:15,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:46:15,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:46:17,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:46:17,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:17,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1271833.3333333333, ans=0.0 2023-10-03 12:46:20,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:46:26,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 12:46:27,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:46:30,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:46:32,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 12:46:32,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:46:34,099 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.21 vs. limit=15.0 2023-10-03 12:46:34,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:46:37,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:46:37,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:46:37,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 12:46:42,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:46:44,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:46:44,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 12:46:44,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:46:45,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 12:46:48,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:46:48,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:46:52,112 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.942e+02 2.103e+02 2.430e+02 3.861e+02, threshold=4.206e+02, percent-clipped=0.0 2023-10-03 12:46:52,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:46:52,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 12:46:53,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 12:46:55,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:46:56,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1271966.6666666667, ans=0.2 2023-10-03 12:47:00,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1272033.3333333333, ans=0.0 2023-10-03 12:47:01,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:47:03,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 12:47:03,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1272033.3333333333, ans=0.125 2023-10-03 12:47:04,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:47:04,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:47:05,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:47:08,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 12:47:08,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:47:10,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 12:47:10,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:47:11,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:47:13,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 12:47:16,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1272100.0, ans=0.125 2023-10-03 12:47:19,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:47:19,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1272100.0, ans=0.0 2023-10-03 12:47:25,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:47:25,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:47:28,157 INFO [train.py:1046] (3/4) Epoch 36, batch 4900, loss[loss=0.1439, simple_loss=0.2054, pruned_loss=0.04117, over 22731.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2383, pruned_loss=0.04018, over 4708555.26 frames. ], batch size: 322, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:47:30,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 12:47:30,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:47:33,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:47:35,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:47:35,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:47:39,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 12:47:45,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 12:47:48,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 12:47:48,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 12:47:50,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:47:50,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:47:50,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:47:50,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:47:50,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:47:52,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 12:47:55,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 12:47:55,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:47:56,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:47:56,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:47:57,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:47:59,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:00,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:00,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 12:48:02,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:48:03,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:48:03,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 12:48:03,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 12:48:05,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1272300.0, ans=0.125 2023-10-03 12:48:06,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 12:48:09,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:48:09,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:48:09,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:48:10,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:10,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 12:48:10,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:48:12,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 12:48:13,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:17,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 12:48:18,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:48:21,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 12:48:23,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:48:23,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 12:48:24,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 12:48:30,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:48:30,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1272433.3333333333, ans=0.125 2023-10-03 12:48:31,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:48:32,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 12:48:34,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:48:34,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:48:35,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:40,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:48:40,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:48:40,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:48:40,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 12:48:40,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1272500.0, ans=0.05 2023-10-03 12:48:41,770 INFO [train.py:1046] (3/4) Epoch 36, batch 4950, loss[loss=0.1644, simple_loss=0.2413, pruned_loss=0.04374, over 23245.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2371, pruned_loss=0.04008, over 4691942.83 frames. ], batch size: 93, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:48:41,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:48:44,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:48:44,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 12:48:48,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 12:48:48,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 12:48:48,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:48:49,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 12:48:49,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:49,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:48:51,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 12:48:51,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:48:53,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:48:53,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:48:55,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:48:56,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:48:59,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:48:59,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:49:02,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.23 vs. limit=15.0 2023-10-03 12:49:03,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 12:49:07,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:49:09,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:49:11,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:49:12,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:13,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:49:14,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 12:49:16,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 12:49:17,642 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.932e+02 2.177e+02 2.793e+02 4.403e+02, threshold=4.354e+02, percent-clipped=2.0 2023-10-03 12:49:17,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:19,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:49:19,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:49:21,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:49:21,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:49:21,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 12:49:23,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:49:26,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:49:27,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1272700.0, ans=0.125 2023-10-03 12:49:27,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1272700.0, ans=0.0 2023-10-03 12:49:28,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:49:30,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:49:30,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:30,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 12:49:30,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:49:31,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:49:34,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:49:35,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:49:35,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:49:36,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:37,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:49:39,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:49:39,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:49:40,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:49:40,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:49:41,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 12:49:46,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:49:51,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 12:49:51,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 12:49:55,800 INFO [train.py:1046] (3/4) Epoch 36, batch 5000, loss[loss=0.1577, simple_loss=0.2468, pruned_loss=0.03429, over 24584.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2361, pruned_loss=0.03984, over 4673148.27 frames. ], batch size: 71, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:49:57,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1272833.3333333333, ans=0.1 2023-10-03 12:49:58,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:49:58,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:50:00,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 12:50:01,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 12:50:02,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:50:04,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 12:50:04,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 12:50:04,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:50:05,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 12:50:05,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:50:07,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:50:07,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 12:50:07,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:50:08,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:50:10,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 12:50:11,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 12:50:11,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:50:13,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 12:50:13,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:50:13,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:14,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:50:14,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 12:50:14,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 12:50:16,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 12:50:16,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:50:17,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:19,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 12:50:19,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:50:21,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:21,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1272900.0, ans=0.125 2023-10-03 12:50:22,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:50:24,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 12:50:25,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 12:50:25,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:50:28,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:50:31,354 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 12:50:34,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:50:34,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1272966.6666666667, ans=0.125 2023-10-03 12:50:36,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:50:36,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:50:40,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 12:50:41,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:50:43,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:50:43,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:50:44,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 12:50:44,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:50:48,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 12:50:50,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:50:56,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 12:50:59,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:06,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:51:08,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:08,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:51:08,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:51:08,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:51:09,430 INFO [train.py:1046] (3/4) Epoch 36, batch 5050, loss[loss=0.1466, simple_loss=0.2315, pruned_loss=0.03088, over 24469.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2373, pruned_loss=0.03998, over 4692089.82 frames. ], batch size: 63, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:51:09,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 12:51:09,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:12,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:51:12,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 12:51:12,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:51:15,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:51:16,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:51:16,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 12:51:18,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:51:18,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:51:21,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 12:51:21,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 12:51:21,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:51:33,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 12:51:33,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 12:51:34,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:51:36,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 12:51:36,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:51:37,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:51:38,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:51:40,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:51:40,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 12:51:40,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 12:51:43,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:51:43,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:51:46,455 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 1.880e+02 2.077e+02 2.440e+02 4.009e+02, threshold=4.155e+02, percent-clipped=0.0 2023-10-03 12:51:46,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:51:47,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 12:51:49,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:51:50,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 12:51:52,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:51:53,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:51:54,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:51:54,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1273366.6666666667, ans=0.07 2023-10-03 12:51:55,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:51:57,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:51:59,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:52:00,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:01,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:52:01,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:52:01,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 12:52:02,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 12:52:04,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 12:52:07,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:52:07,043 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 12:52:07,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:52:08,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:52:09,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:11,134 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 12:52:13,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:52:13,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 12:52:13,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:19,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:52:19,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:20,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 12:52:21,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 12:52:22,591 INFO [train.py:1046] (3/4) Epoch 36, batch 5100, loss[loss=0.1453, simple_loss=0.2266, pruned_loss=0.03202, over 24413.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2376, pruned_loss=0.03964, over 4701162.15 frames. ], batch size: 58, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:52:22,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:52:22,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:52:24,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 12:52:26,074 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 12:52:28,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:52:32,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 12:52:32,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 12:52:32,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1273500.0, ans=0.125 2023-10-03 12:52:33,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:52:34,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.81 vs. limit=10.0 2023-10-03 12:52:36,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:52:39,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:52:39,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 12:52:39,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 12:52:43,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:52:43,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:52:47,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:52:50,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 12:52:51,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:52:53,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:52:53,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 12:52:53,877 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.34 vs. limit=12.0 2023-10-03 12:52:56,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:52:56,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:52:56,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 12:52:59,532 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 12:53:01,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:53:01,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 12:53:01,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 12:53:05,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:53:12,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:53:15,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 12:53:15,577 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 12:53:15,584 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 12:53:16,540 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.24 vs. limit=15.0 2023-10-03 12:53:17,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 12:53:17,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:53:20,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 12:53:23,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1273766.6666666667, ans=0.2 2023-10-03 12:53:24,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 12:53:26,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 12:53:27,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 12:53:32,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 12:53:34,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 12:53:35,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 12:53:36,864 INFO [train.py:1046] (3/4) Epoch 36, batch 5150, loss[loss=0.1455, simple_loss=0.2267, pruned_loss=0.03217, over 24601.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2389, pruned_loss=0.04014, over 4706869.19 frames. ], batch size: 60, lr: 2.81e-03, grad_scale: 8.0 2023-10-03 12:53:39,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:53:39,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:53:39,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:53:41,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:53:42,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 12:53:42,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:53:43,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 12:53:43,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 12:53:43,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 12:53:43,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 12:53:43,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 12:53:45,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:53:46,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 12:53:48,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:53:49,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:53:49,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1273900.0, ans=0.2 2023-10-03 12:53:53,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 12:53:53,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 12:53:55,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:53:55,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 12:53:57,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 12:53:57,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:53:57,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:53:58,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:53:58,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:53:58,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 12:54:00,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:54:01,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:54:04,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 12:54:06,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 12:54:08,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:54:08,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1273966.6666666667, ans=0.125 2023-10-03 12:54:11,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 12:54:12,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 12:54:13,777 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.870e+02 2.045e+02 2.275e+02 3.519e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 12:54:15,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:54:21,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:54:22,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:54:27,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:54:27,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:54:30,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 12:54:33,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:54:34,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 12:54:34,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 12:54:36,821 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.50 vs. limit=12.0 2023-10-03 12:54:37,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:54:38,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:54:39,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 12:54:43,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:54:45,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:54:45,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1274100.0, ans=0.125 2023-10-03 12:54:46,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:54:47,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:54:47,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 12:54:47,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 12:54:47,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 12:54:49,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:54:50,976 INFO [train.py:1046] (3/4) Epoch 36, batch 5200, loss[loss=0.1614, simple_loss=0.2429, pruned_loss=0.03995, over 20710.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2399, pruned_loss=0.04065, over 4702869.08 frames. ], batch size: 45, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:54:52,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:54:55,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:54:57,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:54:59,144 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.20 vs. limit=10.0 2023-10-03 12:55:00,134 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:55:01,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 12:55:03,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:55:03,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:55:05,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:05,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 12:55:05,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:55:07,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 12:55:10,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 12:55:10,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:55:13,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 12:55:13,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1274233.3333333333, ans=0.0 2023-10-03 12:55:13,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1274233.3333333333, ans=0.07 2023-10-03 12:55:15,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 12:55:16,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 12:55:17,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 12:55:17,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 12:55:20,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 12:55:20,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:55:20,156 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 12:55:20,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:55:22,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:55:23,567 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.42 vs. limit=15.0 2023-10-03 12:55:24,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:55:25,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 12:55:25,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:55:28,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:28,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 12:55:30,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 12:55:30,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 12:55:34,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 12:55:35,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 12:55:40,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1274366.6666666667, ans=0.125 2023-10-03 12:55:41,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:55:43,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:55:44,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 12:55:44,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:55:44,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 12:55:44,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:55:46,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:55:50,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:55:50,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 12:55:55,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:55:56,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:55:56,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:56:02,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:56:03,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 12:56:05,242 INFO [train.py:1046] (3/4) Epoch 36, batch 5250, loss[loss=0.1556, simple_loss=0.2348, pruned_loss=0.03817, over 23424.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2385, pruned_loss=0.04049, over 4701035.62 frames. ], batch size: 119, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:56:05,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 12:56:05,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:56:06,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:56:06,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 12:56:06,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 12:56:08,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:56:11,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:56:12,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:56:14,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:56:18,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:56:19,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 12:56:22,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:56:24,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 12:56:25,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 12:56:25,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:56:27,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:56:30,289 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.55 vs. limit=22.5 2023-10-03 12:56:40,461 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 1.965e+02 2.175e+02 2.587e+02 3.682e+02, threshold=4.349e+02, percent-clipped=0.0 2023-10-03 12:56:40,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1274633.3333333333, ans=0.125 2023-10-03 12:56:47,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1274700.0, ans=0.125 2023-10-03 12:56:51,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1274700.0, ans=0.0 2023-10-03 12:57:13,155 INFO [train.py:1046] (3/4) Epoch 36, batch 5300, loss[loss=0.1367, simple_loss=0.2122, pruned_loss=0.0306, over 24328.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2372, pruned_loss=0.04008, over 4703711.52 frames. ], batch size: 56, lr: 2.81e-03, grad_scale: 16.0 2023-10-03 12:57:19,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1274833.3333333333, ans=0.2 2023-10-03 12:57:22,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1274833.3333333333, ans=0.125 2023-10-03 12:57:26,678 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.87 vs. limit=22.5 2023-10-03 12:57:27,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 12:57:27,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 12:57:27,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 12:57:27,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:57:27,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:27,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:27,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:27,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:57:27,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:57:28,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:57:28,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 12:57:28,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:57:28,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 12:57:28,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 12:57:28,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 12:57:28,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 12:57:28,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 12:57:28,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 12:57:28,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:29,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:57:29,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:57:29,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:57:29,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:57:29,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:57:30,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:57:30,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:30,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:57:30,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:57:30,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 12:57:30,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:30,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 12:57:30,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 12:57:30,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 12:57:31,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 12:57:31,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 12:57:31,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 12:57:31,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 12:57:31,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:57:31,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 12:57:31,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 12:57:31,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:57:32,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 12:57:32,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 12:57:32,563 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 12:57:32,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 12:57:32,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 12:57:32,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:57:32,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 12:57:32,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 12:57:32,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 12:57:33,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 12:57:39,485 INFO [train.py:1046] (3/4) Epoch 37, batch 0, loss[loss=0.153, simple_loss=0.235, pruned_loss=0.03552, over 24332.00 frames. ], tot_loss[loss=0.153, simple_loss=0.235, pruned_loss=0.03552, over 24332.00 frames. ], batch size: 61, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 12:57:39,485 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 12:57:48,693 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.0.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.4307, 4.3487, 4.2026, 3.8099], device='cuda:3') 2023-10-03 12:57:51,192 INFO [train.py:1078] (3/4) Epoch 37, validation: loss=0.3206, simple_loss=0.2712, pruned_loss=0.185, over 1125622.00 frames. 2023-10-03 12:57:51,193 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 12:57:52,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 12:57:54,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:57:54,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1274913.3333333333, ans=0.125 2023-10-03 12:57:55,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 12:57:55,930 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.92 vs. limit=15.0 2023-10-03 12:57:57,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1274913.3333333333, ans=0.125 2023-10-03 12:58:00,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:00,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 12:58:02,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:02,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 12:58:03,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 12:58:06,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:06,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1274980.0, ans=0.125 2023-10-03 12:58:07,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:10,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:58:11,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:11,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 12:58:12,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:58:13,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 12:58:13,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1274980.0, ans=0.0 2023-10-03 12:58:14,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:58:16,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1274980.0, ans=0.125 2023-10-03 12:58:22,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 12:58:22,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:26,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 12:58:26,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1275046.6666666667, ans=0.0 2023-10-03 12:58:28,732 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.58 vs. limit=15.0 2023-10-03 12:58:29,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 12:58:29,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:58:30,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:58:33,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1275113.3333333333, ans=0.125 2023-10-03 12:58:34,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 12:58:37,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:58:42,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 12:58:45,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 12:58:45,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1275113.3333333333, ans=0.2 2023-10-03 12:58:47,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:58:47,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:58:49,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 12:58:49,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:58:52,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 12:58:53,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:58:55,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:58:58,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1275180.0, ans=0.125 2023-10-03 12:58:59,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:59:02,438 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 12:59:03,743 INFO [train.py:1046] (3/4) Epoch 37, batch 50, loss[loss=0.1442, simple_loss=0.2291, pruned_loss=0.02961, over 24482.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2391, pruned_loss=0.04004, over 1072419.10 frames. ], batch size: 63, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 12:59:03,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 12:59:06,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:59:09,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:59:09,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 12:59:10,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 12:59:10,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 12:59:12,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:59:13,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 12:59:16,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 12:59:19,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 12:59:19,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:59:20,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1275313.3333333333, ans=0.125 2023-10-03 12:59:20,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1275313.3333333333, ans=0.125 2023-10-03 12:59:21,303 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 12:59:22,971 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.909e+02 2.063e+02 2.312e+02 4.693e+02, threshold=4.126e+02, percent-clipped=2.0 2023-10-03 12:59:24,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1275313.3333333333, ans=0.125 2023-10-03 12:59:25,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 12:59:25,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 12:59:28,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 12:59:30,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 12:59:30,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1275313.3333333333, ans=0.125 2023-10-03 12:59:31,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 12:59:31,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:59:33,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 12:59:34,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 12:59:34,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 12:59:34,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 12:59:41,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 12:59:44,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 12:59:44,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 12:59:45,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 12:59:47,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 12:59:48,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 12:59:48,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 12:59:50,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 12:59:51,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 12:59:59,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 12:59:59,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:00:01,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:00:02,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:00:02,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:00:05,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 13:00:07,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 13:00:07,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:00:08,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:00:08,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:00:08,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:00:08,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 13:00:08,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1275513.3333333333, ans=0.1 2023-10-03 13:00:09,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 13:00:11,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 13:00:12,020 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.37 vs. limit=12.0 2023-10-03 13:00:12,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:12,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:00:13,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1275513.3333333333, ans=0.125 2023-10-03 13:00:14,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 13:00:14,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 13:00:15,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:15,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:00:17,030 INFO [train.py:1046] (3/4) Epoch 37, batch 100, loss[loss=0.1666, simple_loss=0.2551, pruned_loss=0.03906, over 23956.00 frames. ], tot_loss[loss=0.1612, simple_loss=0.241, pruned_loss=0.0407, over 1870587.33 frames. ], batch size: 86, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:00:17,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:00:17,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:00:20,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:00:23,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:00:26,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:00:27,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 13:00:27,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:00:29,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:00:30,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:00:30,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:00:30,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:00:30,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:00:32,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 13:00:34,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:00:34,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:35,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:00:35,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:00:38,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 13:00:39,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:40,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:00:42,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:00:43,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:00:48,034 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 13:00:48,055 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 13:00:49,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:00:49,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:00:54,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1275713.3333333333, ans=0.125 2023-10-03 13:00:55,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:00:57,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:00:58,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:03,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:04,831 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 13:01:06,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 13:01:09,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:01:10,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:01:13,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:17,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:19,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:01:21,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:01:23,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:24,585 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.16 vs. limit=15.0 2023-10-03 13:01:25,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:01:25,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:25,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:01:26,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:27,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1275846.6666666667, ans=0.0 2023-10-03 13:01:28,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 13:01:28,268 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 13:01:28,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:28,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:01:29,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:29,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:01:29,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 13:01:31,006 INFO [train.py:1046] (3/4) Epoch 37, batch 150, loss[loss=0.1613, simple_loss=0.2308, pruned_loss=0.04588, over 23755.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2412, pruned_loss=0.04055, over 2516444.93 frames. ], batch size: 179, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:01:31,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 13:01:31,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 13:01:31,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:31,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1275913.3333333333, ans=0.1 2023-10-03 13:01:32,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:01:33,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:01:33,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:01:35,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:01:37,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:01:40,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:01:40,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:01:41,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:43,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:01:43,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:46,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:01:46,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:48,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 13:01:48,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 13:01:48,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 13:01:50,627 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.933e+02 2.120e+02 2.360e+02 3.157e+02, threshold=4.239e+02, percent-clipped=0.0 2023-10-03 13:01:52,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:01:52,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:01:54,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:01:55,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:01:55,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:01:57,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:57,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:01:59,938 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 13:02:01,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:02:05,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:02:08,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:02:09,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 13:02:10,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1276046.6666666667, ans=0.0 2023-10-03 13:02:11,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:02:13,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:02:13,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:02:15,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:02:16,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1276113.3333333333, ans=0.07 2023-10-03 13:02:17,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:02:18,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:02:19,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:02:19,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 13:02:24,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:02:25,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:02:25,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:02:25,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:02:29,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:02:29,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1276180.0, ans=0.0 2023-10-03 13:02:29,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1276180.0, ans=0.125 2023-10-03 13:02:31,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 13:02:34,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:02:35,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:02:35,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:02:37,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:02:37,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 13:02:39,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:02:39,188 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 13:02:39,436 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:02:42,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:02:45,197 INFO [train.py:1046] (3/4) Epoch 37, batch 200, loss[loss=0.1457, simple_loss=0.2233, pruned_loss=0.0341, over 24627.00 frames. ], tot_loss[loss=0.1629, simple_loss=0.2425, pruned_loss=0.04167, over 2991714.81 frames. ], batch size: 60, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:02:45,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:02:46,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:02:49,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 13:02:50,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:02:50,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:02:52,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1276246.6666666667, ans=0.125 2023-10-03 13:02:54,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 13:02:55,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:02:56,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:02:58,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:01,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:03:01,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:03:01,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:03:04,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1276313.3333333333, ans=0.125 2023-10-03 13:03:18,291 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.25 vs. limit=22.5 2023-10-03 13:03:20,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:03:21,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:03:21,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:03:22,229 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.54 vs. limit=15.0 2023-10-03 13:03:22,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:03:24,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:03:24,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:03:24,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:25,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:03:26,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:03:26,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:03:27,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 13:03:27,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 13:03:29,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:03:33,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:03:37,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:03:44,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:44,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:03:50,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:03:52,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 13:03:54,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:03:54,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:03:54,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:03:54,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:03:55,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 13:03:56,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:03:57,753 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 13:03:59,125 INFO [train.py:1046] (3/4) Epoch 37, batch 250, loss[loss=0.1407, simple_loss=0.2048, pruned_loss=0.0383, over 22618.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.241, pruned_loss=0.04083, over 3377311.26 frames. ], batch size: 322, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:03:59,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:04:00,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:04:02,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:04:03,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:04:06,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:04:06,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:04:07,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:04:09,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:04:19,365 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.868e+02 1.988e+02 2.171e+02 2.742e+02, threshold=3.975e+02, percent-clipped=0.0 2023-10-03 13:04:20,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:04:23,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:04:23,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:04:30,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:04:31,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:04:31,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:04:32,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:04:34,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:04:34,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:04:35,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:04:37,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:04:39,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 13:04:39,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:04:39,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:04:41,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:04:41,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:04:41,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:04:41,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:04:41,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:04:44,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:04:44,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:04:44,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:04:48,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:04:52,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:04:56,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:05:00,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:05:01,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1276846.6666666667, ans=0.125 2023-10-03 13:05:02,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:05:06,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 13:05:08,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:05:08,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:05:09,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 13:05:09,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:05:11,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:05:11,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 13:05:12,715 INFO [train.py:1046] (3/4) Epoch 37, batch 300, loss[loss=0.137, simple_loss=0.2028, pruned_loss=0.03554, over 22798.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2391, pruned_loss=0.04027, over 3668929.46 frames. ], batch size: 322, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:05:16,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:05:16,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:05:20,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:05:20,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 13:05:22,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:05:23,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:05:23,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 13:05:23,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:05:25,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:05:29,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:05:29,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 13:05:32,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 13:05:34,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:05:36,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:05:38,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:05:38,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 13:05:38,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:05:39,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:05:43,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:05:43,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:05:46,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1277046.6666666667, ans=0.1 2023-10-03 13:05:47,412 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.45 vs. limit=15.0 2023-10-03 13:05:49,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 13:05:49,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 13:05:50,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:05:51,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:05:54,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 13:05:54,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:05:59,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:06:01,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:06:01,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 13:06:02,955 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.09 vs. limit=15.0 2023-10-03 13:06:06,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:06:06,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:06:09,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:06:10,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:06:10,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 13:06:10,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:06:10,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1277180.0, ans=0.125 2023-10-03 13:06:12,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:06:13,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 13:06:15,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:06:15,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:16,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:06:18,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:06:18,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:21,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1277180.0, ans=0.05 2023-10-03 13:06:25,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:06:25,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 13:06:26,766 INFO [train.py:1046] (3/4) Epoch 37, batch 350, loss[loss=0.1665, simple_loss=0.2552, pruned_loss=0.03893, over 24409.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2376, pruned_loss=0.03988, over 3901356.66 frames. ], batch size: 69, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:06:26,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:33,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:06:36,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:06:36,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:39,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 13:06:41,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:06:41,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 13:06:44,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:06:46,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 13:06:48,106 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.877e+02 2.152e+02 2.421e+02 3.801e+02, threshold=4.304e+02, percent-clipped=0.0 2023-10-03 13:06:48,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:06:49,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 13:06:51,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:06:52,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:06:52,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1277313.3333333333, ans=0.04949747468305833 2023-10-03 13:06:53,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:06:55,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:06:55,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:06:55,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:06:56,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:06:56,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:06:57,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:06:57,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:07:06,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:07:06,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:07:06,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:07:06,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:07:09,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1277380.0, ans=0.125 2023-10-03 13:07:11,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 13:07:11,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:07:14,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:07:14,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:07:14,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:07:16,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 13:07:18,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:19,692 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 13:07:19,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1277446.6666666667, ans=0.125 2023-10-03 13:07:21,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 13:07:21,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:07:24,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:07:24,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 13:07:28,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:30,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:07:32,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:07:32,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:32,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:07:35,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:07:35,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1277513.3333333333, ans=0.0 2023-10-03 13:07:39,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:07:40,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:07:41,934 INFO [train.py:1046] (3/4) Epoch 37, batch 400, loss[loss=0.1514, simple_loss=0.2408, pruned_loss=0.03096, over 24500.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2372, pruned_loss=0.0397, over 4088784.03 frames. ], batch size: 66, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:07:42,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 13:07:42,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:07:42,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:07:43,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:07:44,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:47,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:07:49,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:50,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 13:07:52,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 13:07:52,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:07:53,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 13:07:53,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:57,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1277646.6666666667, ans=0.1 2023-10-03 13:07:59,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:07:59,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:07:59,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 13:07:59,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:07:59,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:07:59,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:08:01,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:08:03,837 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 13:08:03,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 13:08:08,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:08:09,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:08:09,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 13:08:09,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 13:08:09,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1277646.6666666667, ans=0.07 2023-10-03 13:08:12,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:08:13,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:08:22,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 13:08:25,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:08:25,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 13:08:26,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1277780.0, ans=0.04949747468305833 2023-10-03 13:08:27,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:08:30,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:08:30,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 13:08:34,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:08:37,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:08:38,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1277780.0, ans=0.0 2023-10-03 13:08:39,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:08:41,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:08:41,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 13:08:44,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 13:08:45,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 13:08:47,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:08:47,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:08:51,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 13:08:51,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1277846.6666666667, ans=0.0 2023-10-03 13:08:52,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:08:54,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:08:54,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:08:55,623 INFO [train.py:1046] (3/4) Epoch 37, batch 450, loss[loss=0.1721, simple_loss=0.2437, pruned_loss=0.05029, over 23344.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2385, pruned_loss=0.03984, over 4227554.56 frames. ], batch size: 285, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:08:55,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 13:08:55,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:08:57,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:08:57,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:08:57,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 13:08:58,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:09:00,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:09:00,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1277913.3333333333, ans=0.025 2023-10-03 13:09:01,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:09:08,431 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.08 vs. limit=10.0 2023-10-03 13:09:09,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:09:10,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:09:11,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.77 vs. limit=15.0 2023-10-03 13:09:11,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 13:09:11,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 13:09:15,842 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 1.823e+02 2.027e+02 2.290e+02 3.468e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-03 13:09:16,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1277980.0, ans=0.0 2023-10-03 13:09:17,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:09:17,727 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-10-03 13:09:20,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:09:21,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:09:27,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:09:27,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:09:28,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 13:09:30,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 13:09:31,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 13:09:31,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:09:32,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:09:33,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:09:35,577 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 13:09:35,591 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 13:09:36,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:09:37,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:09:39,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 13:09:42,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:09:42,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:09:43,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 13:09:44,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 13:09:47,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:09:49,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:09:50,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:09:51,038 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.32 vs. limit=15.0 2023-10-03 13:09:51,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=15.0 2023-10-03 13:09:52,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 13:09:53,147 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.14 vs. limit=15.0 2023-10-03 13:09:55,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:09:56,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 13:09:57,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 13:09:59,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:10:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:10:06,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:10:08,437 INFO [train.py:1046] (3/4) Epoch 37, batch 500, loss[loss=0.1563, simple_loss=0.2437, pruned_loss=0.03444, over 24672.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2393, pruned_loss=0.03995, over 4334172.47 frames. ], batch size: 73, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:10:08,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:10:08,532 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 13:10:12,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:10:12,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:10:12,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:10:12,610 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 13:10:15,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 13:10:15,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:10:18,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 13:10:20,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.13 vs. limit=15.0 2023-10-03 13:10:22,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 13:10:23,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:10:27,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:10:27,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:10:27,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:29,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1278313.3333333333, ans=0.125 2023-10-03 13:10:35,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:10:35,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:10:37,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:10:37,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:10:37,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 13:10:37,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:10:37,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1278380.0, ans=0.125 2023-10-03 13:10:37,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1278380.0, ans=0.125 2023-10-03 13:10:41,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:10:42,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:10:42,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:10:42,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:10:43,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 13:10:45,214 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 13:10:47,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:10:48,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1278380.0, ans=0.0 2023-10-03 13:10:49,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:50,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:50,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:10:50,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:10:53,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 13:10:56,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:10:56,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:10:56,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1278446.6666666667, ans=0.07 2023-10-03 13:11:00,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:11:02,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1278446.6666666667, ans=0.125 2023-10-03 13:11:03,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:11:08,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:11:11,200 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.88 vs. limit=15.0 2023-10-03 13:11:11,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 13:11:11,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:11,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:11:15,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 13:11:15,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:11:17,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:21,325 INFO [train.py:1046] (3/4) Epoch 37, batch 550, loss[loss=0.147, simple_loss=0.2208, pruned_loss=0.03657, over 24601.00 frames. ], tot_loss[loss=0.1611, simple_loss=0.2405, pruned_loss=0.04085, over 4413003.75 frames. ], batch size: 60, lr: 2.77e-03, grad_scale: 32.0 2023-10-03 13:11:22,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 13:11:22,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1278580.0, ans=0.1 2023-10-03 13:11:24,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 13:11:24,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:11:24,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 13:11:26,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:11:26,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:11:26,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:26,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:26,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:11:28,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:11:29,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:11:30,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 13:11:30,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:11:35,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:11:35,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1278646.6666666667, ans=0.1 2023-10-03 13:11:36,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:38,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1278646.6666666667, ans=0.05 2023-10-03 13:11:41,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:11:41,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:42,925 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.845e+02 2.021e+02 2.216e+02 2.763e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 13:11:45,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 13:11:47,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 13:11:48,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:11:48,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1278646.6666666667, ans=0.0 2023-10-03 13:11:51,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:11:52,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:11:54,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:11:57,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:11:57,569 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 13:11:58,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:11:58,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:12:01,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:12:02,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:12:03,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:12:04,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:05,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 13:12:06,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 13:12:07,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:12:07,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:12:09,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:12:09,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:12:10,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:12:12,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:12:15,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:12:16,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:16,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 13:12:18,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:12:19,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:12:19,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:12:21,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:21,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 13:12:21,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 13:12:28,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 13:12:32,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 13:12:34,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:12:34,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:12:35,439 INFO [train.py:1046] (3/4) Epoch 37, batch 600, loss[loss=0.1423, simple_loss=0.2229, pruned_loss=0.03086, over 24594.00 frames. ], tot_loss[loss=0.1609, simple_loss=0.2405, pruned_loss=0.04067, over 4478387.55 frames. ], batch size: 60, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:12:35,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:12:37,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1278913.3333333333, ans=0.2 2023-10-03 13:12:42,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:12:43,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:12:43,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1278913.3333333333, ans=0.125 2023-10-03 13:12:45,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 13:12:47,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:12:47,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:12:50,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:12:50,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1278980.0, ans=0.2 2023-10-03 13:12:51,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 13:12:52,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:12:58,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 13:13:00,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:13:00,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:13:02,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:13:05,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1279046.6666666667, ans=0.0 2023-10-03 13:13:07,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:13:07,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:13:07,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:13:16,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:13:21,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:13:23,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:13:23,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:13:30,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 13:13:34,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:13:34,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:13:34,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1279180.0, ans=0.125 2023-10-03 13:13:38,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 13:13:38,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:13:42,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 13:13:42,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:13:44,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:13:49,931 INFO [train.py:1046] (3/4) Epoch 37, batch 650, loss[loss=0.1425, simple_loss=0.2018, pruned_loss=0.0416, over 22595.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2383, pruned_loss=0.04015, over 4537079.91 frames. ], batch size: 322, lr: 2.77e-03, grad_scale: 8.0 2023-10-03 13:13:50,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 13:13:52,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:13:54,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:13:55,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:13:56,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:00,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 13:14:00,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:14:04,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:14:04,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:14:07,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1279313.3333333333, ans=0.2 2023-10-03 13:14:08,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:12,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 13:14:13,240 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.868e+02 2.019e+02 2.279e+02 4.165e+02, threshold=4.037e+02, percent-clipped=1.0 2023-10-03 13:14:13,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:14:15,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:14:18,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:14:18,984 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.61 vs. limit=15.0 2023-10-03 13:14:19,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 13:14:21,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:22,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:22,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:14:23,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:25,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:14:26,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:14:26,564 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 13:14:26,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:28,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:14:31,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:31,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:14:32,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:14:32,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:14:33,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 13:14:35,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:14:35,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:14:35,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:14:35,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:14:36,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:14:38,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 13:14:41,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 13:14:41,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:41,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:14:41,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:14:42,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:14:44,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:14:48,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:14:50,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:14:51,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:14:54,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:14:54,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:14:55,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:15:02,711 INFO [train.py:1046] (3/4) Epoch 37, batch 700, loss[loss=0.1442, simple_loss=0.2301, pruned_loss=0.02918, over 24354.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2371, pruned_loss=0.03995, over 4578086.67 frames. ], batch size: 61, lr: 2.77e-03, grad_scale: 8.0 2023-10-03 13:15:02,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:15:02,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:15:02,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:15:04,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:15:07,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 13:15:07,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 13:15:09,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 13:15:11,144 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.89 vs. limit=15.0 2023-10-03 13:15:11,190 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.35 vs. limit=10.0 2023-10-03 13:15:11,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:15:13,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:15:14,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 13:15:18,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:15:20,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:15:23,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:15:23,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:15:25,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:15:28,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:15:31,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 13:15:31,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:15:33,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 13:15:34,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 13:15:38,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:15:38,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:15:39,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:15:43,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1279713.3333333333, ans=0.2 2023-10-03 13:15:44,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:15:45,176 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.49 vs. limit=15.0 2023-10-03 13:15:45,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 13:15:51,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:15:51,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:15:51,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 13:15:55,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:15:57,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:16:00,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:16:06,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:16:06,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 13:16:07,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1279846.6666666667, ans=0.125 2023-10-03 13:16:07,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1279846.6666666667, ans=0.1 2023-10-03 13:16:10,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 13:16:10,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 13:16:12,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:16:15,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:16:15,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:16:16,828 INFO [train.py:1046] (3/4) Epoch 37, batch 750, loss[loss=0.1504, simple_loss=0.2324, pruned_loss=0.03421, over 24475.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2362, pruned_loss=0.03948, over 4611622.91 frames. ], batch size: 63, lr: 2.77e-03, grad_scale: 8.0 2023-10-03 13:16:16,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:16:16,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 13:16:21,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 13:16:21,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 13:16:21,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 13:16:22,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 13:16:22,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 13:16:24,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:16:24,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 13:16:25,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:16:25,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1279913.3333333333, ans=0.0 2023-10-03 13:16:26,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:16:28,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:16:28,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1279913.3333333333, ans=0.2 2023-10-03 13:16:31,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:16:31,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:16:32,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:16:38,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:16:38,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:16:40,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:16:42,003 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.819e+02 1.975e+02 2.168e+02 3.086e+02, threshold=3.950e+02, percent-clipped=0.0 2023-10-03 13:16:42,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:16:42,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:16:43,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 13:16:44,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:16:44,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:16:48,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:16:48,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1280046.6666666667, ans=0.035 2023-10-03 13:16:50,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:16:50,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 13:16:50,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:16:51,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 13:16:51,634 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 13:16:53,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 13:16:53,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:16:53,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:16:54,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:17:01,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:17:01,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:01,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:17:05,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:17:06,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:07,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 13:17:08,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:17:08,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 13:17:09,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:17:10,144 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.56 vs. limit=10.0 2023-10-03 13:17:12,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1280113.3333333333, ans=0.125 2023-10-03 13:17:13,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:17:13,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 13:17:14,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:19,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:17:21,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:17:21,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:17:22,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:17:26,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 13:17:26,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:17:27,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:17:30,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:17:30,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:17:32,785 INFO [train.py:1046] (3/4) Epoch 37, batch 800, loss[loss=0.1585, simple_loss=0.2392, pruned_loss=0.03888, over 23507.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2381, pruned_loss=0.04012, over 4636064.74 frames. ], batch size: 134, lr: 2.77e-03, grad_scale: 16.0 2023-10-03 13:17:32,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:32,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:17:41,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:17:41,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:42,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1280246.6666666667, ans=0.025 2023-10-03 13:17:43,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:17:43,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:17:44,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:45,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:17:47,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:17:50,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:17:50,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:17:55,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 13:17:55,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:17:56,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:17:56,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:17:56,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:17:56,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 13:17:58,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:17:58,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 13:18:01,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:18:03,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:18:05,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:18:07,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:18:08,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:18:08,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:18:12,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:18:12,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:18:13,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 13:18:16,126 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 13:18:16,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 13:18:16,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:18:16,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:18:17,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:18:18,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:18:19,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1280446.6666666667, ans=0.0 2023-10-03 13:18:24,859 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 13:18:24,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 13:18:26,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:18:27,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:18:30,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:18:30,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1280513.3333333333, ans=0.1 2023-10-03 13:18:34,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:18:35,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 13:18:35,620 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:18:36,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:18:39,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 13:18:46,227 INFO [train.py:1046] (3/4) Epoch 37, batch 850, loss[loss=0.1566, simple_loss=0.2306, pruned_loss=0.0413, over 23772.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2389, pruned_loss=0.0403, over 4656265.48 frames. ], batch size: 212, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:18:46,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:18:46,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1280580.0, ans=0.125 2023-10-03 13:18:48,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:18:49,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 13:18:49,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:18:52,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:18:52,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 13:18:52,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:18:54,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:18:55,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:18:57,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:18:57,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1280580.0, ans=0.2 2023-10-03 13:18:58,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:18:59,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 13:19:01,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 13:19:01,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 13:19:02,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:19:02,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:19:05,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:05,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:19:05,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1280646.6666666667, ans=0.09899494936611666 2023-10-03 13:19:07,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:19:10,505 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.795e+02 1.931e+02 2.229e+02 3.193e+02, threshold=3.862e+02, percent-clipped=0.0 2023-10-03 13:19:10,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:19:12,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:19:12,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 13:19:13,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1280646.6666666667, ans=0.0 2023-10-03 13:19:16,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 13:19:20,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:19:20,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 13:19:23,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 13:19:25,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 13:19:27,199 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 13:19:27,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:19:27,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:19:27,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:19:31,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:31,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:31,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 13:19:32,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:19:34,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:19:35,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:19:36,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:19:37,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:19:39,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:19:40,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 13:19:42,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:19:42,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:19:43,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:19:43,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:19:44,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:19:45,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1280846.6666666667, ans=0.125 2023-10-03 13:19:46,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:19:48,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:19:49,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:19:51,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:19:52,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:19:59,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:20:01,128 INFO [train.py:1046] (3/4) Epoch 37, batch 900, loss[loss=0.1634, simple_loss=0.2381, pruned_loss=0.04437, over 23689.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2399, pruned_loss=0.04069, over 4670662.40 frames. ], batch size: 164, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:20:01,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:20:01,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 13:20:01,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:20:01,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:20:02,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 13:20:09,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:20:12,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1280913.3333333333, ans=0.2 2023-10-03 13:20:13,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:20:13,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 13:20:16,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:20:17,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 13:20:17,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 13:20:17,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:20:17,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:20:19,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:20:19,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:20:28,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1280980.0, ans=0.125 2023-10-03 13:20:28,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1280980.0, ans=0.125 2023-10-03 13:20:31,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:20:31,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:20:31,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:20:32,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:20:37,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 13:20:40,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:20:44,099 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.07 vs. limit=12.0 2023-10-03 13:20:44,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:20:45,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:20:46,030 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 13:20:47,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 13:20:53,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:20:53,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:20:53,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:20:56,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1281113.3333333333, ans=0.09899494936611666 2023-10-03 13:21:00,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:00,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:21:01,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1281180.0, ans=0.0 2023-10-03 13:21:02,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 13:21:02,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:21:03,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 13:21:05,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:21:05,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:06,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:21:06,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:21:08,786 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.61 vs. limit=15.0 2023-10-03 13:21:11,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 13:21:11,610 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 13:21:11,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 13:21:11,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 13:21:15,633 INFO [train.py:1046] (3/4) Epoch 37, batch 950, loss[loss=0.144, simple_loss=0.2229, pruned_loss=0.03252, over 24355.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2403, pruned_loss=0.04065, over 4676602.19 frames. ], batch size: 61, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:21:15,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:19,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 13:21:22,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:21:25,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:21:25,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:21:25,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 13:21:28,968 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 13:21:31,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:21:31,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:21:33,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:21:34,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:21:34,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 13:21:36,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 13:21:36,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:21:37,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1281313.3333333333, ans=0.1 2023-10-03 13:21:38,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 13:21:38,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:21:39,979 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.894e+02 2.023e+02 2.225e+02 2.799e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-03 13:21:44,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:21:44,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:21:44,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:21:45,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 13:21:47,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 13:21:48,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:21:51,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:21:56,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:21:56,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:22:00,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 13:22:02,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 13:22:02,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:22:03,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:22:03,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:22:03,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:22:06,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 13:22:06,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:22:08,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1281446.6666666667, ans=0.125 2023-10-03 13:22:10,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:22:10,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:22:10,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 13:22:10,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:22:10,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:22:11,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 13:22:14,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:22:17,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:22:21,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:22:22,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 13:22:22,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 13:22:26,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1281513.3333333333, ans=0.2 2023-10-03 13:22:26,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1281513.3333333333, ans=10.0 2023-10-03 13:22:27,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:22:30,584 INFO [train.py:1046] (3/4) Epoch 37, batch 1000, loss[loss=0.1467, simple_loss=0.222, pruned_loss=0.03569, over 23610.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2392, pruned_loss=0.04013, over 4694370.12 frames. ], batch size: 135, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:22:30,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 13:22:31,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:22:38,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:22:39,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 13:22:39,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 13:22:44,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:22:44,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:22:45,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:22:46,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 13:22:51,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 13:22:52,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 13:22:52,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:22:55,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 13:22:56,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 13:22:57,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 13:22:59,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:22:59,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:08,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:23:09,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:23:10,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:11,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:23:11,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 13:23:11,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:23:11,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:23:11,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:23:13,207 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 13:23:18,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 13:23:18,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 13:23:19,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 13:23:20,244 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.96 vs. limit=6.0 2023-10-03 13:23:22,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:23:28,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:28,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:23:30,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:31,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:23:31,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 13:23:34,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:23:34,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 13:23:35,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 13:23:35,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:23:37,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:23:38,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:23:39,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1281846.6666666667, ans=0.1 2023-10-03 13:23:41,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:23:44,298 INFO [train.py:1046] (3/4) Epoch 37, batch 1050, loss[loss=0.1255, simple_loss=0.1774, pruned_loss=0.0368, over 18955.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2371, pruned_loss=0.03958, over 4688415.58 frames. ], batch size: 388, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:23:44,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:23:47,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:23:47,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:23:49,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:23:49,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:23:50,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:23:53,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:23:56,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:23:58,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:23:58,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:23:59,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:24:00,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:24:01,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 13:24:01,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1281980.0, ans=0.125 2023-10-03 13:24:02,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:24:02,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 13:24:05,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:24:05,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 13:24:05,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:24:05,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1281980.0, ans=0.125 2023-10-03 13:24:08,990 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.790e+02 1.943e+02 2.142e+02 2.975e+02, threshold=3.887e+02, percent-clipped=0.0 2023-10-03 13:24:13,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:24:14,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:24:14,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:24:16,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 13:24:16,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 13:24:16,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:24:19,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 13:24:22,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 13:24:22,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:24:25,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 13:24:26,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 13:24:28,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:24:28,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:24:32,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:24:37,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 13:24:38,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 13:24:38,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 13:24:38,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:24:38,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:24:40,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 13:24:44,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:24:45,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:24:45,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:24:47,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:24:47,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:24:47,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1282180.0, ans=0.0 2023-10-03 13:24:49,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1282180.0, ans=0.125 2023-10-03 13:24:50,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:24:50,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 13:24:53,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:24:53,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 13:24:53,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 13:24:54,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:24:58,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:24:59,840 INFO [train.py:1046] (3/4) Epoch 37, batch 1100, loss[loss=0.1817, simple_loss=0.2632, pruned_loss=0.0501, over 23391.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2369, pruned_loss=0.03941, over 4699503.92 frames. ], batch size: 93, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:25:02,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:25:07,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:25:08,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:25:09,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:25:09,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 13:25:11,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:25:14,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:25:15,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:25:18,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:25:18,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 13:25:18,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:25:20,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:25:20,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:25:23,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:25:24,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:25:31,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:25:31,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1282380.0, ans=0.125 2023-10-03 13:25:33,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 13:25:35,244 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 13:25:35,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:25:37,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:25:39,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:25:39,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:25:41,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 13:25:42,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:25:42,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:25:42,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:25:42,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:25:43,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 13:25:49,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:25:49,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 13:25:52,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:25:56,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:25:59,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 13:25:59,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 13:26:02,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:26:03,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:26:03,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:26:04,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1282513.3333333333, ans=0.1 2023-10-03 13:26:05,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 13:26:06,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:26:06,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:26:07,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 13:26:09,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:26:09,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 13:26:11,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:26:11,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:26:12,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:26:13,800 INFO [train.py:1046] (3/4) Epoch 37, batch 1150, loss[loss=0.155, simple_loss=0.2457, pruned_loss=0.03209, over 24475.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2378, pruned_loss=0.03967, over 4693504.28 frames. ], batch size: 66, lr: 2.76e-03, grad_scale: 8.0 2023-10-03 13:26:15,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:26:15,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1282580.0, ans=0.0 2023-10-03 13:26:16,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:26:18,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:26:18,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:26:19,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 13:26:19,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:26:21,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 13:26:21,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1282580.0, ans=0.2 2023-10-03 13:26:22,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:26:22,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:26:27,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 13:26:30,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:26:34,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:26:34,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:26:35,695 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-10-03 13:26:36,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 13:26:36,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:26:36,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:26:39,413 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.859e+02 2.016e+02 2.195e+02 3.712e+02, threshold=4.032e+02, percent-clipped=0.0 2023-10-03 13:26:42,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 13:26:43,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:26:43,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:26:52,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:26:59,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:27:00,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 13:27:00,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:00,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:00,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1282780.0, ans=0.0 2023-10-03 13:27:07,436 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 13:27:10,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:16,259 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 13:27:17,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:27:19,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:27:19,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:27:19,802 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.19 vs. limit=12.0 2023-10-03 13:27:20,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:27:22,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:27:26,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:27:26,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:27:28,282 INFO [train.py:1046] (3/4) Epoch 37, batch 1200, loss[loss=0.2164, simple_loss=0.2814, pruned_loss=0.07572, over 19256.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2388, pruned_loss=0.04003, over 4685902.39 frames. ], batch size: 388, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:27:28,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1282913.3333333333, ans=0.125 2023-10-03 13:27:29,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:27:29,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:27:30,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:27:33,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:27:34,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:27:35,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:27:35,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:27:38,746 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 13:27:41,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 13:27:44,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:27:45,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1282980.0, ans=0.125 2023-10-03 13:27:47,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:27:50,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:27:51,090 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.91 vs. limit=15.0 2023-10-03 13:27:51,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:27:51,736 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 13:27:51,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:28:01,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:28:01,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:28:01,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 13:28:03,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:28:04,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1283046.6666666667, ans=0.0 2023-10-03 13:28:05,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 13:28:09,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 13:28:09,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:28:10,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:28:10,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:28:12,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:28:15,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:28:15,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:28:16,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:28:16,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 13:28:16,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:28:18,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:28:18,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:28:21,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:28:21,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:28:23,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 13:28:24,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.56 vs. limit=6.0 2023-10-03 13:28:26,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:28:29,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 13:28:31,761 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 13:28:35,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:28:36,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:28:37,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:28:39,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:28:42,544 INFO [train.py:1046] (3/4) Epoch 37, batch 1250, loss[loss=0.1408, simple_loss=0.2288, pruned_loss=0.02639, over 24298.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2398, pruned_loss=0.04036, over 4692756.76 frames. ], batch size: 61, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:28:42,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 13:28:45,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:28:46,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:28:46,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 13:28:49,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:28:49,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:28:52,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 13:28:52,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1283246.6666666667, ans=0.1 2023-10-03 13:28:53,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:28:53,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:28:55,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:28:56,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:28:59,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 13:28:59,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:29:01,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:29:01,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:29:03,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:03,740 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=3.87 vs. limit=10.0 2023-10-03 13:29:06,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:29:06,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:29:07,849 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.893e+02 2.075e+02 2.273e+02 3.186e+02, threshold=4.149e+02, percent-clipped=0.0 2023-10-03 13:29:08,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1283313.3333333333, ans=0.07 2023-10-03 13:29:11,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 13:29:12,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:29:15,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:29:16,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 13:29:16,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:29:16,901 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 13:29:18,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:18,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:21,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:29:23,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:29:23,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:29:25,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 13:29:25,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 13:29:25,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 13:29:28,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:29:29,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 13:29:29,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:29:31,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 13:29:31,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:29:33,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 13:29:33,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:29:33,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:29:34,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 13:29:34,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:29:36,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 13:29:39,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:29:40,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:29:42,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:29:46,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:29:46,675 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.00 vs. limit=15.0 2023-10-03 13:29:50,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:29:50,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 13:29:55,657 INFO [train.py:1046] (3/4) Epoch 37, batch 1300, loss[loss=0.1667, simple_loss=0.2513, pruned_loss=0.04107, over 24614.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2399, pruned_loss=0.04048, over 4704221.31 frames. ], batch size: 68, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:29:55,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:29:55,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 13:29:57,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:29:58,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:30:00,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:30:00,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 13:30:02,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1283580.0, ans=0.0 2023-10-03 13:30:05,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:30:05,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:30:07,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 13:30:12,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:30:14,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:30:15,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:30:17,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:30:18,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1283646.6666666667, ans=0.07 2023-10-03 13:30:19,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:30:20,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:30:21,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 13:30:21,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 13:30:25,842 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-10-03 13:30:28,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:30:28,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:30:29,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 13:30:29,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 13:30:31,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:30:32,392 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.33 vs. limit=15.0 2023-10-03 13:30:34,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:30:36,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 13:30:36,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:30:36,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 13:30:37,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:30:42,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:30:42,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:30:46,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 13:30:46,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 13:30:48,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 13:30:49,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1283780.0, ans=0.125 2023-10-03 13:30:52,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:30:54,431 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.41 vs. limit=15.0 2023-10-03 13:30:55,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 13:30:56,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:30:59,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1283846.6666666667, ans=0.0 2023-10-03 13:31:03,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 13:31:06,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:31:10,535 INFO [train.py:1046] (3/4) Epoch 37, batch 1350, loss[loss=0.1402, simple_loss=0.2066, pruned_loss=0.0369, over 22824.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2387, pruned_loss=0.04017, over 4713713.39 frames. ], batch size: 322, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:31:11,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:13,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:31:13,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:31:17,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:31:17,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:31:21,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:31:23,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 13:31:23,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:31:23,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:31:27,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 13:31:27,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:31:29,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:31:29,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 13:31:30,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 13:31:34,549 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.514e+02 1.947e+02 2.174e+02 2.484e+02 3.426e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-03 13:31:34,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 13:31:35,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:35,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 13:31:47,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1284046.6666666667, ans=0.125 2023-10-03 13:31:48,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:52,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1284113.3333333333, ans=0.1 2023-10-03 13:31:58,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:31:58,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:31:58,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 13:32:02,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:32:02,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 13:32:03,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:32:03,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:32:04,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1284113.3333333333, ans=0.0 2023-10-03 13:32:06,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:32:08,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1284180.0, ans=0.125 2023-10-03 13:32:09,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 13:32:09,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:32:18,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 13:32:19,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 13:32:19,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1284180.0, ans=0.125 2023-10-03 13:32:23,886 INFO [train.py:1046] (3/4) Epoch 37, batch 1400, loss[loss=0.1444, simple_loss=0.2344, pruned_loss=0.02714, over 24478.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2384, pruned_loss=0.0394, over 4726473.71 frames. ], batch size: 66, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:32:25,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 13:32:25,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1284246.6666666667, ans=0.125 2023-10-03 13:32:26,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:32:29,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:32:29,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:32:33,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 13:32:34,082 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:32:35,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 13:32:37,138 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.05 vs. limit=15.0 2023-10-03 13:32:44,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:32:46,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:32:50,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:32:50,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:32:53,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:32:54,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 13:33:01,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:01,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:02,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1284380.0, ans=0.125 2023-10-03 13:33:05,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 13:33:05,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:33:07,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:33:07,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:33:07,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:33:08,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:33:08,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:33:09,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:33:10,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 13:33:10,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:33:15,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:18,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:33:21,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1284513.3333333333, ans=0.0 2023-10-03 13:33:25,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 13:33:27,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 13:33:27,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:33:31,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 13:33:31,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:33:32,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:33:35,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:33:36,810 INFO [train.py:1046] (3/4) Epoch 37, batch 1450, loss[loss=0.1613, simple_loss=0.2308, pruned_loss=0.04592, over 22760.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2377, pruned_loss=0.03918, over 4723523.94 frames. ], batch size: 322, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:33:38,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:33:38,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:38,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 13:33:42,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1284580.0, ans=0.025 2023-10-03 13:33:43,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:33:45,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:33:48,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:33:48,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 13:33:50,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:33:51,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 13:33:51,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:53,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:33:53,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 13:33:54,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:33:54,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:33:55,344 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.64 vs. limit=6.0 2023-10-03 13:33:55,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 13:33:55,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:33:57,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:33:58,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:33:59,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:34:02,485 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.928e+02 2.172e+02 2.508e+02 3.657e+02, threshold=4.343e+02, percent-clipped=0.0 2023-10-03 13:34:04,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:34:04,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:34:05,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:34:05,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:34:09,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:34:09,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:34:09,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:34:09,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:34:13,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 13:34:17,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:34:20,533 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 13:34:21,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:34:23,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:34:23,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1284780.0, ans=0.125 2023-10-03 13:34:24,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:34:26,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 13:34:28,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:34:31,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 13:34:32,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 13:34:34,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:34:38,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:34:38,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:34:38,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 13:34:41,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 13:34:41,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 13:34:42,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:34:44,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:34:51,419 INFO [train.py:1046] (3/4) Epoch 37, batch 1500, loss[loss=0.1656, simple_loss=0.2508, pruned_loss=0.04016, over 24090.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2386, pruned_loss=0.03953, over 4721397.33 frames. ], batch size: 86, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:34:57,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 13:34:57,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:34:57,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:34:58,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:34:59,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:34:59,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:35:01,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 13:35:02,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:35:02,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:35:02,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:35:03,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:35:04,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:35:04,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1284980.0, ans=0.0 2023-10-03 13:35:05,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:10,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:10,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 13:35:12,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:35:12,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:35:12,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:35:15,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 13:35:15,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1284980.0, ans=0.125 2023-10-03 13:35:15,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1284980.0, ans=0.5 2023-10-03 13:35:20,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 13:35:20,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1285046.6666666667, ans=0.0 2023-10-03 13:35:21,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:35:21,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 13:35:24,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 13:35:25,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:35:27,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:35:27,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:35:29,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 13:35:29,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:35:29,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:35:29,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 13:35:31,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:35:35,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:35:35,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 13:35:39,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:35:41,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:35:45,602 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 13:35:47,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:47,499 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 13:35:49,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:35:49,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:35:49,548 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 13:35:49,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1285180.0, ans=0.1 2023-10-03 13:35:51,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:35:52,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 13:35:54,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:57,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:57,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:57,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1285180.0, ans=0.0 2023-10-03 13:35:58,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:35:58,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:35:58,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:35:59,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 13:35:59,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 13:36:01,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:36:01,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 13:36:02,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 13:36:03,875 INFO [train.py:1046] (3/4) Epoch 37, batch 1550, loss[loss=0.1718, simple_loss=0.2608, pruned_loss=0.04146, over 24667.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2396, pruned_loss=0.03976, over 4724908.13 frames. ], batch size: 73, lr: 2.76e-03, grad_scale: 8.0 2023-10-03 13:36:04,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:36:04,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1285246.6666666667, ans=0.05 2023-10-03 13:36:05,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:36:06,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:36:06,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:36:06,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1285246.6666666667, ans=0.125 2023-10-03 13:36:09,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:36:10,120 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.35 vs. limit=15.0 2023-10-03 13:36:10,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:36:12,051 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 13:36:13,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:13,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:36:13,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:36:18,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:36:18,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 13:36:21,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:36:21,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 13:36:21,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 13:36:21,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 13:36:23,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:25,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:36:27,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:36:29,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 13:36:29,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 13:36:30,489 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.843e+02 2.005e+02 2.170e+02 3.085e+02, threshold=4.010e+02, percent-clipped=0.0 2023-10-03 13:36:36,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:36:40,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:36:40,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:36:40,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:36:41,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 13:36:46,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 13:36:48,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:51,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:36:54,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:36:54,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:36:54,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 13:36:55,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:36:55,527 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.01 vs. limit=22.5 2023-10-03 13:36:56,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:36:56,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:36:57,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 13:36:57,768 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 13:37:00,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:37:04,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 13:37:11,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:37:11,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:37:12,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 13:37:13,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1285513.3333333333, ans=0.0 2023-10-03 13:37:14,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:37:14,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:37:14,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:37:15,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:37:16,942 INFO [train.py:1046] (3/4) Epoch 37, batch 1600, loss[loss=0.177, simple_loss=0.2509, pruned_loss=0.05158, over 22653.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2397, pruned_loss=0.03963, over 4735992.71 frames. ], batch size: 322, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:37:17,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:37:20,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:37:22,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 13:37:23,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 13:37:25,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 13:37:28,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:37:30,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 13:37:31,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:37:32,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:37:36,537 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.94 vs. limit=15.0 2023-10-03 13:37:38,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:37:40,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 13:37:42,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:37:44,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 13:37:44,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:37:44,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 13:37:44,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1285646.6666666667, ans=0.0 2023-10-03 13:37:48,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1285713.3333333333, ans=0.125 2023-10-03 13:37:49,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 13:37:58,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:38:00,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 13:38:00,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:38:00,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:38:00,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:38:03,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 13:38:05,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1285780.0, ans=0.2 2023-10-03 13:38:07,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 13:38:10,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:38:10,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:10,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:11,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:38:13,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:38:13,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:38:16,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:38:22,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:22,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:38:23,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 13:38:23,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:38:25,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 13:38:30,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:38:31,965 INFO [train.py:1046] (3/4) Epoch 37, batch 1650, loss[loss=0.1625, simple_loss=0.2485, pruned_loss=0.03825, over 24387.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2394, pruned_loss=0.03928, over 4737333.22 frames. ], batch size: 77, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:38:32,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:38:33,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:38:33,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 13:38:33,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 13:38:33,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 13:38:34,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 13:38:35,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1285913.3333333333, ans=0.2 2023-10-03 13:38:35,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1285913.3333333333, ans=0.1 2023-10-03 13:38:36,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1285913.3333333333, ans=0.0 2023-10-03 13:38:38,531 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.62 vs. limit=15.0 2023-10-03 13:38:39,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:38:40,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:38:40,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:38:41,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:38:43,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:38:44,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 13:38:44,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1285980.0, ans=0.125 2023-10-03 13:38:47,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:38:47,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:38:47,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:38:47,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:38:48,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 13:38:48,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 13:38:56,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:38:56,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:38:58,189 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.927e+02 2.064e+02 2.320e+02 3.499e+02, threshold=4.128e+02, percent-clipped=0.0 2023-10-03 13:39:01,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1286046.6666666667, ans=0.125 2023-10-03 13:39:06,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 13:39:08,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:11,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 13:39:12,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:39:14,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:39:15,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:39:15,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:39:16,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1286113.3333333333, ans=0.125 2023-10-03 13:39:18,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:39:18,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:19,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:39:19,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:20,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:39:22,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:39:22,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:39:22,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:39:26,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:39:27,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 13:39:29,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:39:30,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 13:39:32,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 13:39:34,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 13:39:34,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:39:34,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:39:34,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:39:34,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:39:34,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 13:39:37,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:39:38,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:39:39,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:39:42,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 13:39:45,111 INFO [train.py:1046] (3/4) Epoch 37, batch 1700, loss[loss=0.1543, simple_loss=0.2316, pruned_loss=0.03852, over 23509.00 frames. ], tot_loss[loss=0.159, simple_loss=0.239, pruned_loss=0.03944, over 4723389.27 frames. ], batch size: 134, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:39:46,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:39:46,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:39:47,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 13:39:49,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:39:49,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:39:49,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:39:50,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:39:50,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:39:52,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 13:39:53,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:40:02,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:40:04,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:40:09,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:40:10,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:40:10,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:40:11,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:40:13,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 13:40:15,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:40:15,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:17,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:40:17,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:40:20,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 13:40:20,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 13:40:20,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1286380.0, ans=0.125 2023-10-03 13:40:21,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:22,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 13:40:23,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:40:31,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:40:32,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:40:34,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:40:35,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:40:35,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 13:40:35,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:40:37,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:37,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 13:40:38,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:40:38,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:40:38,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:40:38,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:40:41,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:40:41,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:40:42,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:40:42,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:40:44,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:40:48,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:40:49,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 13:40:51,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:40:51,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:40:52,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1286513.3333333333, ans=0.0 2023-10-03 13:40:54,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 13:40:58,887 INFO [train.py:1046] (3/4) Epoch 37, batch 1750, loss[loss=0.1568, simple_loss=0.2495, pruned_loss=0.03207, over 24427.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2376, pruned_loss=0.03898, over 4728465.94 frames. ], batch size: 69, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:41:00,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:02,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:41:02,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:41:02,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1286580.0, ans=0.125 2023-10-03 13:41:03,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 13:41:03,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:41:06,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:41:06,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:11,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 13:41:14,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:41:15,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 13:41:17,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:41:17,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:41:21,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 13:41:23,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 13:41:24,678 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.864e+02 2.125e+02 2.477e+02 3.687e+02, threshold=4.251e+02, percent-clipped=0.0 2023-10-03 13:41:24,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:41:24,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 13:41:29,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1286713.3333333333, ans=0.125 2023-10-03 13:41:34,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:41:35,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:41:37,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:41:40,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:40,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:41:41,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:41:43,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1286780.0, ans=0.125 2023-10-03 13:41:44,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:41:45,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:41:47,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:41:48,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 13:41:48,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1286780.0, ans=0.125 2023-10-03 13:41:49,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:41:52,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 13:41:52,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:41:54,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:41:55,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:41:56,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1286846.6666666667, ans=0.0 2023-10-03 13:41:59,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:42:00,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 13:42:00,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1286846.6666666667, ans=0.0 2023-10-03 13:42:01,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:42:03,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:42:07,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:42:10,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:42:10,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1286913.3333333333, ans=0.125 2023-10-03 13:42:12,023 INFO [train.py:1046] (3/4) Epoch 37, batch 1800, loss[loss=0.1598, simple_loss=0.2361, pruned_loss=0.04178, over 23371.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2372, pruned_loss=0.03862, over 4743805.46 frames. ], batch size: 134, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:42:12,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:42:12,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 13:42:12,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:42:12,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1286913.3333333333, ans=0.125 2023-10-03 13:42:14,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:42:14,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:14,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:42:14,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:42:16,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:42:19,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:42:19,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:42:20,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:42:23,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:42:24,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 13:42:26,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:42:29,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:42:30,048 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:42:32,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:32,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:34,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:42:35,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:42:37,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 13:42:37,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:42:37,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1286980.0, ans=0.125 2023-10-03 13:42:40,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:42:42,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 13:42:45,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 13:42:45,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 13:42:45,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:42:48,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:42:48,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:42:48,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:42:51,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1287046.6666666667, ans=0.2 2023-10-03 13:42:55,188 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 13:42:57,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:42:58,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:42:58,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 13:42:58,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 13:43:00,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:43:00,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:43:02,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:43:05,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 13:43:11,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:43:11,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 13:43:12,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:43:12,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:43:12,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:43:14,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 13:43:15,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:43:15,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:43:18,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 13:43:18,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:43:21,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:43:21,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:43:23,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:43:24,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:43:24,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:43:25,739 INFO [train.py:1046] (3/4) Epoch 37, batch 1850, loss[loss=0.1541, simple_loss=0.229, pruned_loss=0.03966, over 23647.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2376, pruned_loss=0.03864, over 4749318.68 frames. ], batch size: 232, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:43:25,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:43:25,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:43:29,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:43:30,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:43:32,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1287246.6666666667, ans=0.0 2023-10-03 13:43:36,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:43:38,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 13:43:40,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 13:43:43,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 13:43:43,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1287313.3333333333, ans=0.125 2023-10-03 13:43:45,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1287313.3333333333, ans=0.125 2023-10-03 13:43:46,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:43:46,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 13:43:46,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 13:43:47,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1287313.3333333333, ans=0.1 2023-10-03 13:43:52,037 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.939e+02 2.095e+02 2.336e+02 4.113e+02, threshold=4.190e+02, percent-clipped=0.0 2023-10-03 13:43:52,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1287313.3333333333, ans=0.0 2023-10-03 13:43:56,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:43:57,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 13:44:01,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:44:01,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:44:07,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 13:44:07,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:07,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:44:08,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:44:10,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:44:13,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1287446.6666666667, ans=0.0 2023-10-03 13:44:14,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:44:15,082 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.22 vs. limit=15.0 2023-10-03 13:44:17,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:44:17,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:17,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 13:44:18,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:44:19,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:44:20,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1287446.6666666667, ans=0.0 2023-10-03 13:44:21,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:44:23,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 13:44:24,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:44:28,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:44:30,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:44:30,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 13:44:30,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 13:44:32,324 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 13:44:33,669 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 13:44:35,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:44:35,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:44:35,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:44:35,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:36,914 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 13:44:36,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:44:36,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:38,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:44:39,624 INFO [train.py:1046] (3/4) Epoch 37, batch 1900, loss[loss=0.1527, simple_loss=0.2409, pruned_loss=0.03219, over 24643.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2392, pruned_loss=0.03911, over 4739060.79 frames. ], batch size: 68, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:44:39,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:44:39,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:44:39,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 13:44:41,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:44:41,206 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 13:44:41,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:44:41,884 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.56 vs. limit=15.0 2023-10-03 13:44:43,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:44:48,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:44:51,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:44:53,074 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 13:44:53,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 13:44:54,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:44:55,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:44:55,944 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 13:44:57,266 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 13:45:00,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 13:45:01,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:45:05,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 13:45:06,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 13:45:07,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1287646.6666666667, ans=0.0 2023-10-03 13:45:15,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 13:45:18,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 13:45:18,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:45:18,633 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 13:45:18,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 13:45:19,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 13:45:20,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 13:45:20,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:45:20,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.71 vs. limit=15.0 2023-10-03 13:45:23,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1287780.0, ans=0.0 2023-10-03 13:45:24,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 13:45:25,150 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.49 vs. limit=15.0 2023-10-03 13:45:27,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:45:29,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:45:29,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 13:45:31,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:45:34,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 13:45:34,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:45:39,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:45:39,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:45:39,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:45:41,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:45:41,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 13:45:42,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 13:45:44,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:45:45,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:45:45,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:45:49,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:45:49,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:45:49,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 13:45:50,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:45:54,086 INFO [train.py:1046] (3/4) Epoch 37, batch 1950, loss[loss=0.1691, simple_loss=0.2531, pruned_loss=0.04258, over 23730.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2397, pruned_loss=0.03987, over 4740130.70 frames. ], batch size: 85, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:45:54,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:45:56,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:45:56,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:45:56,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:45:58,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1287913.3333333333, ans=0.125 2023-10-03 13:45:59,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 13:45:59,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:46:01,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:01,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:05,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:46:05,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:05,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:07,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:46:10,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:46:11,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:46:11,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 13:46:11,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:16,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:16,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1287980.0, ans=0.0 2023-10-03 13:46:18,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:46:18,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:18,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 13:46:18,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 13:46:19,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:46:19,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:46:20,727 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.916e+02 2.137e+02 2.276e+02 3.150e+02, threshold=4.275e+02, percent-clipped=0.0 2023-10-03 13:46:20,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:24,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:46:26,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:46:31,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:46:35,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:46:36,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:46:36,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 13:46:36,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:46:41,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1288113.3333333333, ans=10.0 2023-10-03 13:46:43,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:46:44,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:46:44,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:46:52,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:53,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:56,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:46:57,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:46:59,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:46:59,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:47:00,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 13:47:00,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:47:02,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:47:03,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 13:47:05,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:47:08,073 INFO [train.py:1046] (3/4) Epoch 37, batch 2000, loss[loss=0.1655, simple_loss=0.2437, pruned_loss=0.04368, over 23345.00 frames. ], tot_loss[loss=0.1608, simple_loss=0.2407, pruned_loss=0.04045, over 4724234.52 frames. ], batch size: 119, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:47:10,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:47:12,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:47:12,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:47:15,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:47:15,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:19,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 13:47:20,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff3.min_abs, batch_count=1288246.6666666667, ans=0.2 2023-10-03 13:47:21,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 13:47:22,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:47:25,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 13:47:27,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 13:47:27,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:47:29,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:47:30,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 13:47:32,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:32,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1288313.3333333333, ans=0.1 2023-10-03 13:47:34,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:34,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:34,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 13:47:34,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:47:37,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 13:47:37,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:47:40,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:47:40,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 13:47:40,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:42,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:47:42,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:47:44,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 13:47:46,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 13:47:46,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:47:46,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:47:49,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1288380.0, ans=0.0 2023-10-03 13:47:52,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:54,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:47:54,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:47:54,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:47:56,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:47:57,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:58,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:47:58,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:47:59,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:47:59,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1288446.6666666667, ans=0.125 2023-10-03 13:48:01,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:48:03,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 13:48:09,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:48:09,316 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:48:10,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:13,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:13,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:48:16,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:19,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:48:19,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:20,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:48:21,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:48:22,346 INFO [train.py:1046] (3/4) Epoch 37, batch 2050, loss[loss=0.1632, simple_loss=0.2501, pruned_loss=0.03817, over 24372.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2396, pruned_loss=0.03994, over 4722174.26 frames. ], batch size: 77, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:48:22,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:24,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:25,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:48:27,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:31,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:48:33,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:48:34,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:48:34,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:48:36,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 13:48:36,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:48:37,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:48:39,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:48:46,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:48:46,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:49,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 13:48:49,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1288646.6666666667, ans=0.125 2023-10-03 13:48:50,795 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.876e+02 2.031e+02 2.404e+02 3.900e+02, threshold=4.063e+02, percent-clipped=0.0 2023-10-03 13:48:50,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:48:52,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 13:48:54,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:48:54,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1288713.3333333333, ans=0.0 2023-10-03 13:48:56,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:48:59,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:48:59,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1288713.3333333333, ans=0.0 2023-10-03 13:49:01,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:49:01,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:49:03,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:49:04,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:49:04,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1288713.3333333333, ans=0.125 2023-10-03 13:49:05,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:49:07,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:49:10,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 13:49:11,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:49:12,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:49:16,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:49:19,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1288780.0, ans=0.125 2023-10-03 13:49:21,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:49:23,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 13:49:29,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:49:29,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:49:32,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:49:33,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 13:49:36,702 INFO [train.py:1046] (3/4) Epoch 37, batch 2100, loss[loss=0.1744, simple_loss=0.2409, pruned_loss=0.054, over 23852.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2375, pruned_loss=0.03969, over 4718379.52 frames. ], batch size: 179, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:49:36,863 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 13:49:36,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:49:37,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1288913.3333333333, ans=0.125 2023-10-03 13:49:38,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:49:38,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:49:40,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:49:40,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 13:49:40,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 13:49:40,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1288913.3333333333, ans=0.05 2023-10-03 13:49:41,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:49:44,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.70 vs. limit=15.0 2023-10-03 13:49:46,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:49:46,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:49:48,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:49:48,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:49:48,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 13:49:50,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 13:49:50,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 13:49:50,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 13:49:51,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:49:51,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:49:51,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 13:49:53,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 13:49:59,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 13:49:59,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:50:01,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:50:01,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:50:05,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:50:05,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 13:50:05,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:05,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 13:50:08,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 13:50:08,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:08,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 13:50:08,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 13:50:10,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 13:50:11,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:50:14,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:50:16,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:50:16,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 13:50:17,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:50:18,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:18,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 13:50:18,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:18,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:20,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:50:20,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 13:50:21,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 13:50:23,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 13:50:27,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:50:30,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:50:30,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 13:50:35,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:37,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:50:38,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:50:38,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:50:38,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 13:50:39,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:50:41,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:50:41,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:50:42,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:50:42,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:50:43,867 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.89 vs. limit=15.0 2023-10-03 13:50:45,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 13:50:47,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 13:50:47,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:50:48,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:50:48,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:50:48,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:50:50,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:50:51,382 INFO [train.py:1046] (3/4) Epoch 37, batch 2150, loss[loss=0.1515, simple_loss=0.2347, pruned_loss=0.03413, over 24677.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2367, pruned_loss=0.03937, over 4711236.02 frames. ], batch size: 65, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:50:55,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 13:50:57,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:50:58,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:00,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:51:00,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:00,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:51:03,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:03,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:51:03,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:51:09,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:09,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 13:51:13,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:51:14,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:51:15,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:16,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:51:16,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:17,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:51:17,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:51:18,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:51:19,872 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.856e+02 2.047e+02 2.375e+02 3.292e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 13:51:19,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:51:20,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 13:51:21,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 13:51:22,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:23,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:51:24,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:51:25,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:51:26,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:51:26,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:51:30,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:51:30,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 13:51:30,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 13:51:30,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1289380.0, ans=0.125 2023-10-03 13:51:33,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:51:33,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:34,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:51:34,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 13:51:36,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:36,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:36,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 13:51:37,710 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.26 vs. limit=22.5 2023-10-03 13:51:39,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 13:51:39,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 13:51:39,518 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 13:51:39,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:40,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:51:41,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=1289446.6666666667, ans=0.95 2023-10-03 13:51:42,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 13:51:43,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:51:43,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 13:51:43,443 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 13:51:43,443 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 13:51:43,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 13:51:44,011 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.43 vs. limit=15.0 2023-10-03 13:51:44,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:44,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:51:44,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:51:46,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:47,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1289446.6666666667, ans=0.1 2023-10-03 13:51:48,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 13:51:50,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:51:50,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:51:53,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1289513.3333333333, ans=0.0 2023-10-03 13:51:59,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:51:59,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 13:52:04,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:52:06,060 INFO [train.py:1046] (3/4) Epoch 37, batch 2200, loss[loss=0.165, simple_loss=0.2516, pruned_loss=0.03918, over 24396.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2375, pruned_loss=0.03929, over 4729293.07 frames. ], batch size: 77, lr: 2.76e-03, grad_scale: 16.0 2023-10-03 13:52:09,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:52:09,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:52:09,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1289580.0, ans=0.0 2023-10-03 13:52:10,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:12,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 13:52:13,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:52:14,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:52:14,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 13:52:19,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 13:52:21,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 13:52:26,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 13:52:29,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:52:32,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:52:32,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 13:52:35,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:52:35,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 13:52:40,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 13:52:41,018 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.56 vs. limit=15.0 2023-10-03 13:52:41,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:52:41,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 13:52:44,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:52:44,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:52:45,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:52:47,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:50,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 13:52:51,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:52:51,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 13:52:54,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:54,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 13:52:54,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:52:57,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 13:52:57,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:52:57,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:52:57,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:52:58,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 13:53:00,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:53:01,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 13:53:05,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 13:53:05,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:53:07,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:53:09,766 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 13:53:11,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:53:12,373 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 13:53:13,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 13:53:13,808 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 13:53:14,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1289846.6666666667, ans=0.0 2023-10-03 13:53:16,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:53:16,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 13:53:19,477 INFO [train.py:1046] (3/4) Epoch 37, batch 2250, loss[loss=0.2092, simple_loss=0.2733, pruned_loss=0.07255, over 19453.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2381, pruned_loss=0.03936, over 4732955.12 frames. ], batch size: 389, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 13:53:19,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:53:21,611 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 13:53:24,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:53:26,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:53:29,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:53:31,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 13:53:31,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.32 vs. limit=6.0 2023-10-03 13:53:35,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:53:35,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1289980.0, ans=0.0 2023-10-03 13:53:37,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:53:38,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 13:53:39,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 13:53:39,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:53:39,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:53:41,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 13:53:43,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:53:43,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:53:45,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 13:53:46,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1289980.0, ans=0.2 2023-10-03 13:53:47,211 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.849e+02 1.952e+02 2.109e+02 2.912e+02, threshold=3.904e+02, percent-clipped=0.0 2023-10-03 13:53:48,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:53:50,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 13:53:50,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 13:53:52,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 13:53:53,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:53:56,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:53:59,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.47 vs. limit=15.0 2023-10-03 13:54:00,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:54:01,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:54:03,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:54:03,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:54:06,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:54:07,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:54:12,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:54:15,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 13:54:18,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 13:54:19,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 13:54:19,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:54:23,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1290180.0, ans=0.125 2023-10-03 13:54:25,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 13:54:28,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:54:28,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 13:54:28,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:54:28,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 13:54:31,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 13:54:32,564 INFO [train.py:1046] (3/4) Epoch 37, batch 2300, loss[loss=0.1764, simple_loss=0.2646, pruned_loss=0.0441, over 24025.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2383, pruned_loss=0.03943, over 4741628.37 frames. ], batch size: 80, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 13:54:34,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.03 vs. limit=22.5 2023-10-03 13:54:35,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:54:35,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:54:41,688 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:54:42,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:54:44,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:54:44,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1290246.6666666667, ans=0.1 2023-10-03 13:54:47,596 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 13:54:49,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:54:49,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1290313.3333333333, ans=0.125 2023-10-03 13:54:50,380 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.73 vs. limit=22.5 2023-10-03 13:54:53,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:54:53,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:54:54,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1290313.3333333333, ans=0.125 2023-10-03 13:54:55,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:54:55,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:54:55,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 13:54:55,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:54:57,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:54:59,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:55:01,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 13:55:04,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:55:06,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:55:13,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:55:13,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:55:18,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:55:21,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:55:24,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:55:25,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 13:55:25,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:55:25,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 13:55:29,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 13:55:29,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:55:30,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:55:30,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:55:30,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:55:31,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 13:55:31,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 13:55:32,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 13:55:32,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:55:32,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:55:34,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 13:55:38,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:55:41,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:55:47,311 INFO [train.py:1046] (3/4) Epoch 37, batch 2350, loss[loss=0.1556, simple_loss=0.2305, pruned_loss=0.04036, over 23322.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2393, pruned_loss=0.03989, over 4735895.73 frames. ], batch size: 119, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:55:47,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:55:48,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:55:48,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 13:55:50,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 13:55:50,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:55:50,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 13:55:52,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 13:55:56,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:55:57,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 13:56:03,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 13:56:05,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:56:08,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:56:08,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:56:08,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:56:09,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:56:11,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 13:56:14,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:56:17,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 13:56:18,750 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.950e+02 2.084e+02 2.408e+02 3.908e+02, threshold=4.167e+02, percent-clipped=1.0 2023-10-03 13:56:18,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:56:23,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:56:23,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 13:56:24,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 13:56:26,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 13:56:27,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 13:56:28,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:56:28,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:56:28,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:56:33,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 13:56:35,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 13:56:35,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:56:38,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:56:38,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:56:38,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.06 vs. limit=15.0 2023-10-03 13:56:39,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 13:56:41,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:56:44,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 13:56:44,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 13:56:48,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 13:56:51,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 13:56:52,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:56:52,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 13:56:52,953 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 13:56:54,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 13:56:56,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 13:56:58,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:57:01,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1290913.3333333333, ans=0.125 2023-10-03 13:57:01,900 INFO [train.py:1046] (3/4) Epoch 37, batch 2400, loss[loss=0.1583, simple_loss=0.217, pruned_loss=0.04975, over 19651.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2387, pruned_loss=0.03989, over 4722101.87 frames. ], batch size: 388, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:57:03,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:57:04,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:57:08,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 13:57:08,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 13:57:09,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 13:57:16,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 13:57:16,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:57:18,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 13:57:18,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:57:19,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:57:20,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 13:57:25,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:57:29,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 13:57:34,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 13:57:39,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 13:57:40,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:57:42,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:57:43,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1291046.6666666667, ans=0.125 2023-10-03 13:57:46,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:57:46,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 13:57:46,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1291113.3333333333, ans=0.125 2023-10-03 13:57:47,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 13:57:53,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:57:56,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:57:59,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:01,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 13:58:01,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 13:58:01,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:58:01,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:58:02,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:58:02,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 13:58:04,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:58:06,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 13:58:06,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 13:58:07,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 13:58:09,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:58:10,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:58:10,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 13:58:11,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 13:58:11,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 13:58:11,936 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 13:58:13,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 13:58:13,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1291180.0, ans=0.125 2023-10-03 13:58:14,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 13:58:14,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:58:14,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:58:14,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1291246.6666666667, ans=0.1 2023-10-03 13:58:16,063 INFO [train.py:1046] (3/4) Epoch 37, batch 2450, loss[loss=0.1589, simple_loss=0.2334, pruned_loss=0.0422, over 23859.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2377, pruned_loss=0.0394, over 4718291.96 frames. ], batch size: 212, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:58:16,169 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 13:58:17,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:58:17,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 13:58:21,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 13:58:22,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:58:22,264 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:58:26,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:26,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:58:26,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 13:58:32,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:58:32,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:37,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 13:58:37,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 13:58:37,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 13:58:38,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 13:58:41,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:58:43,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 13:58:44,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 13:58:47,042 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.840e+02 2.041e+02 2.225e+02 3.110e+02, threshold=4.081e+02, percent-clipped=0.0 2023-10-03 13:58:47,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 13:58:48,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:58:49,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:58:50,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 13:58:52,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 13:58:52,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 13:58:56,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.75 vs. limit=22.5 2023-10-03 13:58:59,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:59:00,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:59:00,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:59:02,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 13:59:04,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:59:05,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 13:59:06,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 13:59:10,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 13:59:10,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 13:59:10,851 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.87 vs. limit=6.0 2023-10-03 13:59:12,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:59:12,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:59:17,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 13:59:17,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 13:59:18,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 13:59:18,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:59:18,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 13:59:18,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 13:59:20,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 13:59:24,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 13:59:27,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 13:59:27,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 13:59:29,947 INFO [train.py:1046] (3/4) Epoch 37, batch 2500, loss[loss=0.1477, simple_loss=0.2355, pruned_loss=0.02995, over 24471.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2369, pruned_loss=0.03922, over 4715417.55 frames. ], batch size: 66, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 13:59:30,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 13:59:31,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 13:59:32,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=1291580.0, ans=15.0 2023-10-03 13:59:36,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:59:44,043 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 13:59:45,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 13:59:45,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 13:59:46,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 13:59:46,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 13:59:53,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 13:59:53,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 13:59:55,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 13:59:56,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 13:59:56,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 13:59:58,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 13:59:59,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 13:59:59,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 14:00:00,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:00:01,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 14:00:01,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:00:03,473 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.14 vs. limit=15.0 2023-10-03 14:00:07,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:00:07,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:00:10,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:00:10,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1291713.3333333333, ans=0.125 2023-10-03 14:00:12,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 14:00:12,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:00:13,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:00:13,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1291780.0, ans=0.1 2023-10-03 14:00:17,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:00:19,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1291780.0, ans=0.0 2023-10-03 14:00:20,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:00:20,947 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.16 vs. limit=10.0 2023-10-03 14:00:24,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:00:29,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1291846.6666666667, ans=0.125 2023-10-03 14:00:30,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:00:33,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 14:00:33,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:00:33,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 14:00:33,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1291846.6666666667, ans=0.0 2023-10-03 14:00:35,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:00:35,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:00:35,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1291846.6666666667, ans=0.0 2023-10-03 14:00:37,344 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 14:00:37,344 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 14:00:37,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 14:00:40,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:00:43,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 14:00:43,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 14:00:44,587 INFO [train.py:1046] (3/4) Epoch 37, batch 2550, loss[loss=0.1518, simple_loss=0.2362, pruned_loss=0.03376, over 24623.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2376, pruned_loss=0.03921, over 4712828.71 frames. ], batch size: 65, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:00:44,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:00:44,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 14:00:47,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 14:00:49,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1291913.3333333333, ans=0.0 2023-10-03 14:00:50,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:00:52,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:00:52,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:00:54,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:00:56,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 14:00:56,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:01:00,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 14:01:00,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1291980.0, ans=0.1 2023-10-03 14:01:01,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:01:03,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:05,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:01:05,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 14:01:06,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:01:06,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:01:06,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:01:10,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:01:10,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 14:01:10,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 14:01:10,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:10,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 14:01:12,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1291980.0, ans=0.125 2023-10-03 14:01:16,195 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.844e+02 2.001e+02 2.201e+02 4.188e+02, threshold=4.002e+02, percent-clipped=1.0 2023-10-03 14:01:21,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:01:25,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:01:25,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:26,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:01:26,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:01:34,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:01:37,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:01:37,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:01:37,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:01:37,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 14:01:38,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:01:43,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:01:43,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:49,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:01:49,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 14:01:49,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:01:50,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:01:51,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 14:01:52,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:01:54,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:01:59,006 INFO [train.py:1046] (3/4) Epoch 37, batch 2600, loss[loss=0.1604, simple_loss=0.2436, pruned_loss=0.03855, over 23427.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03943, over 4708041.92 frames. ], batch size: 93, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:02:00,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:02:03,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:02:06,355 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 14:02:06,523 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 14:02:08,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:02:08,456 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 14:02:08,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 14:02:08,544 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 14:02:10,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1292246.6666666667, ans=0.0 2023-10-03 14:02:12,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:02:12,495 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 14:02:14,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 14:02:15,764 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 14:02:17,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:02:18,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 14:02:18,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 14:02:21,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 14:02:21,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 14:02:24,024 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 14:02:24,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 14:02:31,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:02:31,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:02:31,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:02:31,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 14:02:34,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:02:38,474 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 14:02:45,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:02:46,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:02:46,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 14:02:47,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:02:47,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:02:47,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 14:02:52,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:02:52,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:02:53,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:02:57,540 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 14:02:57,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:02:57,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:03:03,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:03:03,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:03:03,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 14:03:05,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:03:05,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1292513.3333333333, ans=0.125 2023-10-03 14:03:07,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:03:08,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:03:11,791 INFO [train.py:1046] (3/4) Epoch 37, batch 2650, loss[loss=0.1421, simple_loss=0.2211, pruned_loss=0.03157, over 19996.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.239, pruned_loss=0.03989, over 4700147.04 frames. ], batch size: 44, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:03:13,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 14:03:15,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:03:17,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:03:20,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 14:03:20,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:03:21,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:03:23,122 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 14:03:23,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:03:24,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1292580.0, ans=0.2 2023-10-03 14:03:25,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:03:27,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:03:27,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:03:30,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:03:30,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 14:03:30,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:03:32,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:03:34,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 14:03:36,160 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 14:03:38,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:03:40,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 14:03:40,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:03:42,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 14:03:43,648 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.832e+02 2.042e+02 2.355e+02 4.298e+02, threshold=4.084e+02, percent-clipped=1.0 2023-10-03 14:03:44,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1292713.3333333333, ans=0.2 2023-10-03 14:03:45,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:03:45,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:03:45,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:03:47,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:03:50,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 14:03:50,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 14:03:53,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:03:56,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 14:03:56,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:03:57,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:03:59,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:03:59,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:03:59,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:04:01,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:04:04,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:04:04,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:04:04,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:04:05,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:04:07,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:07,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:04:08,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:11,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:04:11,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:04:15,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:17,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:04:17,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:17,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 14:04:20,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:04:21,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:21,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:24,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:25,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:04:25,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:27,236 INFO [train.py:1046] (3/4) Epoch 37, batch 2700, loss[loss=0.1596, simple_loss=0.2452, pruned_loss=0.03698, over 23938.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2402, pruned_loss=0.04005, over 4708134.87 frames. ], batch size: 86, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:04:29,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:04:29,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 14:04:31,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:04:33,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 14:04:34,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:04:34,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:34,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:04:36,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:04:36,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:04:37,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:04:37,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 14:04:37,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 14:04:38,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:04:40,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:04:43,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:04:43,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:04:46,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:04:47,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 14:04:47,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:04:52,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:04:52,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:04:59,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:04:59,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:04:59,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:04:59,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:05:02,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:05:03,268 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=5.09 vs. limit=15.0 2023-10-03 14:05:04,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1293046.6666666667, ans=0.1 2023-10-03 14:05:05,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:05:05,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:05:05,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:05:07,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1293046.6666666667, ans=0.0 2023-10-03 14:05:09,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:05:09,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:05:16,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:05:16,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:05:20,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:05:20,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:05:21,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1293113.3333333333, ans=0.5 2023-10-03 14:05:23,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1293113.3333333333, ans=0.0 2023-10-03 14:05:24,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:05:26,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:05:27,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:05:27,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:29,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:05:29,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:05:31,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:05:31,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1293180.0, ans=0.1 2023-10-03 14:05:32,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:05:32,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:05:32,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1293180.0, ans=0.125 2023-10-03 14:05:35,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 14:05:36,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:05:40,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:05:40,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 14:05:41,441 INFO [train.py:1046] (3/4) Epoch 37, batch 2750, loss[loss=0.1556, simple_loss=0.2335, pruned_loss=0.03887, over 24613.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2399, pruned_loss=0.03988, over 4714103.54 frames. ], batch size: 60, lr: 2.75e-03, grad_scale: 8.0 2023-10-03 14:05:42,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 14:05:42,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:05:44,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:05:44,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:05:48,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:48,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:05:48,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:52,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:05:53,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 14:05:54,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:05:54,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:05:54,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 14:05:54,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:05:54,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:06:00,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 14:06:02,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:06:03,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:06:03,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:06:03,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:06:05,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:06:06,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:06:06,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:06:07,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:06:11,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:06:11,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:06:12,524 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.957e+02 2.205e+02 2.466e+02 3.615e+02, threshold=4.410e+02, percent-clipped=0.0 2023-10-03 14:06:12,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:06:14,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:06:15,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:06:21,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:06:23,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:06:23,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:06:27,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:06:27,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:06:27,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:06:33,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:06:35,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:06:35,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 14:06:38,607 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.97 vs. limit=15.0 2023-10-03 14:06:39,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:06:41,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 14:06:44,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 14:06:45,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:06:45,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 14:06:46,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:06:48,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:06:48,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 14:06:48,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:06:54,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 14:06:54,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:06:54,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:06:54,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 14:06:54,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:06:56,293 INFO [train.py:1046] (3/4) Epoch 37, batch 2800, loss[loss=0.1615, simple_loss=0.2415, pruned_loss=0.04072, over 23294.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2394, pruned_loss=0.03956, over 4720815.03 frames. ], batch size: 93, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:06:56,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:06:56,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1293580.0, ans=0.125 2023-10-03 14:06:57,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.63 vs. limit=10.0 2023-10-03 14:06:57,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:06:59,103 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 14:06:59,104 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 14:06:59,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1293580.0, ans=0.125 2023-10-03 14:07:01,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:07:02,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1293580.0, ans=0.2 2023-10-03 14:07:03,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:07:03,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:07:06,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:07:07,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1293580.0, ans=0.2 2023-10-03 14:07:09,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 14:07:10,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 14:07:12,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 14:07:12,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:07:13,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:07:13,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:07:17,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:07:18,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:07:18,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 14:07:19,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:07:27,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:07:28,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:07:30,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:07:30,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1293713.3333333333, ans=0.125 2023-10-03 14:07:31,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:07:31,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:07:37,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:07:37,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 14:07:39,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:07:40,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:07:40,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:07:43,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:07:43,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:07:44,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1293780.0, ans=0.125 2023-10-03 14:07:46,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:07:49,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:07:49,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:07:49,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:07:51,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:07:52,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:07:53,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:07:53,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 14:07:53,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:07:53,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:07:55,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:07:56,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 14:07:58,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:07:58,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:07:58,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:07:58,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1293846.6666666667, ans=0.125 2023-10-03 14:07:59,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 14:08:06,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:08:06,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:08:06,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:08:09,385 INFO [train.py:1046] (3/4) Epoch 37, batch 2850, loss[loss=0.1526, simple_loss=0.2219, pruned_loss=0.04167, over 23396.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2379, pruned_loss=0.03942, over 4715366.69 frames. ], batch size: 285, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:08:10,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:08:11,296 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.84 vs. limit=22.5 2023-10-03 14:08:12,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:08:13,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1293913.3333333333, ans=0.0 2023-10-03 14:08:14,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:08:14,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:08:17,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:08:17,631 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.37 vs. limit=6.0 2023-10-03 14:08:18,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:08:19,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:08:21,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 14:08:26,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 14:08:26,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:08:29,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 14:08:29,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:08:32,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 14:08:32,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 14:08:34,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:08:41,022 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.871e+02 2.042e+02 2.334e+02 3.256e+02, threshold=4.084e+02, percent-clipped=0.0 2023-10-03 14:08:44,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1294046.6666666667, ans=0.2 2023-10-03 14:08:45,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:08:47,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:08:47,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:08:48,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:08:48,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:08:48,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:08:50,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:08:50,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 14:08:51,055 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.20 vs. limit=12.0 2023-10-03 14:08:53,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:08:53,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:08:53,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:08:55,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:08:56,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.09 vs. limit=15.0 2023-10-03 14:08:56,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:08:57,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:08:58,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:00,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:09:02,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:09:02,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:09:04,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:07,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:09:12,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:09:13,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 14:09:13,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 14:09:16,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:09:16,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:09:16,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 14:09:16,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:09:17,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:09:17,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:09:19,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:09:19,128 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 14:09:19,173 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 14:09:19,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:09:19,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:23,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 14:09:23,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:09:25,121 INFO [train.py:1046] (3/4) Epoch 37, batch 2900, loss[loss=0.1605, simple_loss=0.2488, pruned_loss=0.03606, over 24632.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2378, pruned_loss=0.03949, over 4703621.03 frames. ], batch size: 68, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:09:25,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:09:26,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 14:09:31,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:09:31,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 14:09:32,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 14:09:33,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:09:33,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:09:35,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:09:35,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:09:37,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1294246.6666666667, ans=0.015 2023-10-03 14:09:38,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:09:40,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:09:41,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:09:41,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 14:09:41,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:09:44,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:45,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1294313.3333333333, ans=0.0 2023-10-03 14:09:46,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 14:09:47,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 14:09:48,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:09:48,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 14:09:48,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:09:52,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:09:52,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 14:09:55,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:09:56,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:09:56,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1294380.0, ans=0.1 2023-10-03 14:09:59,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:10:00,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:10:04,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 14:10:04,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 14:10:04,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:10:06,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:10:08,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 14:10:09,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:10:14,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:10:21,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:10:21,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:10:24,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 14:10:26,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.37 vs. limit=15.0 2023-10-03 14:10:27,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:10:27,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 14:10:27,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:10:27,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:10:36,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:10:38,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 14:10:39,304 INFO [train.py:1046] (3/4) Epoch 37, batch 2950, loss[loss=0.1513, simple_loss=0.2384, pruned_loss=0.0321, over 24547.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2389, pruned_loss=0.03966, over 4707400.34 frames. ], batch size: 71, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:10:39,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:10:39,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:10:40,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:10:42,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:10:42,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.56 vs. limit=15.0 2023-10-03 14:10:43,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 14:10:45,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 14:10:45,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:10:45,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:10:49,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:10:51,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:10:54,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:10:54,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:10:55,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1294646.6666666667, ans=0.09899494936611666 2023-10-03 14:10:58,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:11:00,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:11:01,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:11:02,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:11:02,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:11:04,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 14:11:09,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 14:11:09,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1294713.3333333333, ans=0.125 2023-10-03 14:11:10,188 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.983e+02 2.182e+02 2.466e+02 3.781e+02, threshold=4.363e+02, percent-clipped=0.0 2023-10-03 14:11:10,271 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 14:11:11,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:11:12,960 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 14:11:14,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 14:11:16,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:11:17,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:11:17,570 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 14:11:17,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:11:19,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 14:11:20,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:11:20,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:11:20,966 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.44 vs. limit=22.5 2023-10-03 14:11:23,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:11:23,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:11:23,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:11:25,282 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 14:11:25,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:11:26,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 14:11:32,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:11:34,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:11:34,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 14:11:35,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:11:36,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 14:11:38,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:11:40,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:11:40,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:11:41,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1294846.6666666667, ans=0.125 2023-10-03 14:11:42,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:11:42,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 14:11:43,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:11:45,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:11:45,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:11:45,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:11:47,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:11:47,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:11:49,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:11:49,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 14:11:50,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:11:52,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:11:53,422 INFO [train.py:1046] (3/4) Epoch 37, batch 3000, loss[loss=0.1583, simple_loss=0.2473, pruned_loss=0.0347, over 24332.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2395, pruned_loss=0.03966, over 4712559.15 frames. ], batch size: 74, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:11:53,423 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 14:12:05,421 INFO [train.py:1078] (3/4) Epoch 37, validation: loss=0.3637, simple_loss=0.2861, pruned_loss=0.2207, over 1125622.00 frames. 2023-10-03 14:12:05,422 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 14:12:05,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:12:07,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1294913.3333333333, ans=0.125 2023-10-03 14:12:08,648 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 14:12:09,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1294913.3333333333, ans=6.0 2023-10-03 14:12:09,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 14:12:11,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:12:11,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:12:12,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 14:12:12,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:12:19,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:12:30,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:12:33,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1295046.6666666667, ans=0.2 2023-10-03 14:12:34,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1295046.6666666667, ans=0.125 2023-10-03 14:12:36,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 14:12:38,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:12:39,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:12:40,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:12:40,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:12:42,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:12:42,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 14:12:43,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1295046.6666666667, ans=0.0 2023-10-03 14:12:45,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 14:12:47,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:12:47,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:12:50,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:12:50,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:12:51,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:12:51,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:12:54,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1295113.3333333333, ans=0.125 2023-10-03 14:12:56,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:12:57,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:12:57,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:12:58,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:13:03,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 14:13:03,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:13:04,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:04,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:13:09,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:13:10,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:13:11,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 14:13:11,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 14:13:11,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:13:11,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 14:13:11,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:13:13,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 14:13:16,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1295180.0, ans=0.2 2023-10-03 14:13:17,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:13:19,163 INFO [train.py:1046] (3/4) Epoch 37, batch 3050, loss[loss=0.165, simple_loss=0.2517, pruned_loss=0.03918, over 24291.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2398, pruned_loss=0.03942, over 4728727.29 frames. ], batch size: 74, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:13:19,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:13:19,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 14:13:20,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 14:13:20,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 14:13:22,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:13:23,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:13:23,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:13:23,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:23,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:13:26,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 14:13:27,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1295246.6666666667, ans=0.1 2023-10-03 14:13:28,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:13:30,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:13:30,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:13:33,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:36,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 14:13:37,651 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.04 vs. limit=15.0 2023-10-03 14:13:38,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1295313.3333333333, ans=0.125 2023-10-03 14:13:40,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1295313.3333333333, ans=0.1 2023-10-03 14:13:43,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 14:13:43,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 14:13:44,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:13:47,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:13:49,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:49,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:13:50,506 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.847e+02 2.000e+02 2.223e+02 2.874e+02, threshold=4.000e+02, percent-clipped=0.0 2023-10-03 14:13:50,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:13:53,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:13:53,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:13:55,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:13:55,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:13:55,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:13:56,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:13:59,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:00,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:14:02,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 14:14:02,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:14:02,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:14:02,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1295446.6666666667, ans=0.2 2023-10-03 14:14:05,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:14:06,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:14:06,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:14:08,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:12,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:14:14,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:14,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1295446.6666666667, ans=0.1 2023-10-03 14:14:17,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:18,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:14:18,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:14:20,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:14:20,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:14:22,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:14:22,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 14:14:24,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:14:24,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:25,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 14:14:27,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:33,154 INFO [train.py:1046] (3/4) Epoch 37, batch 3100, loss[loss=0.1533, simple_loss=0.2404, pruned_loss=0.03312, over 24629.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2386, pruned_loss=0.03927, over 4728503.88 frames. ], batch size: 65, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:14:33,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:14:35,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:14:35,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff2.min_abs, batch_count=1295580.0, ans=0.1 2023-10-03 14:14:36,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:14:37,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 14:14:41,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 14:14:41,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 14:14:42,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:14:45,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1295580.0, ans=0.0 2023-10-03 14:14:47,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:14:47,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:48,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 14:14:53,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:14:56,504 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.24 vs. limit=22.5 2023-10-03 14:14:57,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 14:14:58,080 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.61 vs. limit=22.5 2023-10-03 14:15:01,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:15:02,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:02,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:15:02,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:15:04,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 14:15:07,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:15:07,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 14:15:07,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:15:07,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:15:09,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 14:15:10,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:15:13,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:15:13,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 14:15:14,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 14:15:16,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:18,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:15:19,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:15:19,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:19,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:15:21,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:15:21,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:15:22,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:15:24,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:15:24,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:24,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:15:29,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:15:29,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 14:15:31,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1295846.6666666667, ans=0.125 2023-10-03 14:15:32,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:15:33,258 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.46 vs. limit=15.0 2023-10-03 14:15:33,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 14:15:33,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:15:33,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:35,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 14:15:47,056 INFO [train.py:1046] (3/4) Epoch 37, batch 3150, loss[loss=0.1602, simple_loss=0.2235, pruned_loss=0.04844, over 23611.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.238, pruned_loss=0.03892, over 4730767.10 frames. ], batch size: 232, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:15:47,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 14:15:48,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:15:49,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:15:51,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:15:51,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:15:53,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 14:15:55,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:15:55,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 14:15:55,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 14:15:57,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:15:59,374 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 14:16:00,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 14:16:02,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:16:03,397 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 14:16:03,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 14:16:06,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 14:16:06,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 14:16:08,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 14:16:08,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:16:08,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:16:08,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:16:10,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 14:16:12,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:16:12,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:16:14,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:16:15,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 14:16:18,521 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.944e+02 2.154e+02 2.658e+02 3.587e+02, threshold=4.309e+02, percent-clipped=0.0 2023-10-03 14:16:19,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 14:16:19,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:16:23,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:16:23,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:16:25,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 14:16:26,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 14:16:27,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:16:27,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:16:28,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:16:29,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:16:29,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:16:30,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:16:30,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:16:31,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 14:16:31,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:16:33,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:34,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:16:34,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:16:36,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 14:16:36,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:16:37,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 14:16:37,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:37,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 14:16:37,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1296113.3333333333, ans=0.07 2023-10-03 14:16:39,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 14:16:40,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:16:41,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:16:42,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 14:16:43,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 14:16:43,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:16:46,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:16:48,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:48,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:16:54,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:16:55,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:16:58,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 14:17:00,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1296246.6666666667, ans=0.0 2023-10-03 14:17:01,202 INFO [train.py:1046] (3/4) Epoch 37, batch 3200, loss[loss=0.1657, simple_loss=0.2456, pruned_loss=0.04294, over 23293.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2375, pruned_loss=0.03866, over 4721842.83 frames. ], batch size: 93, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:17:02,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:17:02,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 14:17:04,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1296246.6666666667, ans=0.95 2023-10-03 14:17:06,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:17:07,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:17:07,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 14:17:10,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:17:17,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:17:18,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:17:22,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1296313.3333333333, ans=0.2 2023-10-03 14:17:28,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:17:35,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 14:17:36,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:17:38,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 14:17:39,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:17:41,964 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.22 vs. limit=15.0 2023-10-03 14:17:43,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.78 vs. limit=22.5 2023-10-03 14:17:44,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:17:44,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:17:44,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:17:46,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1296446.6666666667, ans=0.125 2023-10-03 14:17:47,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 14:17:50,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 14:17:52,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 14:17:54,448 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.62 vs. limit=12.0 2023-10-03 14:17:55,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 14:17:56,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:17:56,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1296446.6666666667, ans=0.0 2023-10-03 14:18:02,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:18:02,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:18:02,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:18:03,469 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 14:18:03,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:18:07,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:18:08,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 14:18:09,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 14:18:10,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 14:18:12,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 14:18:13,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:18:14,913 INFO [train.py:1046] (3/4) Epoch 37, batch 3250, loss[loss=0.1668, simple_loss=0.2372, pruned_loss=0.04818, over 23740.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2375, pruned_loss=0.03886, over 4726889.01 frames. ], batch size: 232, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:18:16,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:18:16,935 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 14:18:16,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:18:16,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:18,402 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 14:18:18,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.30 vs. limit=15.0 2023-10-03 14:18:22,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:18:24,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:18:34,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:18:34,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 14:18:34,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:18:35,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:18:35,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:18:37,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:18:37,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:18:39,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:39,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:18:39,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:18:41,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:41,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:41,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:18:43,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:18:44,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:18:45,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:18:45,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:18:47,720 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.697e+02 1.879e+02 2.099e+02 2.263e+02 3.172e+02, threshold=4.197e+02, percent-clipped=0.0 2023-10-03 14:18:49,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:18:49,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:18:49,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:18:55,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 14:18:56,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:18:56,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:18:58,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:18:59,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:19:01,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1296780.0, ans=0.07 2023-10-03 14:19:03,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:19:11,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:19:12,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:12,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 14:19:12,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:19:12,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 14:19:13,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:15,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 14:19:16,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 14:19:16,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:19:18,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:19:19,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:19:19,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 14:19:21,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:19:23,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:19:24,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:19:25,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 14:19:25,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:19:26,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:19:26,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 14:19:29,185 INFO [train.py:1046] (3/4) Epoch 37, batch 3300, loss[loss=0.1968, simple_loss=0.2672, pruned_loss=0.0632, over 19526.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2383, pruned_loss=0.03921, over 4707030.49 frames. ], batch size: 388, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:19:29,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:19:29,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 14:19:32,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 14:19:32,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1296913.3333333333, ans=0.125 2023-10-03 14:19:33,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 14:19:33,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:19:36,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:19:37,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:19:37,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:39,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 14:19:40,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:19:42,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1296980.0, ans=0.0 2023-10-03 14:19:43,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:19:44,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:19:47,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 14:19:48,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:19:48,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:19:50,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:19:52,663 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 14:19:52,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:19:52,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:19:53,330 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.80 vs. limit=12.0 2023-10-03 14:19:54,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:19:54,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:19:54,723 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 14:19:55,775 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.73 vs. limit=6.0 2023-10-03 14:19:58,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:20:00,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:20:01,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:01,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 14:20:03,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 14:20:03,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:05,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:20:07,110 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 14:20:08,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 14:20:08,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:20:12,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 14:20:14,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:20:17,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 14:20:17,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:20:21,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:20:21,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:20:21,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:20:21,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:20:23,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:20:23,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:24,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:20:26,837 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 14:20:26,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 14:20:28,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:20:30,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:20:30,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:20:31,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:20:31,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:20:31,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:20:33,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:20:34,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:20:35,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:20:35,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:20:39,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 14:20:40,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:20:41,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:20:41,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:20:42,812 INFO [train.py:1046] (3/4) Epoch 37, batch 3350, loss[loss=0.1674, simple_loss=0.2477, pruned_loss=0.04358, over 23747.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2392, pruned_loss=0.03971, over 4720316.78 frames. ], batch size: 85, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:20:42,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:20:42,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:20:46,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:20:46,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:20:50,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:20:51,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:20:52,565 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.68 vs. limit=15.0 2023-10-03 14:20:53,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:20:56,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:20:58,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:21:00,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:21:01,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:21:03,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 14:21:03,179 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 14:21:04,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:21:07,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 14:21:07,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 14:21:08,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:21:08,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:21:08,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:08,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 14:21:10,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:10,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:21:11,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:14,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:14,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:14,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:21:15,430 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.884e+02 2.039e+02 2.303e+02 3.240e+02, threshold=4.079e+02, percent-clipped=0.0 2023-10-03 14:21:18,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:21,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:21,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:26,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:21:26,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:21:28,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:28,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:31,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:35,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 14:21:35,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:21:35,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 14:21:37,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:21:37,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1297446.6666666667, ans=0.2 2023-10-03 14:21:38,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 14:21:38,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:39,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:21:44,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:21:44,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 14:21:45,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:21:45,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:21:45,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1297513.3333333333, ans=0.125 2023-10-03 14:21:47,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:21:52,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:21:55,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 14:21:55,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:21:55,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:21:57,598 INFO [train.py:1046] (3/4) Epoch 37, batch 3400, loss[loss=0.1452, simple_loss=0.2207, pruned_loss=0.03485, over 24417.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2393, pruned_loss=0.03977, over 4725313.83 frames. ], batch size: 58, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:21:57,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:21:59,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 14:22:01,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:22:01,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 14:22:02,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:22:02,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:22:04,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:22:05,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:22:05,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 14:22:08,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1297580.0, ans=0.2 2023-10-03 14:22:09,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 14:22:09,743 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 14:22:09,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:22:09,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1297580.0, ans=0.05 2023-10-03 14:22:11,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1297646.6666666667, ans=0.125 2023-10-03 14:22:14,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:22:15,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:22:15,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:22:17,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:22:20,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:22:22,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1297646.6666666667, ans=0.125 2023-10-03 14:22:24,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 14:22:28,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:22:30,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:22:30,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:22:32,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 14:22:38,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:22:41,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 14:22:42,020 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=12.0 2023-10-03 14:22:46,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:22:48,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:22:48,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 14:22:48,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:22:48,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:22:49,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:22:50,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:22:53,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:22:55,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:22:55,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:23:02,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:23:03,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 14:23:09,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:23:12,539 INFO [train.py:1046] (3/4) Epoch 37, batch 3450, loss[loss=0.1655, simple_loss=0.2498, pruned_loss=0.0406, over 24016.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2396, pruned_loss=0.03977, over 4728133.90 frames. ], batch size: 80, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:23:14,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 14:23:15,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 14:23:16,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:23:18,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:23:18,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 14:23:19,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:23:22,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:23:24,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1297913.3333333333, ans=0.0 2023-10-03 14:23:27,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:23:29,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:23:30,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:23:31,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:23:32,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:23:38,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 14:23:45,258 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.909e+02 2.144e+02 2.320e+02 3.378e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 14:23:45,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 14:23:45,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:23:45,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:23:45,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:23:52,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 14:23:52,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:23:54,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1298046.6666666667, ans=0.1 2023-10-03 14:23:57,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:23:57,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:23:58,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:23:58,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:24:00,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 14:24:00,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:24:01,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:24:03,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:24:07,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 14:24:09,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:24:13,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:24:14,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:16,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:24:21,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:21,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:24:21,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:24:22,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1298180.0, ans=0.1 2023-10-03 14:24:23,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:24:26,301 INFO [train.py:1046] (3/4) Epoch 37, batch 3500, loss[loss=0.1667, simple_loss=0.2464, pruned_loss=0.0435, over 23435.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2387, pruned_loss=0.03926, over 4732109.63 frames. ], batch size: 106, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:24:27,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:24:31,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:24:31,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 14:24:33,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:24:35,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 14:24:38,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:24:38,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 14:24:41,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1298313.3333333333, ans=0.125 2023-10-03 14:24:43,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:24:44,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:24:44,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:24:44,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:24:45,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:24:47,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:47,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:24:47,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 14:24:51,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:51,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 14:24:52,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:24:56,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:24:58,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 14:24:59,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:25:01,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:25:02,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:25:03,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:04,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1298380.0, ans=0.125 2023-10-03 14:25:06,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:25:06,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:25:06,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1298380.0, ans=0.0 2023-10-03 14:25:08,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 14:25:10,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 14:25:10,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 14:25:10,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:25:11,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:11,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:25:11,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:25:14,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 14:25:14,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:25:18,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:25:20,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 14:25:20,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 14:25:20,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:25:24,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:25:24,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:25:26,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:28,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 14:25:30,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:25:31,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:25:31,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 14:25:34,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 14:25:37,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:25:37,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:25:37,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:25:37,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:25:40,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1298580.0, ans=0.0 2023-10-03 14:25:41,962 INFO [train.py:1046] (3/4) Epoch 37, batch 3550, loss[loss=0.1499, simple_loss=0.2212, pruned_loss=0.0393, over 23630.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2363, pruned_loss=0.03893, over 4707875.98 frames. ], batch size: 256, lr: 2.75e-03, grad_scale: 16.0 2023-10-03 14:25:42,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:25:47,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:25:49,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 14:25:53,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:25:54,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:25:56,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:25:58,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:25:58,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:26:01,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:26:01,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:26:02,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:26:02,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 14:26:04,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:26:09,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:26:09,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:26:11,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:26:11,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:26:11,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:26:11,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 14:26:13,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:26:13,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1298713.3333333333, ans=0.125 2023-10-03 14:26:14,585 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.883e+02 2.070e+02 2.275e+02 3.078e+02, threshold=4.140e+02, percent-clipped=0.0 2023-10-03 14:26:14,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:26:14,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 14:26:20,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:26:20,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:26:21,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:26:23,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 14:26:25,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:26:26,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 14:26:26,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:26:28,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:26:29,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:26:31,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 14:26:33,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:26:39,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:26:40,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 14:26:40,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:26:44,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:26:44,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 14:26:52,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 14:26:52,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:26:52,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:26:54,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:26:56,230 INFO [train.py:1046] (3/4) Epoch 37, batch 3600, loss[loss=0.14, simple_loss=0.2191, pruned_loss=0.03044, over 24317.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.236, pruned_loss=0.03887, over 4719569.19 frames. ], batch size: 56, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:26:56,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:26:56,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:26:56,842 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.85 vs. limit=15.0 2023-10-03 14:27:01,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:27:02,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=1298913.3333333333, ans=10.0 2023-10-03 14:27:04,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:05,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:27:07,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:27:07,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:07,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 14:27:13,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:27:14,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:17,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:27:18,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:27:20,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:27:21,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:27:21,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 14:27:22,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:27:24,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:27:24,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:27:25,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:27:27,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:27:27,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:27:29,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 14:27:33,295 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.17 vs. limit=15.0 2023-10-03 14:27:37,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:27:37,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:27:39,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 14:27:39,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1299046.6666666667, ans=0.0 2023-10-03 14:27:43,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:27:44,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1299113.3333333333, ans=0.125 2023-10-03 14:27:46,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:27:48,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:27:54,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:27:54,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:27:54,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 14:27:55,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 14:27:57,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 14:27:59,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:28:01,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:28:01,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 14:28:01,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:28:02,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:28:02,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:28:02,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 14:28:04,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 14:28:05,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:28:05,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 14:28:06,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1299180.0, ans=0.125 2023-10-03 14:28:10,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 14:28:11,746 INFO [train.py:1046] (3/4) Epoch 37, batch 3650, loss[loss=0.1659, simple_loss=0.2421, pruned_loss=0.04478, over 23839.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2365, pruned_loss=0.03914, over 4715823.66 frames. ], batch size: 164, lr: 2.75e-03, grad_scale: 32.0 2023-10-03 14:28:11,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:28:14,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 14:28:14,920 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:28:15,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 14:28:18,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:28:18,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:28:18,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:28:21,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 14:28:22,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:28:22,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 14:28:24,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:28:25,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1299313.3333333333, ans=0.125 2023-10-03 14:28:26,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:28:26,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 14:28:27,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:28:29,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:28:29,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:28:30,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:28:33,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 14:28:35,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 14:28:37,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:28:40,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 14:28:40,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:28:40,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:28:44,298 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.976e+02 2.171e+02 2.481e+02 4.276e+02, threshold=4.341e+02, percent-clipped=1.0 2023-10-03 14:28:47,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:28:48,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:28:48,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:28:50,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:28:51,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:28:54,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:28:55,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:28:56,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:28:56,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:28:58,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:29:00,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:29:01,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:29:03,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1299446.6666666667, ans=0.125 2023-10-03 14:29:10,349 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 14:29:15,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:29:15,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:29:16,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:29:16,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:29:18,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:29:19,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:29:20,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 14:29:20,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:29:23,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:29:24,817 INFO [train.py:1046] (3/4) Epoch 37, batch 3700, loss[loss=0.1534, simple_loss=0.2309, pruned_loss=0.03791, over 23694.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2376, pruned_loss=0.03909, over 4717577.72 frames. ], batch size: 149, lr: 2.74e-03, grad_scale: 32.0 2023-10-03 14:29:26,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:29:27,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:29:30,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:29:30,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 14:29:30,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:29:31,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:29:33,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:29:34,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:29:39,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:29:40,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:29:40,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:29:42,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:29:42,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:29:45,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:29:47,333 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 14:29:53,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:29:53,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1299713.3333333333, ans=0.125 2023-10-03 14:29:54,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:29:55,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:29:55,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 14:29:55,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:29:58,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:29:59,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 14:29:59,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:30:01,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:30:02,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:30:04,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:30:06,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 14:30:10,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:30:10,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 14:30:12,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:30:12,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 14:30:18,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:30:18,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:30:19,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:30:21,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 14:30:22,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:30:22,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:30:23,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:30:23,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:30:26,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:30:28,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 14:30:29,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 14:30:30,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:30:30,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:30:32,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:30:33,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:30:35,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:30:36,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:30:37,881 INFO [train.py:1046] (3/4) Epoch 37, batch 3750, loss[loss=0.1682, simple_loss=0.2594, pruned_loss=0.03848, over 24379.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2382, pruned_loss=0.03936, over 4721804.69 frames. ], batch size: 77, lr: 2.74e-03, grad_scale: 32.0 2023-10-03 14:30:37,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:30:40,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 14:30:41,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 14:30:44,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 14:30:44,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 14:30:46,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:30:46,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:30:47,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:30:49,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:30:52,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:30:56,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:30:57,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:31:00,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:31:02,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:31:03,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 14:31:03,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:31:04,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:31:04,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:31:07,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 14:31:08,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1300046.6666666667, ans=0.125 2023-10-03 14:31:10,854 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.862e+02 2.050e+02 2.343e+02 3.351e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-03 14:31:11,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 14:31:12,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:31:12,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:31:14,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:31:20,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:31:20,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 14:31:25,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 14:31:27,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1300113.3333333333, ans=0.125 2023-10-03 14:31:28,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:31:31,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:31:31,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:31:35,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:31:38,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 14:31:40,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:31:43,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:31:43,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1300180.0, ans=0.125 2023-10-03 14:31:44,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:31:47,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:31:51,824 INFO [train.py:1046] (3/4) Epoch 37, batch 3800, loss[loss=0.1568, simple_loss=0.2266, pruned_loss=0.04347, over 24330.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2378, pruned_loss=0.03941, over 4720115.67 frames. ], batch size: 56, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:31:54,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:31:57,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1300246.6666666667, ans=0.1 2023-10-03 14:31:58,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:00,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 14:32:00,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 14:32:02,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.23 vs. limit=22.5 2023-10-03 14:32:02,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:32:04,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:32:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 14:32:07,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 14:32:07,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:08,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:32:11,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.09 vs. limit=15.0 2023-10-03 14:32:12,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:32:12,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:32:12,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:32:12,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1300313.3333333333, ans=0.125 2023-10-03 14:32:12,933 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.81 vs. limit=15.0 2023-10-03 14:32:13,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 14:32:16,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 14:32:17,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:32:20,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:32:21,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1300380.0, ans=0.125 2023-10-03 14:32:22,583 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:32:23,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:32:23,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:32:25,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:32:25,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:32:26,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:26,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1300380.0, ans=0.125 2023-10-03 14:32:27,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:32:29,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1300380.0, ans=0.125 2023-10-03 14:32:32,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1300380.0, ans=0.0 2023-10-03 14:32:33,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:32:33,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 14:32:34,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:32:34,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1300446.6666666667, ans=0.2 2023-10-03 14:32:35,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1300446.6666666667, ans=0.125 2023-10-03 14:32:39,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:32:40,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1300446.6666666667, ans=0.2 2023-10-03 14:32:44,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:32:47,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 14:32:48,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 14:32:48,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:32:50,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:32:50,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:53,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 14:32:55,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 14:32:55,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 14:32:55,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:32:57,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:33:03,882 INFO [train.py:1046] (3/4) Epoch 37, batch 3850, loss[loss=0.1619, simple_loss=0.2441, pruned_loss=0.03991, over 23493.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2369, pruned_loss=0.03913, over 4706041.05 frames. ], batch size: 93, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:33:03,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:33:04,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:33:08,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:33:10,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 14:33:10,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:33:11,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:33:16,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:33:16,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:33:16,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1300580.0, ans=0.1 2023-10-03 14:33:19,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 14:33:19,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 14:33:23,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:26,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:33:29,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:33:29,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:33:32,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:32,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:33:33,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:33:33,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:33:33,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:33:35,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:33:35,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1300713.3333333333, ans=0.0 2023-10-03 14:33:36,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:36,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:33:38,010 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.860e+02 2.056e+02 2.272e+02 4.240e+02, threshold=4.112e+02, percent-clipped=1.0 2023-10-03 14:33:38,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 14:33:38,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 14:33:39,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:33:39,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:41,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:42,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:42,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 14:33:44,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 14:33:46,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:47,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 14:33:49,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 14:33:53,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:53,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:33:57,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1300780.0, ans=0.1 2023-10-03 14:33:58,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:33:58,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 14:34:01,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 14:34:02,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:02,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:06,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:34:06,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:34:06,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1300846.6666666667, ans=0.125 2023-10-03 14:34:08,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:08,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:09,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:34:09,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 14:34:09,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:34:12,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 14:34:12,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:12,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:15,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:34:15,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:15,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1300846.6666666667, ans=0.0 2023-10-03 14:34:16,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:34:18,210 INFO [train.py:1046] (3/4) Epoch 37, batch 3900, loss[loss=0.1516, simple_loss=0.2323, pruned_loss=0.0354, over 23411.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2354, pruned_loss=0.03914, over 4690341.34 frames. ], batch size: 119, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:34:18,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:34:18,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:34:19,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:34:19,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 14:34:20,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:23,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:34:26,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:34:26,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:34:27,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:34:30,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:34:30,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:31,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:34:33,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 14:34:33,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:34:34,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 14:34:34,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:34:36,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 14:34:39,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 14:34:43,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:34:43,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:34:43,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:34:43,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:34:50,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:34:51,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:34:53,050 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:34:54,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:34:54,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1301046.6666666667, ans=0.2 2023-10-03 14:34:55,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:34:55,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:35:02,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:35:03,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:35:09,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 14:35:10,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:35:15,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1301180.0, ans=0.0 2023-10-03 14:35:21,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:35:21,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1301180.0, ans=0.5 2023-10-03 14:35:23,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:35:24,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 14:35:24,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 14:35:26,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:35:26,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 14:35:27,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:35:27,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1301180.0, ans=0.0 2023-10-03 14:35:27,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1301180.0, ans=0.125 2023-10-03 14:35:28,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 14:35:30,269 INFO [train.py:1046] (3/4) Epoch 37, batch 3950, loss[loss=0.157, simple_loss=0.235, pruned_loss=0.03947, over 23718.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2358, pruned_loss=0.03899, over 4706027.50 frames. ], batch size: 135, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:35:34,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1301246.6666666667, ans=0.125 2023-10-03 14:35:35,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:35:37,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 14:35:37,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:35:40,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:35:40,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1301246.6666666667, ans=0.0 2023-10-03 14:35:41,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:35:45,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1301313.3333333333, ans=0.1 2023-10-03 14:35:46,881 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 14:35:48,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:35:48,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 14:35:48,286 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 14:35:50,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:35:52,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:35:52,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:35:52,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:35:54,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 14:35:55,244 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.00 vs. limit=15.0 2023-10-03 14:35:57,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:35:59,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:35:59,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:35:59,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:36:00,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:36:05,919 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.893e+02 2.072e+02 2.243e+02 3.144e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-03 14:36:11,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:36:11,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:36:18,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 14:36:22,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1301446.6666666667, ans=0.125 2023-10-03 14:36:24,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 14:36:24,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 14:36:24,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:36:25,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:36:28,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1301513.3333333333, ans=0.1 2023-10-03 14:36:34,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:36:34,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:36:34,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:36:35,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:36:35,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 14:36:39,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:36:39,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:36:43,899 INFO [train.py:1046] (3/4) Epoch 37, batch 4000, loss[loss=0.1546, simple_loss=0.2272, pruned_loss=0.04103, over 23695.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2363, pruned_loss=0.03865, over 4724036.53 frames. ], batch size: 212, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:36:44,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 14:36:53,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:36:55,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1301580.0, ans=0.2 2023-10-03 14:36:59,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:37:03,489 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.16 vs. limit=15.0 2023-10-03 14:37:03,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:03,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:37:05,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:37:05,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 14:37:06,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:37:08,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 14:37:08,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:37:08,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 14:37:09,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:10,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:37:12,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:37:12,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:37:12,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:37:12,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 14:37:13,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:37:15,048 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 14:37:16,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:37:18,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:37:22,565 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 14:37:22,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:37:22,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:37:28,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 14:37:28,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:37:32,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:37:33,456 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 14:37:34,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:37:34,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 14:37:34,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:37:34,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:37:36,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:37:37,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:37:37,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:37:37,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:37:39,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 14:37:40,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:37:40,495 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 14:37:46,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:37:49,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 14:37:50,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:37:52,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:52,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:37:52,830 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.28 vs. limit=15.0 2023-10-03 14:37:53,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:37:58,133 INFO [train.py:1046] (3/4) Epoch 37, batch 4050, loss[loss=0.1569, simple_loss=0.2533, pruned_loss=0.0303, over 24298.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2377, pruned_loss=0.03935, over 4722388.56 frames. ], batch size: 74, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:37:58,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:37:59,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 14:38:01,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 14:38:02,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:38:02,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:38:05,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:38:05,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:38:06,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:38:09,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:38:12,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:38:12,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 14:38:15,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:38:15,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:38:19,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:38:21,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:38:24,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 14:38:27,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 14:38:27,289 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 14:38:27,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1302046.6666666667, ans=0.0 2023-10-03 14:38:30,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:38:33,274 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.902e+02 2.050e+02 2.347e+02 3.332e+02, threshold=4.101e+02, percent-clipped=0.0 2023-10-03 14:38:36,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 14:38:37,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:38:39,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1302046.6666666667, ans=0.0 2023-10-03 14:38:40,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:38:43,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:38:43,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:38:43,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:38:47,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:38:49,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 14:38:49,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:38:51,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:38:54,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 14:38:58,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:39:04,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 14:39:06,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:39:06,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:39:07,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 14:39:07,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 14:39:07,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:39:11,530 INFO [train.py:1046] (3/4) Epoch 37, batch 4100, loss[loss=0.1721, simple_loss=0.2543, pruned_loss=0.04496, over 24412.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2381, pruned_loss=0.0393, over 4720655.00 frames. ], batch size: 77, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:39:11,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:39:11,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:11,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:39:17,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1302246.6666666667, ans=0.1 2023-10-03 14:39:18,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 14:39:21,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 14:39:24,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 14:39:24,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 14:39:24,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:39:24,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:24,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1302246.6666666667, ans=0.0 2023-10-03 14:39:25,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:25,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:39:27,437 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 14:39:30,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:39:30,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:39:30,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:39:31,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:39:37,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:39:38,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:39:38,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:39:38,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 14:39:38,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:38,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:39:38,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:39:40,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:39:40,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 14:39:43,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:39:44,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 14:39:45,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:39:48,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:39:48,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 14:39:49,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:39:49,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:39:50,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:39:51,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 14:39:53,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:39:53,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:39:57,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 14:39:57,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:39:58,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:40:01,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:40:05,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:40:07,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:40:07,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:40:15,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:40:16,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:40:20,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:40:24,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:40:25,667 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.93 vs. limit=15.0 2023-10-03 14:40:26,401 INFO [train.py:1046] (3/4) Epoch 37, batch 4150, loss[loss=0.1542, simple_loss=0.2268, pruned_loss=0.04082, over 24460.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2382, pruned_loss=0.03999, over 4707369.68 frames. ], batch size: 58, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:40:29,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:40:30,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:40:32,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:40:32,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:40:34,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 14:40:34,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:40:34,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 14:40:36,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 14:40:36,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 14:40:38,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:40:41,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1302646.6666666667, ans=0.1 2023-10-03 14:40:42,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:40:42,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:40:42,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1302646.6666666667, ans=0.125 2023-10-03 14:40:45,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:40:46,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:40:46,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:40:49,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 14:40:49,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:40:52,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 14:40:54,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:40:54,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1302713.3333333333, ans=0.0 2023-10-03 14:40:59,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:40:59,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 14:41:02,042 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.895e+02 2.138e+02 2.418e+02 3.497e+02, threshold=4.277e+02, percent-clipped=0.0 2023-10-03 14:41:02,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 14:41:02,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:41:03,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 14:41:03,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:41:03,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:41:05,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:06,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:41:10,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 14:41:13,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:41:15,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:41:15,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 14:41:16,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:41:18,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 14:41:21,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:41:21,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:41:23,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:24,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 14:41:24,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:41:24,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 14:41:25,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 14:41:27,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 14:41:27,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:27,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:41:27,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:41:27,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 14:41:29,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:41:29,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 14:41:30,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:41:30,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1302846.6666666667, ans=0.0 2023-10-03 14:41:32,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:41:33,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 14:41:33,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 14:41:39,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:41:40,825 INFO [train.py:1046] (3/4) Epoch 37, batch 4200, loss[loss=0.1711, simple_loss=0.2588, pruned_loss=0.04168, over 24569.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.237, pruned_loss=0.04024, over 4703137.60 frames. ], batch size: 71, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:41:40,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 14:41:42,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:41:45,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:41:45,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:41:46,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:41:46,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:41:49,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 14:41:50,165 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.55 vs. limit=15.0 2023-10-03 14:41:52,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 14:41:54,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:41:57,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:41:59,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:42:03,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 14:42:04,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:42:04,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:42:05,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 14:42:05,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:42:07,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:42:07,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:42:08,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:42:08,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:42:10,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 14:42:11,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:42:16,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 14:42:17,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:42:19,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:42:19,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.41 vs. limit=15.0 2023-10-03 14:42:20,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:42:22,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:42:22,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 14:42:23,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:42:25,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:42:30,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 14:42:31,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:42:37,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:42:38,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 14:42:38,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1303180.0, ans=0.0 2023-10-03 14:42:41,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:42:47,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:42:47,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:42:50,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 14:42:52,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1303180.0, ans=0.125 2023-10-03 14:42:53,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1303246.6666666667, ans=0.0 2023-10-03 14:42:54,556 INFO [train.py:1046] (3/4) Epoch 37, batch 4250, loss[loss=0.1575, simple_loss=0.2249, pruned_loss=0.04507, over 23548.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.236, pruned_loss=0.04005, over 4698638.47 frames. ], batch size: 256, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:42:54,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 14:42:54,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1303246.6666666667, ans=0.125 2023-10-03 14:42:57,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 14:42:57,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 14:43:02,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:06,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:43:06,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 14:43:08,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:43:10,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:13,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:43:17,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:17,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:20,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:43:20,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:43:21,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:21,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:23,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:25,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:43:25,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:43:26,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 14:43:29,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 14:43:29,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:30,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1303380.0, ans=0.125 2023-10-03 14:43:31,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:43:31,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:43:32,485 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.912e+02 2.052e+02 2.334e+02 3.222e+02, threshold=4.104e+02, percent-clipped=0.0 2023-10-03 14:43:32,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:43:32,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:32,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:43:35,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 14:43:36,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:43:39,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1303446.6666666667, ans=0.125 2023-10-03 14:43:40,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:43:42,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:43:42,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 14:43:44,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:43:44,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 14:43:46,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:43:47,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:43:47,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1303446.6666666667, ans=0.125 2023-10-03 14:43:47,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1303446.6666666667, ans=0.07 2023-10-03 14:43:48,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:48,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:43:50,315 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:43:52,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 14:43:54,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:43:54,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:43:59,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:43:59,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:44:01,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:44:02,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:44:02,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1303513.3333333333, ans=0.1 2023-10-03 14:44:03,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:44:03,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:44:05,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:44:05,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 14:44:06,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:44:06,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1303513.3333333333, ans=0.5 2023-10-03 14:44:07,160 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.24 vs. limit=15.0 2023-10-03 14:44:09,544 INFO [train.py:1046] (3/4) Epoch 37, batch 4300, loss[loss=0.149, simple_loss=0.2271, pruned_loss=0.03544, over 22873.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2365, pruned_loss=0.03993, over 4704380.90 frames. ], batch size: 50, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:44:11,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:44:11,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:44:16,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:44:20,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1303580.0, ans=0.1 2023-10-03 14:44:23,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:44:23,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 14:44:25,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:44:26,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:44:28,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:44:28,248 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 14:44:29,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:44:31,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:44:33,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 14:44:33,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:44:34,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1303646.6666666667, ans=0.125 2023-10-03 14:44:35,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 14:44:37,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:44:39,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:44:41,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:44:41,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:44:43,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:44:44,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:44:46,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:44:46,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 14:44:48,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 14:44:51,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:44:53,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:44:53,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:44:53,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:44:54,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:44:54,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 14:44:54,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 14:44:54,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 14:44:56,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:44:56,603 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:44:58,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 14:44:58,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 14:45:00,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:45:02,334 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 14:45:02,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1303780.0, ans=0.1 2023-10-03 14:45:03,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:45:06,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:06,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:45:07,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 14:45:09,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 14:45:09,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:45:09,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:45:09,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:45:09,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:45:12,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:45:12,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1303846.6666666667, ans=0.1 2023-10-03 14:45:13,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:14,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:45:15,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:45:21,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 14:45:22,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 14:45:24,269 INFO [train.py:1046] (3/4) Epoch 37, batch 4350, loss[loss=0.1463, simple_loss=0.2269, pruned_loss=0.03281, over 24539.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2371, pruned_loss=0.03976, over 4712774.96 frames. ], batch size: 60, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:45:25,176 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.46 vs. limit=10.0 2023-10-03 14:45:26,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:45:29,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:29,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1303913.3333333333, ans=0.1 2023-10-03 14:45:30,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1303913.3333333333, ans=0.1 2023-10-03 14:45:32,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:45:32,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:45:37,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:45:40,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:45:43,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:45:43,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:45:46,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:45:49,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:45:51,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:45:51,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1303980.0, ans=0.0 2023-10-03 14:45:57,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 14:45:57,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:45:57,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:03,753 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.872e+02 2.033e+02 2.293e+02 3.578e+02, threshold=4.065e+02, percent-clipped=0.0 2023-10-03 14:46:03,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:06,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1304046.6666666667, ans=0.125 2023-10-03 14:46:07,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 14:46:10,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:46:11,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:46:14,782 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 14:46:16,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:46:16,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:46:16,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1304113.3333333333, ans=0.04949747468305833 2023-10-03 14:46:17,592 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 14:46:18,889 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 14:46:18,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:46:18,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:46:20,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:46:21,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:46:21,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:46:23,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:46:25,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 14:46:25,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:25,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:46:25,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:26,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 14:46:27,908 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 14:46:27,922 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 14:46:27,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 14:46:31,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:46:31,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:46:31,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:46:31,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:46:34,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 14:46:35,816 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 14:46:35,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:36,450 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=12.0 2023-10-03 14:46:37,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1304246.6666666667, ans=0.125 2023-10-03 14:46:38,486 INFO [train.py:1046] (3/4) Epoch 37, batch 4400, loss[loss=0.1625, simple_loss=0.2387, pruned_loss=0.04319, over 22826.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2382, pruned_loss=0.0399, over 4730055.05 frames. ], batch size: 322, lr: 2.74e-03, grad_scale: 16.0 2023-10-03 14:46:38,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:46:38,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:41,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:46:44,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 14:46:44,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 14:46:44,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 14:46:44,120 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 14:46:45,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 14:46:45,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:46:48,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 14:46:50,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:46:52,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:46:52,959 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 14:46:56,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:46:56,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 14:46:56,157 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 14:46:58,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 14:47:00,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 14:47:00,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 14:47:00,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:01,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:47:02,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:47:04,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:47:06,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 14:47:06,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 14:47:08,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:47:09,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:47:09,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:47:10,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:10,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:47:10,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 14:47:11,002 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 14:47:15,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:20,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:47:22,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 14:47:25,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:47:29,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:47:32,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:47:32,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 14:47:32,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:47:32,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:47:32,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:47:32,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:47:36,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 14:47:39,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.70 vs. limit=15.0 2023-10-03 14:47:39,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 14:47:40,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 14:47:41,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:47:41,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 14:47:42,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:47:45,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:47:46,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 14:47:48,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1304513.3333333333, ans=0.125 2023-10-03 14:47:50,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:47:51,870 INFO [train.py:1046] (3/4) Epoch 37, batch 4450, loss[loss=0.1708, simple_loss=0.2552, pruned_loss=0.04327, over 24138.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2394, pruned_loss=0.04015, over 4732875.91 frames. ], batch size: 80, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:47:53,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:47:54,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1304580.0, ans=0.2 2023-10-03 14:47:55,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:48:01,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:48:01,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:48:04,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:06,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:48:07,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:48:07,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:48:10,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 14:48:10,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:48:11,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:11,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:48:11,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 14:48:13,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 14:48:17,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:18,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:18,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:48:20,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:48:21,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:48:26,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 14:48:27,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 14:48:27,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 14:48:27,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:48:32,541 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.931e+02 2.177e+02 2.666e+02 4.430e+02, threshold=4.354e+02, percent-clipped=2.0 2023-10-03 14:48:32,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:48:34,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 14:48:36,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1304780.0, ans=0.0 2023-10-03 14:48:37,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:48:41,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:41,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 14:48:41,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:41,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:48:42,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:48:42,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:48:44,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:48:47,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 14:48:47,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 14:48:48,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:48:51,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:48:51,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1304846.6666666667, ans=0.125 2023-10-03 14:48:52,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:48:54,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:48:54,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:48:56,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:48:56,592 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.70 vs. limit=15.0 2023-10-03 14:48:59,149 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.69 vs. limit=6.0 2023-10-03 14:48:59,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 14:49:01,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:49:03,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1304846.6666666667, ans=0.125 2023-10-03 14:49:03,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1304846.6666666667, ans=0.5 2023-10-03 14:49:05,949 INFO [train.py:1046] (3/4) Epoch 37, batch 4500, loss[loss=0.1563, simple_loss=0.2376, pruned_loss=0.03752, over 24433.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2397, pruned_loss=0.04021, over 4731076.67 frames. ], batch size: 63, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:49:06,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:49:09,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 14:49:09,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 14:49:10,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:49:12,543 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.09 vs. limit=15.0 2023-10-03 14:49:15,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:49:16,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:49:17,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 14:49:17,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:49:17,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:49:18,120 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.35 vs. limit=12.0 2023-10-03 14:49:18,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:49:30,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:49:30,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1304980.0, ans=0.0 2023-10-03 14:49:31,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:49:33,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:49:33,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:49:35,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:49:44,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:49:46,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1305046.6666666667, ans=0.0 2023-10-03 14:49:47,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:49:51,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:49:52,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:49:52,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 14:49:54,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:49:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:49:57,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:49:57,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:50:00,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:50:00,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 14:50:00,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 14:50:00,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:05,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:50:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 14:50:07,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1305180.0, ans=0.2 2023-10-03 14:50:08,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:11,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:50:11,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:50:12,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 14:50:14,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 14:50:14,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 14:50:19,398 INFO [train.py:1046] (3/4) Epoch 37, batch 4550, loss[loss=0.1626, simple_loss=0.2515, pruned_loss=0.03682, over 24648.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2378, pruned_loss=0.0396, over 4715743.94 frames. ], batch size: 73, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:50:19,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 14:50:20,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 14:50:22,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:50:25,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:50:25,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:50:27,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:50:31,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:50:32,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:50:34,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:50:34,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:50:34,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:36,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:50:36,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1305313.3333333333, ans=0.125 2023-10-03 14:50:37,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:50:39,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:50:42,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 14:50:42,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1305313.3333333333, ans=0.125 2023-10-03 14:50:43,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 14:50:45,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 14:50:46,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 14:50:49,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 14:50:50,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:50:51,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1305380.0, ans=0.1 2023-10-03 14:50:54,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 14:50:56,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:50:58,877 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.866e+02 2.085e+02 2.369e+02 3.164e+02, threshold=4.169e+02, percent-clipped=0.0 2023-10-03 14:50:58,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:59,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:50:59,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 14:51:01,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 14:51:04,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:51:06,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:51:06,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:51:08,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:51:10,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 14:51:11,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 14:51:11,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:51:11,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1305446.6666666667, ans=0.0 2023-10-03 14:51:12,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 14:51:14,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 14:51:15,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1305446.6666666667, ans=0.1 2023-10-03 14:51:16,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:51:16,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:16,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:51:19,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:51:19,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:51:20,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 14:51:20,688 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:51:21,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 14:51:23,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:51:23,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 14:51:23,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 14:51:23,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:51:23,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 14:51:24,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:51:26,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:51:27,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:51:27,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:51:27,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 14:51:29,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:51:32,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 14:51:34,112 INFO [train.py:1046] (3/4) Epoch 37, batch 4600, loss[loss=0.1364, simple_loss=0.1874, pruned_loss=0.04274, over 19107.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2368, pruned_loss=0.03942, over 4710118.36 frames. ], batch size: 388, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:51:34,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:35,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:51:38,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:51:38,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:51:40,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:51:41,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 14:51:42,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:51:43,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1305580.0, ans=0.125 2023-10-03 14:51:44,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:51:46,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:51:49,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:53,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1305646.6666666667, ans=0.125 2023-10-03 14:51:54,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1305646.6666666667, ans=0.0 2023-10-03 14:51:55,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 14:51:57,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:51:59,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1305646.6666666667, ans=0.125 2023-10-03 14:52:00,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:00,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1305646.6666666667, ans=0.125 2023-10-03 14:52:03,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:52:03,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:52:11,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 14:52:11,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:52:12,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:52:15,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:16,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:52:18,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:52:21,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 14:52:22,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 14:52:25,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:27,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:52:27,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1305780.0, ans=0.0 2023-10-03 14:52:28,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:28,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 14:52:29,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:31,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 14:52:31,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:31,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:52:33,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:52:34,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:52:34,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:52:34,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 14:52:36,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 14:52:36,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 14:52:36,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:52:36,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:52:37,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:52:37,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:52:41,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1305846.6666666667, ans=0.1 2023-10-03 14:52:48,330 INFO [train.py:1046] (3/4) Epoch 37, batch 4650, loss[loss=0.1771, simple_loss=0.2494, pruned_loss=0.05237, over 23676.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2357, pruned_loss=0.03894, over 4712682.90 frames. ], batch size: 164, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:52:48,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:52:49,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:52:49,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:51,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:52:51,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:52:51,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:52:52,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:52:55,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 14:52:59,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:53:01,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 14:53:01,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:53:02,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 14:53:02,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:53:03,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 14:53:04,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 14:53:04,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 14:53:05,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 14:53:07,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:07,729 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 14:53:10,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:11,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 14:53:14,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:14,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:53:15,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 14:53:17,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:53:20,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:53:23,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:53:27,826 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.915e+02 2.038e+02 2.271e+02 3.983e+02, threshold=4.075e+02, percent-clipped=0.0 2023-10-03 14:53:27,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:29,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:31,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:53:31,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 14:53:35,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 14:53:35,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 14:53:36,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 14:53:36,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 14:53:37,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:53:46,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:53:46,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:53:46,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 14:53:47,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:53:48,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:53:50,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:53:50,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 14:53:52,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 14:53:52,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:53:54,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:53:57,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:53:57,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:53:57,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 14:53:58,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 14:53:58,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 14:53:58,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1306180.0, ans=0.125 2023-10-03 14:54:00,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 14:54:01,695 INFO [train.py:1046] (3/4) Epoch 37, batch 4700, loss[loss=0.1684, simple_loss=0.2471, pruned_loss=0.04482, over 23731.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2367, pruned_loss=0.03928, over 4712607.07 frames. ], batch size: 85, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:54:02,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1306246.6666666667, ans=0.04949747468305833 2023-10-03 14:54:06,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1306246.6666666667, ans=0.125 2023-10-03 14:54:08,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:09,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:54:09,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:54:10,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:54:12,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 14:54:17,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 14:54:17,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 14:54:19,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:20,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1306313.3333333333, ans=0.0 2023-10-03 14:54:21,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:54:21,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:54:24,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:24,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1306313.3333333333, ans=0.125 2023-10-03 14:54:29,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:54:30,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 14:54:32,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:54:36,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1306380.0, ans=0.1 2023-10-03 14:54:37,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 14:54:38,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 14:54:40,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:54:43,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 14:54:45,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:54:48,543 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 14:54:49,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:54:50,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 14:54:52,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:54:53,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:54:56,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:54:57,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:54:57,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 14:54:59,229 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 14:54:59,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1306513.3333333333, ans=0.05 2023-10-03 14:55:00,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:55:02,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:55:02,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:55:02,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 14:55:04,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:55:07,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 14:55:10,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:55:11,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:55:14,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:55:16,110 INFO [train.py:1046] (3/4) Epoch 37, batch 4750, loss[loss=0.1406, simple_loss=0.218, pruned_loss=0.03154, over 24610.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2372, pruned_loss=0.03934, over 4720984.92 frames. ], batch size: 60, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:55:16,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:55:17,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 14:55:18,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:55:22,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1306580.0, ans=0.0 2023-10-03 14:55:23,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 14:55:24,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:55:24,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:55:25,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:55:28,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 14:55:38,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 14:55:39,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 14:55:40,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:55:43,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:55:43,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:55:43,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:55:44,591 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 14:55:44,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 14:55:50,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 14:55:53,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:55:54,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:55:57,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:55:57,639 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 14:55:57,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:55:59,039 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.889e+02 2.021e+02 2.290e+02 3.051e+02, threshold=4.042e+02, percent-clipped=0.0 2023-10-03 14:56:00,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 14:56:02,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 14:56:02,498 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.25 vs. limit=15.0 2023-10-03 14:56:05,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 14:56:05,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 14:56:06,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:56:06,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:56:06,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:56:08,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 14:56:08,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 14:56:11,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 14:56:12,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:56:13,396 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.76 vs. limit=15.0 2023-10-03 14:56:14,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:56:14,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 14:56:15,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:56:16,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:56:19,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 14:56:20,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1306846.6666666667, ans=0.125 2023-10-03 14:56:21,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:56:21,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 14:56:25,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:56:25,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 14:56:26,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 14:56:26,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 14:56:29,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1306846.6666666667, ans=0.2 2023-10-03 14:56:30,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 14:56:30,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:56:31,511 INFO [train.py:1046] (3/4) Epoch 37, batch 4800, loss[loss=0.1616, simple_loss=0.2392, pruned_loss=0.04195, over 23374.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2381, pruned_loss=0.03963, over 4715942.28 frames. ], batch size: 134, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:56:31,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 14:56:36,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:56:36,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:56:41,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1306913.3333333333, ans=0.125 2023-10-03 14:56:42,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 14:56:43,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:56:43,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:56:44,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 14:56:46,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:56:46,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 14:56:47,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 14:56:50,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:56:51,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:56:52,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 14:56:53,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:56:53,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 14:56:53,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:56:55,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:56:58,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:57:00,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:57:02,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 14:57:02,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:57:02,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 14:57:05,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:08,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 14:57:08,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 14:57:10,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:10,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:57:11,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:57:11,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:57:11,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:57:13,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:57:13,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:57:17,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:57:18,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:20,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:57:23,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1307113.3333333333, ans=0.0 2023-10-03 14:57:24,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 14:57:26,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:57:28,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:28,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:57:28,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:28,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1307113.3333333333, ans=0.125 2023-10-03 14:57:32,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:57:33,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 14:57:33,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:35,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 14:57:35,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 14:57:35,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1307180.0, ans=0.0 2023-10-03 14:57:36,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 14:57:38,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1307180.0, ans=0.0 2023-10-03 14:57:41,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:57:41,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:41,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:57:42,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 14:57:44,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1307246.6666666667, ans=0.0 2023-10-03 14:57:44,805 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.21 vs. limit=15.0 2023-10-03 14:57:45,536 INFO [train.py:1046] (3/4) Epoch 37, batch 4850, loss[loss=0.157, simple_loss=0.2345, pruned_loss=0.03972, over 24061.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2379, pruned_loss=0.03961, over 4718624.59 frames. ], batch size: 80, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:57:46,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 14:57:46,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:57:46,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:57:47,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:57:47,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:57:50,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:57:54,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1307246.6666666667, ans=0.0 2023-10-03 14:57:57,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 14:57:58,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:58:01,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:58:03,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 14:58:03,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:58:05,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.12 vs. limit=6.0 2023-10-03 14:58:07,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 14:58:07,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 14:58:09,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1307313.3333333333, ans=0.125 2023-10-03 14:58:10,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 14:58:10,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 14:58:13,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:58:15,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 14:58:15,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 14:58:17,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 14:58:17,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 14:58:19,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 14:58:19,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:25,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:25,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 14:58:25,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 14:58:26,625 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.885e+02 2.047e+02 2.380e+02 3.001e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 14:58:26,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 14:58:34,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 14:58:34,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 14:58:34,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 14:58:34,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 14:58:36,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1307446.6666666667, ans=0.2 2023-10-03 14:58:37,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 14:58:39,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 14:58:39,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:39,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 14:58:39,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:58:40,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:58:41,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 14:58:49,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:58:55,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 14:58:55,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:58:57,564 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.32 vs. limit=15.0 2023-10-03 14:58:58,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1307580.0, ans=0.0 2023-10-03 14:58:59,480 INFO [train.py:1046] (3/4) Epoch 37, batch 4900, loss[loss=0.1627, simple_loss=0.2521, pruned_loss=0.03659, over 24669.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2376, pruned_loss=0.03953, over 4708637.20 frames. ], batch size: 68, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 14:59:01,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 14:59:01,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 14:59:04,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1307580.0, ans=0.0 2023-10-03 14:59:06,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:59:07,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:59:08,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 14:59:12,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 14:59:16,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 14:59:19,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 14:59:21,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 14:59:21,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:59:22,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 14:59:22,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 14:59:22,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:59:22,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 14:59:23,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 14:59:25,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 14:59:25,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1307646.6666666667, ans=0.125 2023-10-03 14:59:26,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 14:59:29,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 14:59:29,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 14:59:31,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 14:59:32,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:59:32,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:59:32,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 14:59:34,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 14:59:35,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 14:59:35,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 14:59:35,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 14:59:40,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 14:59:41,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 14:59:44,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 14:59:44,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 14:59:44,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 14:59:45,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 14:59:45,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 14:59:45,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 14:59:49,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 14:59:51,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 14:59:53,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 14:59:54,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 14:59:55,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 14:59:56,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 14:59:56,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 15:00:02,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:00:03,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:00:04,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 15:00:04,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 15:00:05,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:00:06,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:00:08,975 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.51 vs. limit=15.0 2023-10-03 15:00:11,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:00:11,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:00:11,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:00:12,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 15:00:14,127 INFO [train.py:1046] (3/4) Epoch 37, batch 4950, loss[loss=0.1502, simple_loss=0.2331, pruned_loss=0.0337, over 24271.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2361, pruned_loss=0.03927, over 4715263.29 frames. ], batch size: 61, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 15:00:14,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:00:16,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:00:16,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 15:00:20,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 15:00:20,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 15:00:21,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:00:21,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 15:00:21,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:21,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:00:23,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:00:23,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:00:26,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:00:27,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:00:28,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:00:30,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:00:31,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:33,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:00:37,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:00:40,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:40,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:00:44,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:00:44,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:00:44,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:00:45,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 15:00:45,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1308046.6666666667, ans=0.0 2023-10-03 15:00:47,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 15:00:49,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:00:50,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:00:50,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:00:51,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:00:51,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:00:52,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:00:54,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:00:56,284 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.953e+02 2.233e+02 2.560e+02 3.668e+02, threshold=4.465e+02, percent-clipped=0.0 2023-10-03 15:00:56,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:00:58,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1308113.3333333333, ans=0.0 2023-10-03 15:00:59,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:01:00,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1308113.3333333333, ans=0.1 2023-10-03 15:01:01,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:01:01,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:01:01,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 15:01:03,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:01:03,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:01:08,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:01:10,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:01:10,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:01:10,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:01:12,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:01:13,302 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.82 vs. limit=15.0 2023-10-03 15:01:13,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:01:15,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:01:15,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:01:15,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:01:16,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 15:01:21,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:01:25,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 15:01:27,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:01:28,805 INFO [train.py:1046] (3/4) Epoch 37, batch 5000, loss[loss=0.1704, simple_loss=0.2449, pruned_loss=0.04792, over 23816.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2353, pruned_loss=0.03906, over 4704091.06 frames. ], batch size: 179, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 15:01:33,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1308246.6666666667, ans=0.1 2023-10-03 15:01:34,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:01:34,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:01:36,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 15:01:37,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 15:01:37,535 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:01:38,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:01:41,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 15:01:41,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:01:41,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:01:42,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 15:01:42,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:01:42,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:01:44,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 15:01:44,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:01:44,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:01:46,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 15:01:47,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 15:01:47,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1308313.3333333333, ans=0.0 2023-10-03 15:01:49,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:01:49,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 15:01:49,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 15:01:49,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:01:49,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1308313.3333333333, ans=0.125 2023-10-03 15:01:50,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:01:50,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 15:01:50,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 15:01:52,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 15:01:53,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:01:53,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:01:53,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 15:01:55,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:01:56,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:01:56,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:01:56,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1308380.0, ans=0.2 2023-10-03 15:01:57,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 15:01:59,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 15:02:01,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:02:02,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:02:06,853 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 15:02:09,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:02:09,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:02:09,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:13,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 15:02:13,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:02:15,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:02:15,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:02:17,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 15:02:17,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:02:19,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:02:20,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:02:26,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 15:02:29,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:37,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.08 vs. limit=22.5 2023-10-03 15:02:37,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:02:40,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:40,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:02:40,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:02:40,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:02:41,565 INFO [train.py:1046] (3/4) Epoch 37, batch 5050, loss[loss=0.1694, simple_loss=0.2404, pruned_loss=0.0492, over 22885.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2359, pruned_loss=0.03883, over 4705210.24 frames. ], batch size: 322, lr: 2.74e-03, grad_scale: 8.0 2023-10-03 15:02:41,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:02:41,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:48,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:02:48,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 15:02:49,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:02:50,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:02:52,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:02:53,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 15:02:54,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:02:54,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:02:55,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:02:55,983 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:02:57,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:02:58,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:03:04,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1308646.6666666667, ans=0.0 2023-10-03 15:03:06,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 15:03:06,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:03:07,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:03:08,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 15:03:08,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:03:10,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:03:10,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:03:11,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:03:11,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 15:03:11,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1308713.3333333333, ans=0.125 2023-10-03 15:03:12,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 15:03:14,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:03:16,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:03:19,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:03:19,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 15:03:21,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:03:23,939 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.822e+02 1.997e+02 2.206e+02 4.245e+02, threshold=3.993e+02, percent-clipped=0.0 2023-10-03 15:03:24,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 15:03:25,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:03:25,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:03:25,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1308780.0, ans=0.2 2023-10-03 15:03:26,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:03:26,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:03:28,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:03:29,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:03:31,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:31,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:03:31,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:03:32,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 15:03:33,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:03:35,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:03:38,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:03:38,428 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 15:03:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:03:39,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:03:41,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:41,180 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 15:03:42,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:03:42,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 15:03:42,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:43,270 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.64 vs. limit=15.0 2023-10-03 15:03:47,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:03:49,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:03:49,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 15:03:50,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 15:03:53,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:03:53,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:03:54,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:03:55,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.11 vs. limit=10.0 2023-10-03 15:03:56,090 INFO [train.py:1046] (3/4) Epoch 37, batch 5100, loss[loss=0.159, simple_loss=0.2382, pruned_loss=0.03985, over 23609.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2369, pruned_loss=0.03899, over 4716334.07 frames. ], batch size: 135, lr: 2.73e-03, grad_scale: 8.0 2023-10-03 15:03:56,184 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 15:03:57,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:04:01,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 15:04:01,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1308913.3333333333, ans=0.1 2023-10-03 15:04:02,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 15:04:03,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:04:06,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:04:06,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1308913.3333333333, ans=0.1 2023-10-03 15:04:09,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:04:09,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 15:04:09,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 15:04:13,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:04:14,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:04:17,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:04:19,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1308980.0, ans=0.125 2023-10-03 15:04:20,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 15:04:20,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:04:22,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:04:22,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 15:04:25,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:26,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:26,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 15:04:28,168 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 15:04:29,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:29,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 15:04:29,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 15:04:32,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:04:39,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:04:42,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 15:04:42,636 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 15:04:43,924 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 15:04:45,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 15:04:45,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:04:47,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 15:04:49,947 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.32 vs. limit=10.0 2023-10-03 15:04:52,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 15:04:55,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:04:56,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:04:57,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 15:04:59,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.18 vs. limit=15.0 2023-10-03 15:05:00,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:05:00,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1309180.0, ans=0.0 2023-10-03 15:05:00,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1309180.0, ans=0.0 2023-10-03 15:05:02,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 15:05:06,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:05:06,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:05:06,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:05:08,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:05:08,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:05:09,435 INFO [train.py:1046] (3/4) Epoch 37, batch 5150, loss[loss=0.16, simple_loss=0.2535, pruned_loss=0.03321, over 24291.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2375, pruned_loss=0.03914, over 4707946.90 frames. ], batch size: 74, lr: 2.73e-03, grad_scale: 8.0 2023-10-03 15:05:09,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:05:09,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 15:05:09,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 15:05:09,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 15:05:09,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:05:09,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 15:05:09,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1309246.6666666667, ans=0.125 2023-10-03 15:05:11,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:05:12,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 15:05:13,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:05:13,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:05:17,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1309246.6666666667, ans=0.125 2023-10-03 15:05:19,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 15:05:19,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 15:05:20,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:05:21,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:05:23,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:05:23,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:05:23,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:05:25,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:05:25,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:05:25,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 15:05:26,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:05:28,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:05:31,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 15:05:31,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 15:05:32,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:05:36,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:05:39,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1309380.0, ans=0.09899494936611666 2023-10-03 15:05:40,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 15:05:42,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:05:47,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:05:47,635 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:05:49,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:05:51,926 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.958e+02 2.128e+02 2.414e+02 3.634e+02, threshold=4.256e+02, percent-clipped=0.0 2023-10-03 15:05:52,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:05:52,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:05:53,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 15:05:55,554 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.99 vs. limit=22.5 2023-10-03 15:05:59,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:05:59,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:06:00,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:06:03,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:06:03,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:06:04,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 15:06:09,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:06:10,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:06:13,004 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.70 vs. limit=15.0 2023-10-03 15:06:13,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:06:13,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:06:15,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:06:15,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:06:16,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:06:16,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:06:22,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:06:24,036 INFO [train.py:1046] (3/4) Epoch 37, batch 5200, loss[loss=0.1728, simple_loss=0.2609, pruned_loss=0.04237, over 24489.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2389, pruned_loss=0.03985, over 4698628.45 frames. ], batch size: 66, lr: 2.73e-03, grad_scale: 16.0 2023-10-03 15:06:24,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:06:27,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:06:30,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 15:06:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:06:31,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:06:34,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:06:35,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:06:35,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:06:36,259 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.33 vs. limit=15.0 2023-10-03 15:06:38,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 15:06:40,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:06:41,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:06:43,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 15:06:45,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:06:47,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:06:47,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 15:06:48,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 15:06:51,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 15:06:51,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:06:51,947 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 15:06:51,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:06:54,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1309713.3333333333, ans=0.2 2023-10-03 15:06:55,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:06:55,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:06:56,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 15:06:57,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:06:59,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:07:02,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 15:07:02,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 15:07:02,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 15:07:07,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 15:07:08,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:07:14,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:07:14,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:07:17,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 15:07:17,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:07:17,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:07:17,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:07:18,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:07:20,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:07:20,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:07:24,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1309846.6666666667, ans=0.025 2023-10-03 15:07:25,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:07:27,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:07:27,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:07:31,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1309846.6666666667, ans=0.125 2023-10-03 15:07:34,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:07:34,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 15:07:35,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:07:35,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:07:37,235 INFO [train.py:1046] (3/4) Epoch 37, batch 5250, loss[loss=0.1605, simple_loss=0.2484, pruned_loss=0.03635, over 24582.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2384, pruned_loss=0.03934, over 4713986.11 frames. ], batch size: 71, lr: 2.73e-03, grad_scale: 16.0 2023-10-03 15:07:37,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:07:37,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:07:38,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:07:40,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:07:43,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:07:44,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:07:46,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:07:50,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:07:51,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:07:53,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:07:56,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:07:57,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 15:07:59,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:07:59,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:08:03,172 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.00 vs. limit=6.0 2023-10-03 15:08:05,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1310046.6666666667, ans=0.125 2023-10-03 15:08:15,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1310046.6666666667, ans=0.125 2023-10-03 15:08:18,118 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.876e+02 2.013e+02 2.256e+02 3.803e+02, threshold=4.026e+02, percent-clipped=0.0 2023-10-03 15:08:19,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1310113.3333333333, ans=0.07 2023-10-03 15:08:37,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1310180.0, ans=0.125 2023-10-03 15:08:45,609 INFO [train.py:1046] (3/4) Epoch 37, batch 5300, loss[loss=0.1392, simple_loss=0.2169, pruned_loss=0.03079, over 21182.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2368, pruned_loss=0.03931, over 4693767.83 frames. ], batch size: 46, lr: 2.73e-03, grad_scale: 8.0 2023-10-03 15:09:00,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:09:00,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 15:09:00,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 15:09:00,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:09:00,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:00,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:00,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:00,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:09:00,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:00,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:09:00,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:09:00,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:09:00,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 15:09:00,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 15:09:00,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 15:09:01,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:09:01,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 15:09:01,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 15:09:01,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:02,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:09:02,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:09:02,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:09:02,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:09:02,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:09:02,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:09:02,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:02,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:09:02,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:09:02,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:09:02,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:02,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:09:03,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 15:09:03,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:09:03,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:09:03,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 15:09:03,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 15:09:03,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:09:03,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:03,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 15:09:03,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 15:09:03,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:09:04,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:09:04,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:09:04,988 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 15:09:05,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 15:09:05,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:09:05,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:09:05,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 15:09:05,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 15:09:05,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 15:09:05,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:09:12,151 INFO [train.py:1046] (3/4) Epoch 38, batch 0, loss[loss=0.1621, simple_loss=0.238, pruned_loss=0.0431, over 21818.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.238, pruned_loss=0.0431, over 21818.00 frames. ], batch size: 47, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:09:12,151 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 15:09:24,069 INFO [train.py:1078] (3/4) Epoch 38, validation: loss=0.3257, simple_loss=0.2715, pruned_loss=0.1899, over 1125622.00 frames. 2023-10-03 15:09:24,070 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 15:09:27,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 15:09:28,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:09:30,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1310326.6666666667, ans=0.2 2023-10-03 15:09:31,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:09:35,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:36,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:09:36,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:36,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1310326.6666666667, ans=0.0 2023-10-03 15:09:37,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 15:09:38,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 15:09:40,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:40,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:42,787 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.00 vs. limit=15.0 2023-10-03 15:09:43,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:09:44,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:44,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:09:44,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:09:46,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1310393.3333333333, ans=0.2 2023-10-03 15:09:47,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 15:09:49,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:09:55,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:09:55,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:09:57,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 15:10:01,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:10:01,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:10:01,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1310460.0, ans=0.0 2023-10-03 15:10:02,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:10:07,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:10:10,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:10:16,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 15:10:18,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 15:10:18,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:10:18,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:10:20,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:10:22,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:10:23,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 15:10:28,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:10:28,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:10:32,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:10:35,224 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 15:10:35,935 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.20 vs. limit=22.5 2023-10-03 15:10:37,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:10:38,630 INFO [train.py:1046] (3/4) Epoch 38, batch 50, loss[loss=0.1485, simple_loss=0.2272, pruned_loss=0.03492, over 23599.00 frames. ], tot_loss[loss=0.1621, simple_loss=0.2411, pruned_loss=0.04149, over 1064239.41 frames. ], batch size: 232, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:10:38,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:10:40,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:10:40,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 15:10:40,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1310660.0, ans=0.2 2023-10-03 15:10:41,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:10:41,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:10:43,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:10:45,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:10:47,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:10:51,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 15:10:51,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:10:57,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:10:58,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1310726.6666666667, ans=0.125 2023-10-03 15:10:59,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 15:11:01,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 15:11:02,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:11:03,995 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.896e+02 2.083e+02 2.341e+02 4.077e+02, threshold=4.166e+02, percent-clipped=1.0 2023-10-03 15:11:04,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:11:04,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:11:04,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:11:04,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:11:05,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:11:05,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:11:07,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1310793.3333333333, ans=0.2 2023-10-03 15:11:14,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:11:14,480 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:11:17,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:11:17,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:11:18,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 15:11:20,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:11:21,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:11:21,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 15:11:23,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:11:24,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 15:11:25,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1310860.0, ans=0.125 2023-10-03 15:11:31,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:11:31,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:11:32,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:11:33,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:11:33,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:11:35,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1310860.0, ans=0.125 2023-10-03 15:11:36,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 15:11:36,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 15:11:39,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:11:39,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:11:41,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:11:41,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:11:41,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 15:11:42,869 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.92 vs. limit=10.0 2023-10-03 15:11:43,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 15:11:45,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 15:11:46,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:11:46,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:11:48,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 15:11:48,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 15:11:48,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:11:49,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:11:51,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:11:51,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:11:52,672 INFO [train.py:1046] (3/4) Epoch 38, batch 100, loss[loss=0.1415, simple_loss=0.2209, pruned_loss=0.03107, over 19796.00 frames. ], tot_loss[loss=0.1614, simple_loss=0.2408, pruned_loss=0.04097, over 1868033.83 frames. ], batch size: 43, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:11:52,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:11:57,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:12:00,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:12:02,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 15:12:02,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:12:05,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:12:05,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:12:05,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:12:05,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1310993.3333333333, ans=0.2 2023-10-03 15:12:06,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:12:06,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:12:06,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1311060.0, ans=0.1 2023-10-03 15:12:07,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 15:12:10,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:12:10,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:12:10,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:12:10,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:12:13,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 15:12:15,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:12:15,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1311060.0, ans=0.125 2023-10-03 15:12:16,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:12:17,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:12:19,930 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.80 vs. limit=22.5 2023-10-03 15:12:20,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:12:25,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 15:12:25,123 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 15:12:26,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:12:26,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:12:29,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:12:30,785 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.59 vs. limit=10.0 2023-10-03 15:12:31,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:12:32,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:12:38,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:12:38,565 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 15:12:39,252 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.25 vs. limit=15.0 2023-10-03 15:12:41,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 15:12:41,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1311193.3333333333, ans=0.0 2023-10-03 15:12:44,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:12:45,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:12:47,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1311193.3333333333, ans=0.125 2023-10-03 15:12:48,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:12:52,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:12:56,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:12:57,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:13:00,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:13:00,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:00,787 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.15 vs. limit=15.0 2023-10-03 15:13:03,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:13:03,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:13:03,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:13:03,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 15:13:03,728 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 15:13:05,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:13:05,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:13:06,873 INFO [train.py:1046] (3/4) Epoch 38, batch 150, loss[loss=0.169, simple_loss=0.2435, pruned_loss=0.04725, over 23714.00 frames. ], tot_loss[loss=0.1602, simple_loss=0.2402, pruned_loss=0.04015, over 2511992.51 frames. ], batch size: 164, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:13:06,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:06,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:13:06,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 15:13:06,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 15:13:08,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:13:08,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:08,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:08,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1311326.6666666667, ans=0.0 2023-10-03 15:13:09,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:13:11,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:13:11,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:13:12,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:13:14,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1311326.6666666667, ans=0.125 2023-10-03 15:13:15,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:13:15,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:13:16,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:19,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:13:19,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:23,269 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.16 vs. limit=15.0 2023-10-03 15:13:24,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:13:24,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:28,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 15:13:28,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 15:13:28,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 15:13:31,439 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.858e+02 2.070e+02 2.351e+02 3.809e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-03 15:13:31,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:13:31,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:13:32,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:13:34,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:13:34,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:34,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:36,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:13:37,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 15:13:40,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:13:45,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:13:45,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1311460.0, ans=0.07 2023-10-03 15:13:49,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:13:49,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 15:13:53,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:13:53,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:13:53,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1311526.6666666667, ans=0.125 2023-10-03 15:13:54,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:13:56,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:13:59,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:14:00,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:14:00,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:02,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 15:14:05,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:07,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:07,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:14:07,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:14:10,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:12,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 15:14:13,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:14:13,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1311593.3333333333, ans=0.125 2023-10-03 15:14:14,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:14:16,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:14:18,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:14:18,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 15:14:18,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:14:19,011 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 15:14:20,206 INFO [train.py:1046] (3/4) Epoch 38, batch 200, loss[loss=0.1459, simple_loss=0.2307, pruned_loss=0.03052, over 24472.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2397, pruned_loss=0.03995, over 3006972.11 frames. ], batch size: 63, lr: 2.70e-03, grad_scale: 16.0 2023-10-03 15:14:22,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:14:25,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:14:25,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:14:29,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 15:14:29,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:14:30,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:31,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 15:14:32,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:14:33,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:35,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:14:38,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:14:38,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:14:38,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:14:43,240 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.42 vs. limit=22.5 2023-10-03 15:14:48,489 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:14:56,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:14:56,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:14:58,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:14:58,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:15:01,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 15:15:01,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:15:03,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:05,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:15:05,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:15:07,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:15:07,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 15:15:09,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 15:15:09,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:15:12,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:15:12,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1311860.0, ans=0.125 2023-10-03 15:15:16,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:15:23,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:23,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:15:27,065 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.49 vs. limit=15.0 2023-10-03 15:15:30,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:31,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 15:15:32,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:15:32,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:15:32,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:15:34,196 INFO [train.py:1046] (3/4) Epoch 38, batch 250, loss[loss=0.1626, simple_loss=0.243, pruned_loss=0.04113, over 24315.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2397, pruned_loss=0.04014, over 3391269.73 frames. ], batch size: 61, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:15:34,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:15:34,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 15:15:36,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:15:36,125 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 15:15:38,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:40,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:15:41,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:41,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:15:44,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:15:45,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:15:46,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:15:49,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:15:59,008 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.823e+02 2.031e+02 2.288e+02 2.955e+02, threshold=4.062e+02, percent-clipped=0.0 2023-10-03 15:16:01,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:16:02,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:16:02,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:16:04,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_na.min_abs, batch_count=1312126.6666666667, ans=0.02 2023-10-03 15:16:06,619 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.56 vs. limit=15.0 2023-10-03 15:16:10,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:16:11,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:16:13,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:16:13,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:16:13,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:16:13,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:16:15,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:16:15,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1312126.6666666667, ans=0.125 2023-10-03 15:16:19,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:16:20,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 15:16:20,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:16:22,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:16:22,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:16:22,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:16:23,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:16:25,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:16:25,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:16:25,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:16:26,978 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.14 vs. limit=15.0 2023-10-03 15:16:27,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:16:27,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:16:30,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:16:35,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:16:39,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:16:43,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:16:44,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:16:48,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 15:16:48,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:16:48,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:16:49,539 INFO [train.py:1046] (3/4) Epoch 38, batch 300, loss[loss=0.1619, simple_loss=0.2453, pruned_loss=0.03927, over 23477.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2382, pruned_loss=0.03913, over 3687155.44 frames. ], batch size: 106, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:16:49,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 15:16:49,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:16:51,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:16:51,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 15:16:55,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:16:56,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:16:59,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:16:59,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 15:17:01,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1312326.6666666667, ans=0.125 2023-10-03 15:17:02,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:17:02,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:17:04,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 15:17:04,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:17:09,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:17:13,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:17:13,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 15:17:16,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 15:17:18,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:19,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:17:19,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1312460.0, ans=0.125 2023-10-03 15:17:21,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:21,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 15:17:21,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:17:22,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:17:23,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:17:23,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:17:28,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:17:28,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 15:17:28,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:17:32,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:32,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 15:17:34,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:17:39,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:17:42,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:17:42,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 15:17:47,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:47,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:17:49,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:51,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:17:52,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 15:17:52,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:17:52,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:17:54,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 15:17:55,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:17:55,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:17:56,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:17:57,077 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:17:58,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:17:58,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:17:58,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1312593.3333333333, ans=0.04949747468305833 2023-10-03 15:18:02,278 INFO [train.py:1046] (3/4) Epoch 38, batch 350, loss[loss=0.1425, simple_loss=0.224, pruned_loss=0.03053, over 24493.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2364, pruned_loss=0.03867, over 3914802.00 frames. ], batch size: 63, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:18:04,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:18:04,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 15:18:06,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:12,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:18:14,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:16,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:16,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1312726.6666666667, ans=0.125 2023-10-03 15:18:17,176 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.83 vs. limit=22.5 2023-10-03 15:18:17,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 15:18:19,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:18:19,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 15:18:21,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:22,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 15:18:23,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:18:25,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 15:18:26,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:18:28,151 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.973e+02 2.217e+02 2.536e+02 3.904e+02, threshold=4.435e+02, percent-clipped=0.0 2023-10-03 15:18:29,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:18:29,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:18:31,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:18:31,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:18:31,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:18:32,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:32,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:18:34,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:18:34,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:40,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:18:40,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:18:41,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:18:41,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:46,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 15:18:46,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:18:46,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1312860.0, ans=0.2 2023-10-03 15:18:51,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:18:51,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:18:52,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:18:53,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 15:18:56,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:18:56,638 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 15:18:57,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 15:18:57,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:18:58,761 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.67 vs. limit=22.5 2023-10-03 15:19:01,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:19:01,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 15:19:04,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:07,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:19:07,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:19:09,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:09,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:19:10,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:19:14,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:19:17,104 INFO [train.py:1046] (3/4) Epoch 38, batch 400, loss[loss=0.1854, simple_loss=0.2651, pruned_loss=0.05279, over 23492.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2363, pruned_loss=0.03874, over 4095601.26 frames. ], batch size: 93, lr: 2.69e-03, grad_scale: 32.0 2023-10-03 15:19:17,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:19:17,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 15:19:17,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:18,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:19:20,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:19:20,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:23,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:19:23,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:24,906 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:19:26,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 15:19:27,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1312993.3333333333, ans=0.0 2023-10-03 15:19:28,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 15:19:28,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:19:28,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 15:19:30,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:33,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1313060.0, ans=0.0 2023-10-03 15:19:36,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:19:36,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:19:38,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 15:19:38,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:19:38,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:19:38,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:19:39,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:19:41,105 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 15:19:43,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 15:19:43,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1313060.0, ans=0.0 2023-10-03 15:19:46,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1313126.6666666667, ans=0.0 2023-10-03 15:19:47,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:19:48,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:19:48,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 15:19:50,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 15:19:50,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1313126.6666666667, ans=0.125 2023-10-03 15:19:54,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:19:57,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:20:03,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 15:20:07,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:20:09,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 15:20:11,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:20:12,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:20:12,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 15:20:16,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:20:18,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:20:20,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:20:21,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:20:21,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 15:20:21,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1313260.0, ans=0.125 2023-10-03 15:20:24,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:20:24,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 15:20:27,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:20:27,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:20:28,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 15:20:30,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:20:31,632 INFO [train.py:1046] (3/4) Epoch 38, batch 450, loss[loss=0.1576, simple_loss=0.2475, pruned_loss=0.03386, over 24309.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.237, pruned_loss=0.03881, over 4229903.80 frames. ], batch size: 74, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:20:31,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:20:31,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:20:31,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 15:20:32,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1313326.6666666667, ans=0.1 2023-10-03 15:20:33,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:20:33,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:20:33,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1313326.6666666667, ans=0.125 2023-10-03 15:20:34,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:20:34,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 15:20:34,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:20:36,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:20:39,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:20:46,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.20 vs. limit=15.0 2023-10-03 15:20:50,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:20:50,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:20:51,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 15:20:53,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 15:20:57,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:20:58,822 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.857e+02 2.046e+02 2.242e+02 3.489e+02, threshold=4.091e+02, percent-clipped=0.0 2023-10-03 15:20:58,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:21:01,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:21:01,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1313460.0, ans=0.0 2023-10-03 15:21:04,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:21:05,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:21:08,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 15:21:08,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 15:21:10,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 15:21:10,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:21:12,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:21:12,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:21:13,695 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 15:21:13,703 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 15:21:13,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:21:15,717 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.98 vs. limit=12.0 2023-10-03 15:21:16,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:21:17,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 15:21:18,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.29 vs. limit=15.0 2023-10-03 15:21:21,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1313526.6666666667, ans=0.125 2023-10-03 15:21:22,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:21:22,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:21:23,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 15:21:24,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 15:21:25,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:21:28,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:21:29,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:21:31,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 15:21:35,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:21:35,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 15:21:35,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 15:21:36,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:21:39,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1313593.3333333333, ans=0.0 2023-10-03 15:21:41,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:21:42,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:21:43,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:21:44,878 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 15:21:46,215 INFO [train.py:1046] (3/4) Epoch 38, batch 500, loss[loss=0.1429, simple_loss=0.2205, pruned_loss=0.03267, over 24463.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2373, pruned_loss=0.0391, over 4335761.55 frames. ], batch size: 58, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:21:49,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:21:50,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:21:52,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:21:52,286 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 15:21:53,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 15:21:53,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:21:55,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:21:59,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 15:22:01,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:22:02,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:22:02,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:22:03,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:15,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:22:15,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:22:17,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:22:17,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:22:17,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 15:22:18,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:22:20,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:22:21,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:22:21,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:22:21,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:22:22,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 15:22:24,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1313793.3333333333, ans=0.1 2023-10-03 15:22:27,472 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 15:22:30,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:22:30,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:31,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1313860.0, ans=0.125 2023-10-03 15:22:32,512 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.66 vs. limit=15.0 2023-10-03 15:22:33,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:33,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:33,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:22:34,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 15:22:37,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:22:38,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:22:41,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:22:46,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:22:49,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1313926.6666666667, ans=0.2 2023-10-03 15:22:50,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:22:52,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 15:22:53,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:22:53,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:22:57,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 15:22:57,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:22:58,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:23:00,352 INFO [train.py:1046] (3/4) Epoch 38, batch 550, loss[loss=0.1458, simple_loss=0.2268, pruned_loss=0.03236, over 24419.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2379, pruned_loss=0.03948, over 4421201.32 frames. ], batch size: 58, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:23:03,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 15:23:04,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 15:23:04,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:23:04,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 15:23:06,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:23:06,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:23:06,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:07,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:07,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:23:08,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:23:11,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:23:12,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 15:23:12,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:23:19,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:23:19,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:20,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:23:22,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:26,465 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.899e+02 2.084e+02 2.370e+02 3.616e+02, threshold=4.167e+02, percent-clipped=0.0 2023-10-03 15:23:27,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 15:23:27,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 15:23:29,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:23:29,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1314126.6666666667, ans=0.125 2023-10-03 15:23:34,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:23:35,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:23:36,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:23:36,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1314126.6666666667, ans=0.0 2023-10-03 15:23:39,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:23:39,485 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 15:23:40,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:23:40,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 15:23:41,704 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.43 vs. limit=15.0 2023-10-03 15:23:43,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:23:43,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:23:43,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:23:45,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:23:45,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 15:23:48,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 15:23:50,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:23:50,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:23:51,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:23:51,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:23:52,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1314193.3333333333, ans=0.0 2023-10-03 15:23:53,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:23:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:23:59,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:23:59,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:00,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 15:24:00,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:24:02,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:24:03,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:24:03,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:04,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:24:04,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 15:24:05,762 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.32 vs. limit=6.0 2023-10-03 15:24:11,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 15:24:12,791 INFO [train.py:1046] (3/4) Epoch 38, batch 600, loss[loss=0.156, simple_loss=0.2307, pruned_loss=0.04059, over 23692.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2384, pruned_loss=0.03977, over 4477412.52 frames. ], batch size: 232, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:24:13,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1314326.6666666667, ans=0.0 2023-10-03 15:24:14,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 15:24:15,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:24:17,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:24:17,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:24:23,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:24:25,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:24:27,225 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:24:28,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 15:24:30,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:24:33,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:24:33,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:33,306 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:24:36,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 15:24:36,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:24:41,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 15:24:41,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1314460.0, ans=0.125 2023-10-03 15:24:44,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:24:44,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:24:44,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:24:44,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1314460.0, ans=0.0 2023-10-03 15:24:44,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1314460.0, ans=0.2 2023-10-03 15:24:50,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:24:50,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:24:50,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:24:57,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:25:01,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:25:01,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:25:01,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1314526.6666666667, ans=0.125 2023-10-03 15:25:03,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:25:06,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1314526.6666666667, ans=0.125 2023-10-03 15:25:07,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 15:25:12,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:25:13,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:25:15,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1314593.3333333333, ans=0.09899494936611666 2023-10-03 15:25:18,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 15:25:20,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:25:21,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 15:25:21,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:25:23,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:25:25,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1314593.3333333333, ans=0.0 2023-10-03 15:25:27,776 INFO [train.py:1046] (3/4) Epoch 38, batch 650, loss[loss=0.1544, simple_loss=0.2243, pruned_loss=0.04226, over 23891.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2377, pruned_loss=0.03995, over 4530635.64 frames. ], batch size: 195, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:25:27,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 15:25:27,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:25:30,345 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=15.0 2023-10-03 15:25:31,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:25:32,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1314660.0, ans=15.0 2023-10-03 15:25:33,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:25:34,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:25:35,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 15:25:35,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:25:41,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:25:41,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:25:44,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:25:48,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 15:25:49,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:25:50,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:25:52,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:25:54,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 15:25:55,439 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.877e+02 2.025e+02 2.307e+02 3.864e+02, threshold=4.051e+02, percent-clipped=0.0 2023-10-03 15:25:56,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:25:58,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:25:59,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:25:59,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:01,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:26:03,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:26:03,067 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 15:26:03,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:26:03,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:26:07,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:08,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:26:08,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:26:09,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:26:10,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 15:26:11,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:26:11,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:26:11,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:26:12,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:26:14,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:26:16,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 15:26:17,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 15:26:17,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:19,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:26:19,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:26:19,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:26:20,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:26:26,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:26,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:26:26,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1314926.6666666667, ans=0.0 2023-10-03 15:26:28,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:26:31,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:26:31,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:26:31,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:26:36,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:26:37,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:26:37,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:26:38,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:26:42,487 INFO [train.py:1046] (3/4) Epoch 38, batch 700, loss[loss=0.1456, simple_loss=0.2209, pruned_loss=0.03514, over 23483.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2364, pruned_loss=0.0392, over 4581620.40 frames. ], batch size: 285, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:26:42,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 15:26:44,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 15:26:45,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 15:26:47,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:26:49,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:26:50,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 15:26:55,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:26:58,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:27:00,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:27:01,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:27:01,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:27:04,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:27:05,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 15:27:05,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:27:07,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 15:27:11,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 15:27:12,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:27:14,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:27:14,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1315126.6666666667, ans=0.125 2023-10-03 15:27:17,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:27:20,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:27:21,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 15:27:26,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:27:26,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:27:28,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 15:27:28,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1315193.3333333333, ans=0.0 2023-10-03 15:27:32,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:27:34,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:27:35,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:27:41,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:27:41,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 15:27:43,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 15:27:45,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 15:27:47,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:27:47,496 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.51 vs. limit=15.0 2023-10-03 15:27:49,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:27:50,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:27:53,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:27:53,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 15:27:57,698 INFO [train.py:1046] (3/4) Epoch 38, batch 750, loss[loss=0.1669, simple_loss=0.2533, pruned_loss=0.04022, over 23693.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2357, pruned_loss=0.03856, over 4614892.19 frames. ], batch size: 94, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:27:57,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 15:27:57,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 15:27:59,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 15:27:59,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 15:28:00,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 15:28:00,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:28:02,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 15:28:03,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:28:05,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:28:06,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:28:08,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:28:09,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:28:09,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:28:10,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:28:11,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:28:12,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:28:16,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:28:16,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:28:16,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 15:28:18,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:28:19,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:28:21,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:28:22,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:28:24,074 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.901e+02 2.100e+02 2.467e+02 3.467e+02, threshold=4.200e+02, percent-clipped=0.0 2023-10-03 15:28:24,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 15:28:24,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:28:25,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 15:28:27,068 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 15:28:27,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 15:28:27,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:28:28,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 15:28:31,078 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.93 vs. limit=10.0 2023-10-03 15:28:31,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:28:32,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1315460.0, ans=0.1 2023-10-03 15:28:37,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:28:37,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:28:38,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.79 vs. limit=15.0 2023-10-03 15:28:39,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:28:40,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:28:43,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:28:43,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 15:28:43,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:28:46,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 15:28:46,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:28:48,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:28:50,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 15:28:50,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:28:55,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:28:56,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:28:58,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:28:59,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:29:03,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 15:29:05,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:29:05,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:29:05,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1315593.3333333333, ans=0.0 2023-10-03 15:29:08,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:29:08,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:29:09,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:29:09,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:29:09,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1315660.0, ans=0.125 2023-10-03 15:29:11,389 INFO [train.py:1046] (3/4) Epoch 38, batch 800, loss[loss=0.1353, simple_loss=0.2077, pruned_loss=0.03143, over 24449.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2367, pruned_loss=0.03891, over 4643749.45 frames. ], batch size: 58, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:29:19,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:29:19,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:20,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:29:22,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:29:22,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:22,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:24,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:24,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1315726.6666666667, ans=0.125 2023-10-03 15:29:28,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:29:30,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:29:32,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 15:29:33,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:34,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:29:34,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:29:34,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:29:36,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 15:29:36,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:29:37,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 15:29:39,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:41,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:29:42,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:29:42,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1315793.3333333333, ans=0.0 2023-10-03 15:29:43,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:29:45,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:45,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:29:49,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:29:49,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:29:49,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 15:29:52,605 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 15:29:52,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 15:29:52,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:29:52,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:29:54,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:29:54,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:29:59,922 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 15:30:01,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 15:30:02,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:30:04,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:30:08,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:30:12,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:30:13,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 15:30:13,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:30:16,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 15:30:21,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:30:23,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:30:24,890 INFO [train.py:1046] (3/4) Epoch 38, batch 850, loss[loss=0.1643, simple_loss=0.2441, pruned_loss=0.04228, over 23292.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2378, pruned_loss=0.03939, over 4656197.56 frames. ], batch size: 93, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:30:24,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 15:30:25,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:30:26,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:30:27,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 15:30:27,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:30:29,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:30:29,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:30:31,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:30:33,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:30:34,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 15:30:34,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 15:30:34,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 15:30:34,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1315993.3333333333, ans=0.1 2023-10-03 15:30:35,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:30:35,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:30:38,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:30:40,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:30:40,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:30:44,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:30:46,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:30:46,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 15:30:47,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1316060.0, ans=0.1 2023-10-03 15:30:50,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 15:30:51,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:30:53,070 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.915e+02 2.103e+02 2.491e+02 3.805e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-03 15:30:53,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 15:30:58,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 15:30:59,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 15:30:59,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1316126.6666666667, ans=0.0 2023-10-03 15:31:02,024 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.20 vs. limit=22.5 2023-10-03 15:31:02,491 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 15:31:02,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:31:02,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:31:02,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 15:31:05,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:31:07,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:31:07,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 15:31:08,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:31:11,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:31:12,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:31:12,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:31:15,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:31:15,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 15:31:17,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 15:31:20,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:31:20,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:31:21,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:31:21,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:31:23,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:31:25,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:31:29,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:31:29,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:31:29,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:31:31,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:31:34,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1316260.0, ans=0.1 2023-10-03 15:31:38,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:31:40,035 INFO [train.py:1046] (3/4) Epoch 38, batch 900, loss[loss=0.1557, simple_loss=0.2284, pruned_loss=0.04153, over 23571.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2385, pruned_loss=0.03959, over 4677329.12 frames. ], batch size: 134, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:31:40,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:31:40,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 15:31:41,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:31:41,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:31:42,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 15:31:48,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:31:49,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1316326.6666666667, ans=0.0 2023-10-03 15:31:51,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:31:51,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 15:31:52,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1316326.6666666667, ans=0.1 2023-10-03 15:31:55,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:31:55,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 15:31:55,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 15:31:57,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:31:57,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:31:59,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:31:59,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:32:02,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.16 vs. limit=15.0 2023-10-03 15:32:07,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:07,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:32:08,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:32:10,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:32:14,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 15:32:15,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:32:20,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:32:20,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:32:20,346 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 15:32:21,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 15:32:27,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:32:27,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:32:29,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:32:35,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:37,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:32:37,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 15:32:37,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:32:41,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 15:32:41,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1316593.3333333333, ans=0.125 2023-10-03 15:32:42,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:32:42,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:42,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1316593.3333333333, ans=0.125 2023-10-03 15:32:45,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:32:45,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:32:50,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 15:32:50,152 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 15:32:51,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 15:32:51,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 15:32:54,239 INFO [train.py:1046] (3/4) Epoch 38, batch 950, loss[loss=0.1595, simple_loss=0.2445, pruned_loss=0.03726, over 24431.00 frames. ], tot_loss[loss=0.1597, simple_loss=0.2393, pruned_loss=0.04004, over 4682832.46 frames. ], batch size: 69, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:32:54,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:32:57,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 15:33:03,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:33:04,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:04,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:06,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:33:08,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1316726.6666666667, ans=0.125 2023-10-03 15:33:09,543 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 15:33:11,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:12,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:33:12,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:33:12,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:33:13,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 15:33:13,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:33:15,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:16,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 15:33:16,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:33:17,234 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.20 vs. limit=15.0 2023-10-03 15:33:19,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.33 vs. limit=6.0 2023-10-03 15:33:21,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:21,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:33:21,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:33:21,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1316726.6666666667, ans=0.125 2023-10-03 15:33:22,510 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.928e+02 2.082e+02 2.343e+02 3.541e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-03 15:33:23,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 15:33:25,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 15:33:28,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:33:30,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:33:34,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:33:34,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:33:39,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 15:33:42,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 15:33:42,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:33:42,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:33:42,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:42,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:33:46,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 15:33:46,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:33:49,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:33:50,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:33:50,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 15:33:50,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:33:50,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:33:51,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 15:33:54,046 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.53 vs. limit=15.0 2023-10-03 15:33:57,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:33:59,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:34:03,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1316926.6666666667, ans=0.125 2023-10-03 15:34:04,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:34:05,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 15:34:05,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 15:34:08,418 INFO [train.py:1046] (3/4) Epoch 38, batch 1000, loss[loss=0.1491, simple_loss=0.2372, pruned_loss=0.03049, over 24687.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2378, pruned_loss=0.03939, over 4685227.42 frames. ], batch size: 65, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:34:08,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:34:13,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 15:34:13,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:34:18,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:34:20,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 15:34:20,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 15:34:24,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:34:24,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:34:26,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:34:27,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 15:34:29,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 15:34:32,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 15:34:33,243 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.70 vs. limit=15.0 2023-10-03 15:34:33,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:34:35,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 15:34:36,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 15:34:36,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 15:34:38,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:34:38,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:34:45,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:34:47,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:34:47,822 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.60 vs. limit=15.0 2023-10-03 15:34:48,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:34:48,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:34:48,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 15:34:48,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:34:48,585 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:34:49,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:34:49,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:34:51,124 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 15:34:54,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 15:34:54,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1317193.3333333333, ans=0.125 2023-10-03 15:34:55,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 15:34:56,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 15:34:59,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:35:02,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.19 vs. limit=12.0 2023-10-03 15:35:05,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:05,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:35:06,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:06,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:35:07,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 15:35:08,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:35:09,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 15:35:11,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 15:35:11,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1317260.0, ans=0.1 2023-10-03 15:35:12,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:35:12,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:35:12,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1317260.0, ans=0.0 2023-10-03 15:35:13,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:35:15,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:35:16,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:35:20,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:35:22,126 INFO [train.py:1046] (3/4) Epoch 38, batch 1050, loss[loss=0.1639, simple_loss=0.2468, pruned_loss=0.04053, over 24005.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2367, pruned_loss=0.03956, over 4674394.73 frames. ], batch size: 80, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:35:22,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:35:25,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:35:25,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:26,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1317326.6666666667, ans=0.0 2023-10-03 15:35:27,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:35:30,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:35:30,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:35:34,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:35:34,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:35:34,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:35:37,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:35:37,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 15:35:39,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:35:39,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 15:35:41,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:35:41,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 15:35:41,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:35:47,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:35:48,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:35:48,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:35:49,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1317393.3333333333, ans=0.125 2023-10-03 15:35:50,134 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.904e+02 2.119e+02 2.413e+02 3.551e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-03 15:35:51,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 15:35:51,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 15:35:51,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:35:56,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 15:35:58,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 15:35:59,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:36:02,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 15:36:05,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 15:36:05,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:36:06,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:36:10,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:36:13,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 15:36:15,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 15:36:15,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 15:36:15,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:36:15,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:36:17,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 15:36:19,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:36:22,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:36:22,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:36:22,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:36:23,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:36:26,489 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.17 vs. limit=15.0 2023-10-03 15:36:29,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:36:29,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 15:36:32,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:36:32,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 15:36:33,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 15:36:33,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:36:36,025 INFO [train.py:1046] (3/4) Epoch 38, batch 1100, loss[loss=0.1411, simple_loss=0.2201, pruned_loss=0.03108, over 24352.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2364, pruned_loss=0.03906, over 4689758.41 frames. ], batch size: 56, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:36:38,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:36:39,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1317660.0, ans=0.0 2023-10-03 15:36:42,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:36:48,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:36:48,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:36:48,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:36:49,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 15:36:51,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:36:52,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 15:36:55,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:36:58,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:36:58,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 15:36:59,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 15:37:01,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:37:01,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:37:02,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.80 vs. limit=15.0 2023-10-03 15:37:03,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:37:04,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 15:37:10,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:37:13,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 15:37:14,967 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 15:37:15,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:17,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:19,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:37:19,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:37:19,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 15:37:20,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:37:20,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:37:20,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:37:20,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:20,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 15:37:26,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:37:27,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 15:37:29,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:37:29,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1317860.0, ans=0.1 2023-10-03 15:37:32,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:37:35,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 15:37:35,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 15:37:37,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:37:39,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:37:39,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:37:42,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 15:37:42,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:37:44,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:37:45,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 15:37:45,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:37:47,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 15:37:47,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1317926.6666666667, ans=0.1 2023-10-03 15:37:48,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:37:48,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:37:48,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:37:50,249 INFO [train.py:1046] (3/4) Epoch 38, batch 1150, loss[loss=0.156, simple_loss=0.2451, pruned_loss=0.03343, over 24489.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2377, pruned_loss=0.03916, over 4708120.69 frames. ], batch size: 66, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:37:53,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:37:55,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:37:56,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1317993.3333333333, ans=0.125 2023-10-03 15:37:57,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:37:57,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:37:57,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 15:37:57,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:37:59,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1317993.3333333333, ans=0.0 2023-10-03 15:38:00,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 15:38:02,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:38:02,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:38:07,236 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.53 vs. limit=12.0 2023-10-03 15:38:08,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 15:38:10,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:38:12,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:38:12,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1318060.0, ans=0.125 2023-10-03 15:38:14,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:38:14,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 15:38:14,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:38:14,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:38:18,536 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.952e+02 2.181e+02 2.480e+02 4.023e+02, threshold=4.362e+02, percent-clipped=0.0 2023-10-03 15:38:18,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 15:38:20,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:38:21,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1318126.6666666667, ans=0.125 2023-10-03 15:38:22,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:38:33,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:38:40,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:38:40,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 15:38:41,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:38:43,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:38:45,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1318193.3333333333, ans=0.1 2023-10-03 15:38:47,652 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 15:38:49,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:38:52,767 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.85 vs. limit=15.0 2023-10-03 15:38:55,141 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 15:38:59,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:00,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:39:00,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:39:01,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:39:03,910 INFO [train.py:1046] (3/4) Epoch 38, batch 1200, loss[loss=0.1616, simple_loss=0.247, pruned_loss=0.03809, over 23348.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.238, pruned_loss=0.03894, over 4715799.47 frames. ], batch size: 93, lr: 2.69e-03, grad_scale: 32.0 2023-10-03 15:39:05,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:39:08,098 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.77 vs. limit=22.5 2023-10-03 15:39:09,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:39:11,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:39:12,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:39:12,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:12,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:39:16,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:39:16,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:39:18,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:39:19,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:39:21,004 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 15:39:23,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 15:39:25,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:39:27,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:39:29,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:39:32,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:39:32,015 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 15:39:32,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:40,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 15:39:40,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:39:41,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 15:39:43,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:39:44,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 15:39:48,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 15:39:48,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:39:48,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1318526.6666666667, ans=0.125 2023-10-03 15:39:50,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:39:51,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:39:53,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:39:54,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:39:54,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:39:55,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:39:57,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 15:39:57,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:39:57,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:39:57,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:39:59,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:39:59,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:40:02,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1318593.3333333333, ans=0.125 2023-10-03 15:40:04,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:40:07,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:40:08,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 15:40:16,039 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 15:40:17,490 INFO [train.py:1046] (3/4) Epoch 38, batch 1250, loss[loss=0.1314, simple_loss=0.2109, pruned_loss=0.02598, over 24348.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2388, pruned_loss=0.03961, over 4715985.04 frames. ], batch size: 56, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:40:17,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:40:17,873 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:40:19,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:40:20,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:40:22,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:40:25,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 15:40:28,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1318660.0, ans=0.125 2023-10-03 15:40:29,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:40:29,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:40:30,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 15:40:32,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:40:33,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:40:37,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 15:40:39,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:40:40,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:40:40,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:40:42,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:40:45,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1318793.3333333333, ans=0.125 2023-10-03 15:40:47,224 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.889e+02 2.073e+02 2.333e+02 3.437e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 15:40:47,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 15:40:47,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:40:47,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:40:49,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:40:49,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1318793.3333333333, ans=0.0 2023-10-03 15:40:50,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:40:51,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1318793.3333333333, ans=0.07 2023-10-03 15:40:52,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:40:52,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:40:58,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 15:40:58,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:41:00,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:41:02,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 15:41:02,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:41:02,223 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 15:41:02,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:02,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:02,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1318860.0, ans=0.0 2023-10-03 15:41:06,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:41:10,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:41:10,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:41:13,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 15:41:13,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 15:41:13,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 15:41:15,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:41:16,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 15:41:16,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:18,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 15:41:18,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:41:21,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 15:41:22,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 15:41:22,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:41:22,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 15:41:22,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1318926.6666666667, ans=0.125 2023-10-03 15:41:23,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:41:25,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 15:41:27,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:41:27,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:41:29,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:41:32,045 INFO [train.py:1046] (3/4) Epoch 38, batch 1300, loss[loss=0.1733, simple_loss=0.2616, pruned_loss=0.04245, over 24566.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2395, pruned_loss=0.03991, over 4713991.50 frames. ], batch size: 71, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:41:33,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:41:36,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:41:36,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 15:41:38,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-10-03 15:41:41,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:41:42,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 15:41:43,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:41:45,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:41:46,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:41:46,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 15:41:47,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.29 vs. limit=15.0 2023-10-03 15:41:52,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:41:53,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:41:54,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 15:41:57,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:42:01,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:01,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:42:02,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:42:04,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:04,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:42:05,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 15:42:07,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 15:42:11,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:42:13,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:42:14,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 15:42:14,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 15:42:17,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:42:18,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:42:20,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 15:42:20,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:42:20,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 15:42:21,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1319193.3333333333, ans=0.2 2023-10-03 15:42:22,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:42:26,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:42:26,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:42:29,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 15:42:30,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 15:42:32,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 15:42:32,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1319260.0, ans=0.0 2023-10-03 15:42:36,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:42:39,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 15:42:39,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:45,105 INFO [train.py:1046] (3/4) Epoch 38, batch 1350, loss[loss=0.1584, simple_loss=0.2183, pruned_loss=0.04922, over 19411.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2383, pruned_loss=0.03981, over 4709100.32 frames. ], batch size: 388, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:42:46,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 15:42:49,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:42:51,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:42:54,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:42:54,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:42:55,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:42:55,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:42:56,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1319326.6666666667, ans=0.95 2023-10-03 15:43:00,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:43:01,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 15:43:03,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:43:03,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:43:05,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1319393.3333333333, ans=0.0 2023-10-03 15:43:06,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 15:43:07,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:43:09,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:43:09,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 15:43:11,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 15:43:13,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 15:43:14,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:43:14,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 15:43:16,098 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.825e+02 1.970e+02 2.155e+02 2.948e+02, threshold=3.940e+02, percent-clipped=0.0 2023-10-03 15:43:27,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:43:32,385 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.36 vs. limit=15.0 2023-10-03 15:43:34,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:43:34,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:43:36,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 15:43:39,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:43:40,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 15:43:40,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 15:43:41,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:43:43,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:43:44,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 15:43:46,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:43:52,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 15:43:52,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1319593.3333333333, ans=0.1 2023-10-03 15:43:54,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 15:43:57,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1319593.3333333333, ans=0.0 2023-10-03 15:43:59,519 INFO [train.py:1046] (3/4) Epoch 38, batch 1400, loss[loss=0.1496, simple_loss=0.2396, pruned_loss=0.02981, over 24637.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2373, pruned_loss=0.03929, over 4715707.04 frames. ], batch size: 68, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:43:59,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 15:44:00,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:44:03,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:44:05,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:44:10,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 15:44:10,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 15:44:11,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=1319660.0, ans=0.02 2023-10-03 15:44:18,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1319726.6666666667, ans=0.125 2023-10-03 15:44:21,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:44:25,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:44:27,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:44:27,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 15:44:29,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:44:30,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 15:44:40,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:44:40,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:44:44,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 15:44:44,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:44:45,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:44:47,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:44:47,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:44:47,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1319860.0, ans=0.0 2023-10-03 15:44:48,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:44:48,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:44:49,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:44:51,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 15:44:51,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:44:56,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:44:59,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:45:02,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1319926.6666666667, ans=0.1 2023-10-03 15:45:06,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 15:45:06,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 15:45:07,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:45:10,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 15:45:10,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:45:12,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:45:13,313 INFO [train.py:1046] (3/4) Epoch 38, batch 1450, loss[loss=0.1442, simple_loss=0.2263, pruned_loss=0.0311, over 24618.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2366, pruned_loss=0.03915, over 4708482.65 frames. ], batch size: 60, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:45:18,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:45:19,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:45:19,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:19,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 15:45:24,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:45:26,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:45:29,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:45:29,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 15:45:29,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:45:29,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1320060.0, ans=0.2 2023-10-03 15:45:30,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 15:45:30,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:31,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:31,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 15:45:33,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:45:33,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:45:34,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 15:45:34,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:34,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:45:36,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:37,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:40,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:45:40,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:45:42,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:45:43,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:45,229 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.843e+02 2.021e+02 2.311e+02 3.468e+02, threshold=4.043e+02, percent-clipped=0.0 2023-10-03 15:45:45,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:45:45,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:45:46,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:45:46,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:45:49,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 15:45:52,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:45:57,312 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 15:45:58,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:45:58,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:46:00,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:46:02,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 15:46:06,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:46:08,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 15:46:09,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 15:46:11,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:46:14,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:46:14,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:46:14,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1320260.0, ans=0.05 2023-10-03 15:46:14,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1320260.0, ans=0.125 2023-10-03 15:46:17,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 15:46:19,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 15:46:19,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 15:46:20,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:46:20,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:46:27,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1320326.6666666667, ans=0.0 2023-10-03 15:46:28,103 INFO [train.py:1046] (3/4) Epoch 38, batch 1500, loss[loss=0.1636, simple_loss=0.2421, pruned_loss=0.04256, over 23237.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.237, pruned_loss=0.03897, over 4714607.46 frames. ], batch size: 105, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:46:32,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 15:46:32,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:46:32,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:46:35,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:46:35,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:46:36,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:46:37,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 15:46:39,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:46:39,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:46:39,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:46:40,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:46:41,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:46:43,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:46:49,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:46:49,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 15:46:49,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:46:50,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:46:50,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:46:54,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 15:46:59,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 15:47:01,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:47:02,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 15:47:04,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:47:05,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:47:05,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:47:06,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:47:07,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 15:47:08,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:47:08,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:47:09,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 15:47:09,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:47:13,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:47:13,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 15:47:19,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 15:47:20,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:47:22,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1320526.6666666667, ans=0.125 2023-10-03 15:47:25,997 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 15:47:26,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1320593.3333333333, ans=0.125 2023-10-03 15:47:27,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:27,848 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 15:47:28,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1320593.3333333333, ans=0.0 2023-10-03 15:47:29,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:47:31,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:47:31,153 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 15:47:32,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 15:47:35,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 15:47:36,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:39,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:47:40,799 INFO [train.py:1046] (3/4) Epoch 38, batch 1550, loss[loss=0.1648, simple_loss=0.249, pruned_loss=0.04029, over 24008.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2375, pruned_loss=0.03903, over 4719750.78 frames. ], batch size: 80, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:47:40,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:40,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:47:40,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:47:40,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:47:43,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 15:47:43,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 15:47:43,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:47:44,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 15:47:45,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 15:47:48,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:47:48,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:47:49,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:47:49,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:47:51,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:47:51,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:47:54,099 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 15:47:54,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:47:54,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:47:55,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 15:47:57,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:47:57,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 15:47:59,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:47:59,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 15:48:00,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 15:48:00,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 15:48:00,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:48:03,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1320726.6666666667, ans=0.125 2023-10-03 15:48:03,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1320726.6666666667, ans=0.125 2023-10-03 15:48:08,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:48:11,708 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.894e+02 2.092e+02 2.413e+02 3.361e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-03 15:48:11,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 15:48:11,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 15:48:15,395 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.30 vs. limit=15.0 2023-10-03 15:48:17,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1320793.3333333333, ans=0.125 2023-10-03 15:48:18,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:48:22,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:48:24,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 15:48:24,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:48:24,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 15:48:24,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1320860.0, ans=0.125 2023-10-03 15:48:31,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 15:48:31,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:34,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:48:37,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:48:37,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:48:37,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 15:48:38,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:48:40,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:48:40,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:41,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 15:48:41,649 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 15:48:41,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:48:44,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1320926.6666666667, ans=0.0 2023-10-03 15:48:48,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 15:48:52,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:48:54,719 INFO [train.py:1046] (3/4) Epoch 38, batch 1600, loss[loss=0.1666, simple_loss=0.2392, pruned_loss=0.047, over 23810.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2379, pruned_loss=0.03925, over 4721688.20 frames. ], batch size: 195, lr: 2.69e-03, grad_scale: 16.0 2023-10-03 15:48:54,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:48:54,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 15:48:56,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:48:57,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:48:57,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:48:57,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:48:57,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:49:02,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:02,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 15:49:04,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 15:49:05,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 15:49:05,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1320993.3333333333, ans=10.0 2023-10-03 15:49:08,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:49:09,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 15:49:11,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:49:12,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:49:17,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:49:19,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 15:49:21,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:49:21,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 15:49:22,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:22,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 15:49:26,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1321126.6666666667, ans=0.07 2023-10-03 15:49:27,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 15:49:36,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:49:37,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 15:49:37,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:49:38,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:49:38,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:49:40,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 15:49:44,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 15:49:46,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:49:47,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:47,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:49:48,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 15:49:51,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:49:51,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:49:53,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:49:59,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:50:01,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:50:02,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 15:50:02,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:50:05,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 15:50:08,133 INFO [train.py:1046] (3/4) Epoch 38, batch 1650, loss[loss=0.1922, simple_loss=0.2684, pruned_loss=0.05802, over 19654.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2385, pruned_loss=0.03964, over 4705613.33 frames. ], batch size: 389, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:50:11,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:50:11,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:50:12,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:50:12,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 15:50:12,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 15:50:12,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 15:50:12,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 15:50:15,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:50:17,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:50:17,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:50:17,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:50:19,256 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.64 vs. limit=15.0 2023-10-03 15:50:19,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:50:19,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1321326.6666666667, ans=0.1 2023-10-03 15:50:22,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 15:50:24,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:50:24,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:50:24,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:50:24,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:50:25,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 15:50:25,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 15:50:25,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1321393.3333333333, ans=0.2 2023-10-03 15:50:31,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:50:34,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 15:50:39,374 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.05 vs. limit=15.0 2023-10-03 15:50:41,111 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.937e+02 2.128e+02 2.357e+02 3.873e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-03 15:50:43,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 15:50:45,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:50:46,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 15:50:49,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:50:51,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:50:51,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:50:51,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1321526.6666666667, ans=0.0 2023-10-03 15:50:52,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:50:52,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:50:54,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:50:55,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:50:55,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:50:55,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:50:55,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:50:57,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:50:57,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:51:02,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:51:02,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 15:51:05,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:51:06,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 15:51:06,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 15:51:06,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 15:51:06,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:51:08,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:51:08,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:51:09,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:51:09,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 15:51:09,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1321593.3333333333, ans=0.125 2023-10-03 15:51:12,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:51:15,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:51:15,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:51:18,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 15:51:18,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1321593.3333333333, ans=0.0 2023-10-03 15:51:22,526 INFO [train.py:1046] (3/4) Epoch 38, batch 1700, loss[loss=0.1487, simple_loss=0.2326, pruned_loss=0.0324, over 24262.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2385, pruned_loss=0.03991, over 4702260.45 frames. ], batch size: 61, lr: 2.69e-03, grad_scale: 8.0 2023-10-03 15:51:22,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:51:22,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:51:22,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 15:51:23,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:51:23,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:51:23,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:51:25,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:51:25,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:51:25,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 15:51:28,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 15:51:38,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:51:40,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:51:46,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:51:46,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:51:46,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:51:47,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:51:50,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 15:51:51,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:51:51,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:51:53,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 15:51:55,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 15:51:56,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 15:51:56,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 15:51:58,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:51:58,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1321793.3333333333, ans=0.0 2023-10-03 15:52:01,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 15:52:02,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:52:04,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1321793.3333333333, ans=0.07 2023-10-03 15:52:11,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:52:11,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:13,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:52:14,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 15:52:14,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 15:52:14,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:52:17,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:52:17,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 15:52:17,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:52:17,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:52:17,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:52:17,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1321860.0, ans=0.0 2023-10-03 15:52:18,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:52:20,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:52:20,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:52:21,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:21,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:52:21,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:52:23,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1321926.6666666667, ans=0.125 2023-10-03 15:52:26,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:52:27,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 15:52:29,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:52:31,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:52:34,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 15:52:37,193 INFO [train.py:1046] (3/4) Epoch 38, batch 1750, loss[loss=0.1581, simple_loss=0.2412, pruned_loss=0.03755, over 23399.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2375, pruned_loss=0.03963, over 4709592.93 frames. ], batch size: 93, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:52:40,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:41,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:52:41,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 15:52:43,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 15:52:43,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:52:44,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1321993.3333333333, ans=0.09899494936611666 2023-10-03 15:52:46,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:52:46,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:52:46,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1321993.3333333333, ans=0.2 2023-10-03 15:52:50,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 15:52:53,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:52:54,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 15:52:55,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1322060.0, ans=0.0 2023-10-03 15:52:56,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:52:56,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:52:57,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1322060.0, ans=0.0 2023-10-03 15:53:00,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 15:53:00,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 15:53:03,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:53:03,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 15:53:09,931 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.868e+02 2.023e+02 2.283e+02 3.172e+02, threshold=4.046e+02, percent-clipped=0.0 2023-10-03 15:53:12,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:53:15,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:53:15,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:53:17,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.42 vs. limit=22.5 2023-10-03 15:53:18,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:19,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:53:19,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:53:21,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:23,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1322193.3333333333, ans=0.2 2023-10-03 15:53:24,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:53:24,603 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.06 vs. limit=12.0 2023-10-03 15:53:25,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:53:26,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 15:53:29,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:53:32,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 15:53:32,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:53:32,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1322193.3333333333, ans=0.0 2023-10-03 15:53:34,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:53:35,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:53:38,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:53:38,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 15:53:38,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:40,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1322260.0, ans=0.125 2023-10-03 15:53:41,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:53:44,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:53:45,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:53:47,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:53:48,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 15:53:48,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:53:48,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 15:53:48,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:53:48,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 15:53:50,050 INFO [train.py:1046] (3/4) Epoch 38, batch 1800, loss[loss=0.1421, simple_loss=0.1938, pruned_loss=0.04515, over 19027.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2372, pruned_loss=0.03938, over 4703190.43 frames. ], batch size: 388, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:53:50,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:53:50,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:53:52,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 15:53:53,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:53:56,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 15:53:58,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:54:02,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 15:54:02,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:54:05,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:54:08,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:54:10,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:54:11,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:54:11,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1322393.3333333333, ans=0.0 2023-10-03 15:54:12,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 15:54:12,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 15:54:12,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:54:13,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1322393.3333333333, ans=0.0 2023-10-03 15:54:16,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:54:19,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 15:54:22,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 15:54:23,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 15:54:23,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:54:25,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:54:25,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:54:25,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 15:54:30,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1322460.0, ans=0.125 2023-10-03 15:54:31,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1322460.0, ans=0.125 2023-10-03 15:54:32,844 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 15:54:34,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:54:36,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:54:37,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 15:54:37,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 15:54:37,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 15:54:38,584 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.97 vs. limit=15.0 2023-10-03 15:54:39,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:54:41,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 15:54:41,306 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 15:54:41,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1322526.6666666667, ans=0.0 2023-10-03 15:54:46,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 15:54:52,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:54:53,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 15:54:53,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:54:53,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:54:55,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:54:56,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 15:54:59,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:54:59,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:55:02,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 15:55:02,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:55:03,819 INFO [train.py:1046] (3/4) Epoch 38, batch 1850, loss[loss=0.1577, simple_loss=0.2444, pruned_loss=0.0355, over 24414.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.237, pruned_loss=0.03925, over 4711602.22 frames. ], batch size: 77, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:55:03,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:55:03,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:55:05,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:55:06,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:55:07,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 15:55:08,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:55:08,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:55:12,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:55:12,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:55:16,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1322660.0, ans=0.125 2023-10-03 15:55:17,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:55:17,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 15:55:21,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 15:55:22,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1322726.6666666667, ans=0.125 2023-10-03 15:55:23,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 15:55:26,134 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.02 vs. limit=12.0 2023-10-03 15:55:27,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:55:27,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 15:55:27,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 15:55:36,365 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.902e+02 2.104e+02 2.357e+02 3.020e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 15:55:38,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 15:55:39,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 15:55:42,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:55:42,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:55:46,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 15:55:47,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:55:47,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:55:48,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:55:50,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 15:55:50,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1322860.0, ans=0.2 2023-10-03 15:55:52,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.69 vs. limit=15.0 2023-10-03 15:55:53,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:55:56,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 15:55:57,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:55:57,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 15:55:57,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:55:59,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:56:00,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 15:56:03,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1322926.6666666667, ans=0.125 2023-10-03 15:56:04,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 15:56:04,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:56:07,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.54 vs. limit=6.0 2023-10-03 15:56:08,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 15:56:09,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 15:56:09,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 15:56:09,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 15:56:11,447 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 15:56:12,805 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 15:56:15,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:56:15,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:56:15,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:56:15,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:15,544 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 15:56:15,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:56:15,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:17,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 15:56:18,356 INFO [train.py:1046] (3/4) Epoch 38, batch 1900, loss[loss=0.1531, simple_loss=0.244, pruned_loss=0.03109, over 24304.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2378, pruned_loss=0.03921, over 4719505.21 frames. ], batch size: 74, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:56:18,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:56:19,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:56:19,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 15:56:21,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:21,255 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 15:56:21,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 15:56:22,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:56:28,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:56:30,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 15:56:31,436 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 15:56:31,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 15:56:32,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 15:56:32,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:56:33,009 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 15:56:34,335 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 15:56:34,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1323060.0, ans=0.025 2023-10-03 15:56:37,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 15:56:39,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:56:42,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 15:56:43,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 15:56:48,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=1323126.6666666667, ans=15.0 2023-10-03 15:56:52,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 15:56:55,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 15:56:55,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:56:57,063 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 15:56:57,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 15:56:57,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 15:56:58,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 15:56:58,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:02,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 15:57:05,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 15:57:07,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:57:07,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 15:57:08,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 15:57:13,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 15:57:13,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:57:18,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 15:57:18,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:57:20,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:57:21,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:57:22,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 15:57:22,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 15:57:23,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1323260.0, ans=0.125 2023-10-03 15:57:24,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 15:57:27,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:57:27,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:57:28,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 15:57:28,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:57:30,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 15:57:31,652 INFO [train.py:1046] (3/4) Epoch 38, batch 1950, loss[loss=0.1506, simple_loss=0.2333, pruned_loss=0.03389, over 24474.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2392, pruned_loss=0.03974, over 4719478.82 frames. ], batch size: 66, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:57:31,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:57:34,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:57:35,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.79 vs. limit=12.0 2023-10-03 15:57:37,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 15:57:37,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:57:37,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:57:38,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1323326.6666666667, ans=0.125 2023-10-03 15:57:40,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 15:57:40,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 15:57:40,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:57:42,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:57:45,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:57:46,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:57:46,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:48,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:57:52,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 15:57:52,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 15:57:52,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 15:57:52,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:56,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:57:59,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 15:57:59,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:57:59,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 15:57:59,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 15:58:01,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:58:01,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:58:02,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:58:04,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:58:05,335 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.744e+02 1.976e+02 2.296e+02 2.556e+02 3.551e+02, threshold=4.593e+02, percent-clipped=0.0 2023-10-03 15:58:06,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:58:10,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 15:58:13,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:58:15,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:58:15,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 15:58:15,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:58:15,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1323526.6666666667, ans=0.125 2023-10-03 15:58:18,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:58:19,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 15:58:20,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:58:28,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:28,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:29,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1323593.3333333333, ans=0.1 2023-10-03 15:58:32,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:32,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1323593.3333333333, ans=10.0 2023-10-03 15:58:33,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:58:37,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 15:58:37,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 15:58:37,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 15:58:37,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 15:58:39,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:58:41,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 15:58:44,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:58:44,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1323660.0, ans=0.125 2023-10-03 15:58:45,751 INFO [train.py:1046] (3/4) Epoch 38, batch 2000, loss[loss=0.1513, simple_loss=0.24, pruned_loss=0.03124, over 24447.00 frames. ], tot_loss[loss=0.1601, simple_loss=0.2403, pruned_loss=0.03994, over 4728015.96 frames. ], batch size: 66, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 15:58:47,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 15:58:47,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 15:58:49,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:58:51,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 15:58:53,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:58:54,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 15:58:56,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 15:58:57,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 15:59:00,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 15:59:02,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 15:59:02,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 15:59:04,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 15:59:06,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 15:59:07,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:08,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:08,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:10,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 15:59:10,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 15:59:12,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 15:59:12,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:59:16,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:59:18,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 15:59:18,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:18,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:59:20,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:59:21,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 15:59:24,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 15:59:24,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 15:59:24,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:59:25,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1323793.3333333333, ans=0.125 2023-10-03 15:59:30,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:59:31,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 15:59:31,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:59:33,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 15:59:34,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 15:59:35,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:59:36,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 15:59:36,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 15:59:37,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:40,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 15:59:40,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 15:59:41,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1323860.0, ans=0.1 2023-10-03 15:59:43,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1323926.6666666667, ans=0.2 2023-10-03 15:59:46,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 15:59:46,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1323926.6666666667, ans=0.125 2023-10-03 15:59:47,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:59:50,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 15:59:51,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 15:59:53,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:56,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 15:59:56,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 15:59:57,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 15:59:57,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 15:59:59,297 INFO [train.py:1046] (3/4) Epoch 38, batch 2050, loss[loss=0.1577, simple_loss=0.2414, pruned_loss=0.03695, over 23338.00 frames. ], tot_loss[loss=0.1598, simple_loss=0.2395, pruned_loss=0.04002, over 4731734.62 frames. ], batch size: 93, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 15:59:59,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:00:01,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:00:04,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:00:05,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:00:09,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:00:10,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:00:11,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:00:13,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:00:14,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 16:00:15,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:00:16,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:00:16,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:00:25,228 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:00:26,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:00:26,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:00:27,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 16:00:29,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:00:29,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 16:00:30,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1324126.6666666667, ans=0.0 2023-10-03 16:00:31,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:00:31,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1324126.6666666667, ans=0.1 2023-10-03 16:00:32,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:00:35,355 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.904e+02 2.086e+02 2.285e+02 3.176e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-03 16:00:35,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:00:36,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:00:38,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:00:39,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:00:39,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:00:40,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:00:44,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:00:46,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:00:47,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1324193.3333333333, ans=0.0 2023-10-03 16:00:50,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:00:50,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:00:53,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:01:00,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:01:00,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 16:01:05,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:01:06,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:01:07,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:01:09,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 16:01:13,247 INFO [train.py:1046] (3/4) Epoch 38, batch 2100, loss[loss=0.1454, simple_loss=0.2272, pruned_loss=0.03185, over 23188.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2378, pruned_loss=0.03951, over 4716022.04 frames. ], batch size: 93, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:01:13,988 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 16:01:13,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:01:15,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:01:15,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:01:16,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:01:16,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 16:01:16,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 16:01:18,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:01:23,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:01:23,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:01:24,078 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.33 vs. limit=15.0 2023-10-03 16:01:26,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:01:26,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:01:27,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 16:01:27,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:01:28,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 16:01:28,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 16:01:30,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:01:30,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:01:30,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 16:01:32,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:01:33,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1324393.3333333333, ans=0.2 2023-10-03 16:01:33,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1324393.3333333333, ans=0.125 2023-10-03 16:01:37,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 16:01:37,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:01:40,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:01:41,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:01:43,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:01:45,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 16:01:45,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:01:45,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 16:01:46,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1324460.0, ans=0.2 2023-10-03 16:01:48,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 16:01:49,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:01:49,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 16:01:49,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 16:01:51,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 16:01:52,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:01:53,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:01:54,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1324460.0, ans=0.125 2023-10-03 16:01:55,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:01:57,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:01:58,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:01:58,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:01:58,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 16:01:58,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:02:00,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:02:00,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:00,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 16:02:02,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 16:02:03,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 16:02:06,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:02:09,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:02:09,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 16:02:10,019 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.64 vs. limit=22.5 2023-10-03 16:02:16,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:02:18,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:02:19,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:02:19,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:02:19,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 16:02:19,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:02:22,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:02:22,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:02:23,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:02:23,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:25,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 16:02:26,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 16:02:26,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:02:28,744 INFO [train.py:1046] (3/4) Epoch 38, batch 2150, loss[loss=0.1324, simple_loss=0.214, pruned_loss=0.02538, over 24637.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2368, pruned_loss=0.03951, over 4708802.11 frames. ], batch size: 60, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:02:30,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:02:30,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:02:30,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:02:31,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:02:36,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 16:02:38,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:02:40,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:41,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:02:41,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:42,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:02:43,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1324726.6666666667, ans=0.125 2023-10-03 16:02:45,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:02:46,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:02:46,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:02:50,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:50,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 16:02:56,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:02:56,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:02:57,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:57,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:02:59,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:02:59,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:03:00,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:03:00,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:03:00,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:03:02,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 16:03:04,124 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.852e+02 2.034e+02 2.219e+02 3.109e+02, threshold=4.067e+02, percent-clipped=0.0 2023-10-03 16:03:04,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:03:05,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:03:05,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:03:06,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:03:08,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:03:10,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:03:10,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:03:12,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:03:12,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 16:03:12,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:03:15,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:03:17,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:17,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:03:18,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:03:19,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:19,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:19,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 16:03:21,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1324860.0, ans=0.125 2023-10-03 16:03:23,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 16:03:23,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:03:24,554 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 16:03:25,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:25,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:03:26,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 16:03:27,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:03:27,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 16:03:27,302 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 16:03:27,302 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 16:03:27,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 16:03:28,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:30,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:03:30,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:03:31,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:32,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:03:34,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:34,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:42,158 INFO [train.py:1046] (3/4) Epoch 38, batch 2200, loss[loss=0.1585, simple_loss=0.2354, pruned_loss=0.04087, over 23739.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2366, pruned_loss=0.03921, over 4717862.58 frames. ], batch size: 232, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:03:42,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:03:42,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 16:03:45,897 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-10-03 16:03:46,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:03:49,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:03:49,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:03:49,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:03:51,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:03:53,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:03:54,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:03:54,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 16:03:58,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 16:04:01,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:04:06,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 16:04:06,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1325060.0, ans=0.125 2023-10-03 16:04:10,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:04:11,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:04:12,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1325126.6666666667, ans=0.0 2023-10-03 16:04:13,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:04:13,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1325126.6666666667, ans=0.125 2023-10-03 16:04:16,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:04:18,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 16:04:21,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:04:21,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:04:22,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 16:04:24,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:04:27,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:04:27,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:04:28,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:04:31,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 16:04:32,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:04:32,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 16:04:32,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1325193.3333333333, ans=0.0 2023-10-03 16:04:36,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:04:36,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:04:36,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:04:37,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:04:39,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:04:39,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:04:39,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:04:39,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 16:04:40,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:04:42,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:04:45,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 16:04:45,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1325260.0, ans=0.5 2023-10-03 16:04:47,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:04:48,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:04:51,781 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 16:04:53,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:04:53,187 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 16:04:54,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:04:55,924 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 16:04:57,207 INFO [train.py:1046] (3/4) Epoch 38, batch 2250, loss[loss=0.1414, simple_loss=0.2187, pruned_loss=0.03207, over 24303.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2374, pruned_loss=0.03954, over 4711691.09 frames. ], batch size: 56, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:04:57,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:04:58,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:05:00,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:05:00,347 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 16:05:01,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:05:04,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:05:07,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:05:10,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:05:15,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:05:15,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:05:16,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:05:19,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 16:05:19,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:05:19,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:05:20,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1325393.3333333333, ans=0.0 2023-10-03 16:05:21,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 16:05:23,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:05:23,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:05:25,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:05:28,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:05:28,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:05:30,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:05:31,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 16:05:31,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1325460.0, ans=0.125 2023-10-03 16:05:32,779 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.873e+02 2.067e+02 2.203e+02 2.954e+02, threshold=4.134e+02, percent-clipped=0.0 2023-10-03 16:05:32,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:05:36,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:05:38,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:05:40,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1325526.6666666667, ans=0.2 2023-10-03 16:05:41,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:05:41,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:05:41,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:05:44,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:05:46,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:05:50,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:05:51,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1325526.6666666667, ans=0.1 2023-10-03 16:05:51,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1325526.6666666667, ans=0.125 2023-10-03 16:05:51,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1325526.6666666667, ans=0.025 2023-10-03 16:05:54,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:05:59,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:05:59,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:05:59,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:06:05,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:06:06,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:06:06,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 16:06:06,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:06:08,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:06:09,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 16:06:11,260 INFO [train.py:1046] (3/4) Epoch 38, batch 2300, loss[loss=0.1689, simple_loss=0.2469, pruned_loss=0.04546, over 23473.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2386, pruned_loss=0.03979, over 4714670.50 frames. ], batch size: 285, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:06:12,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:06:14,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:06:19,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:06:19,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:06:22,103 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 16:06:23,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:06:25,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1325726.6666666667, ans=0.0 2023-10-03 16:06:29,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:06:29,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:06:29,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:06:30,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:06:30,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 16:06:31,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:06:33,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:06:33,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:06:39,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:06:41,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:06:43,167 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.47 vs. limit=15.0 2023-10-03 16:06:44,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:06:45,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1325793.3333333333, ans=0.125 2023-10-03 16:06:50,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:06:50,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:06:53,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:06:56,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:07:00,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:07:00,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:07:01,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:07:01,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 16:07:05,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:07:05,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:07:05,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:07:05,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:07:05,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:07:05,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 16:07:05,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:07:05,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1325860.0, ans=0.1 2023-10-03 16:07:06,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 16:07:06,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:07:06,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:07:06,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 16:07:07,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1325860.0, ans=0.125 2023-10-03 16:07:15,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:07:17,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.99 vs. limit=15.0 2023-10-03 16:07:18,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:07:22,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:07:22,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:07:22,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:07:24,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:07:24,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:07:26,602 INFO [train.py:1046] (3/4) Epoch 38, batch 2350, loss[loss=0.1642, simple_loss=0.2458, pruned_loss=0.04131, over 23343.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2397, pruned_loss=0.0402, over 4711044.56 frames. ], batch size: 93, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:07:26,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:07:27,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 16:07:28,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1325993.3333333333, ans=0.0 2023-10-03 16:07:35,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:07:35,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 16:07:39,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 16:07:43,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:07:46,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:07:46,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:07:46,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:07:47,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:07:47,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 16:07:51,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:07:55,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 16:07:57,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:07:57,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1326126.6666666667, ans=0.125 2023-10-03 16:08:02,027 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.917e+02 2.142e+02 2.413e+02 3.614e+02, threshold=4.285e+02, percent-clipped=0.0 2023-10-03 16:08:02,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:08:02,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:08:03,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:08:04,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 16:08:04,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:08:07,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:08:07,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:08:07,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1326126.6666666667, ans=0.125 2023-10-03 16:08:09,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:08:10,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:08:13,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 16:08:13,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:08:15,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1326193.3333333333, ans=0.125 2023-10-03 16:08:16,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:08:16,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:08:18,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 16:08:19,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:08:21,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1326193.3333333333, ans=0.0 2023-10-03 16:08:23,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 16:08:23,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:08:23,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1326193.3333333333, ans=0.025 2023-10-03 16:08:27,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 16:08:28,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 16:08:30,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:08:30,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:08:30,751 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 16:08:31,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 16:08:34,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 16:08:34,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1326260.0, ans=0.125 2023-10-03 16:08:36,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:08:40,331 INFO [train.py:1046] (3/4) Epoch 38, batch 2400, loss[loss=0.1506, simple_loss=0.2322, pruned_loss=0.03449, over 24647.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2391, pruned_loss=0.04001, over 4706708.44 frames. ], batch size: 68, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:08:40,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:08:40,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1326326.6666666667, ans=0.07 2023-10-03 16:08:43,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:08:46,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:08:47,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 16:08:48,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 16:08:53,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1326326.6666666667, ans=0.125 2023-10-03 16:08:54,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:08:54,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:08:56,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 16:08:58,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:08:58,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1326393.3333333333, ans=0.125 2023-10-03 16:08:59,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:08:59,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 16:09:04,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:07,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 16:09:09,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:09:14,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 16:09:16,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:09:17,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:18,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.61 vs. limit=15.0 2023-10-03 16:09:19,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1326460.0, ans=0.5 2023-10-03 16:09:22,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:09:22,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 16:09:22,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:09:22,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1326460.0, ans=0.0 2023-10-03 16:09:29,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:09:29,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1326526.6666666667, ans=0.0 2023-10-03 16:09:32,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:09:35,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:09:36,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:09:36,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:09:36,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:09:36,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:09:37,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:09:37,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:09:41,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:09:42,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:09:42,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 16:09:44,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 16:09:47,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:09:47,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:09:49,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 16:09:49,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 16:09:49,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 16:09:49,241 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 16:09:51,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 16:09:53,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:09:54,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:55,241 INFO [train.py:1046] (3/4) Epoch 38, batch 2450, loss[loss=0.1577, simple_loss=0.2277, pruned_loss=0.04383, over 23741.00 frames. ], tot_loss[loss=0.158, simple_loss=0.237, pruned_loss=0.03952, over 4695296.46 frames. ], batch size: 164, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:09:55,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:09:56,974 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 16:09:57,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:09:58,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:09:58,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1326660.0, ans=0.125 2023-10-03 16:10:01,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:10:01,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:10:03,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.57 vs. limit=10.0 2023-10-03 16:10:06,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:06,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:10:07,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 16:10:11,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:10:11,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:16,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:10:16,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:10:16,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:10:16,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 16:10:20,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:22,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:10:23,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:10:26,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:10:26,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:10:28,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:10:29,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:10:31,033 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.921e+02 2.165e+02 2.566e+02 3.578e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-03 16:10:31,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 16:10:32,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:10:35,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1326793.3333333333, ans=0.1 2023-10-03 16:10:38,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:10:40,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:10:41,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:10:41,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:10:42,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:10:42,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:10:44,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 16:10:47,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:10:47,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:10:47,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1326860.0, ans=0.5 2023-10-03 16:10:50,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:10:50,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:10:55,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:10:55,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 16:10:57,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:10:58,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:10:58,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 16:10:59,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:11:01,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:11:03,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:11:05,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:11:05,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:11:09,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1326993.3333333333, ans=0.0 2023-10-03 16:11:09,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1326993.3333333333, ans=0.0 2023-10-03 16:11:10,255 INFO [train.py:1046] (3/4) Epoch 38, batch 2500, loss[loss=0.1579, simple_loss=0.2353, pruned_loss=0.04019, over 24596.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2365, pruned_loss=0.03915, over 4700449.39 frames. ], batch size: 60, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:11:10,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 16:11:11,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:11:16,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:11:23,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1327060.0, ans=0.125 2023-10-03 16:11:27,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:11:27,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:11:27,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:11:27,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 16:11:35,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:11:35,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:11:36,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 16:11:37,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:11:38,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 16:11:39,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:11:41,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:11:43,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 16:11:43,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:11:43,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 16:11:43,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:11:47,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:11:49,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:11:50,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1327126.6666666667, ans=0.1 2023-10-03 16:11:51,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:11:51,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 16:11:53,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:11:54,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:11:58,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:02,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:04,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:12:08,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:12:09,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 16:12:09,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:12:09,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:12:12,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:12:12,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:12:12,829 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 16:12:12,830 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 16:12:14,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 16:12:16,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1327260.0, ans=0.2 2023-10-03 16:12:17,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:12:17,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1327260.0, ans=0.125 2023-10-03 16:12:20,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 16:12:20,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 16:12:20,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:12:20,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 16:12:22,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 16:12:24,751 INFO [train.py:1046] (3/4) Epoch 38, batch 2550, loss[loss=0.1448, simple_loss=0.2192, pruned_loss=0.03522, over 20246.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2365, pruned_loss=0.03902, over 4701007.30 frames. ], batch size: 44, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:12:27,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:12:29,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:12:29,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:12:30,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:12:32,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 16:12:32,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:12:37,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 16:12:38,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:12:40,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:42,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:12:42,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 16:12:42,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1327393.3333333333, ans=0.0 2023-10-03 16:12:43,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:12:44,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:12:44,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:12:45,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1327393.3333333333, ans=0.1 2023-10-03 16:12:47,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:12:47,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 16:12:47,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:12:47,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:12:47,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 16:12:59,651 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.861e+02 2.049e+02 2.382e+02 3.363e+02, threshold=4.098e+02, percent-clipped=0.0 2023-10-03 16:13:03,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:13:04,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1327460.0, ans=0.2 2023-10-03 16:13:07,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:13:07,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:13:07,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:13:08,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:13:14,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:13:16,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1327526.6666666667, ans=0.2 2023-10-03 16:13:17,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:13:17,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:13:17,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:13:17,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 16:13:19,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:13:22,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:13:22,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:13:27,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:13:27,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 16:13:27,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:13:27,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:13:29,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:13:29,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:13:31,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:13:33,527 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:13:35,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:13:38,558 INFO [train.py:1046] (3/4) Epoch 38, batch 2600, loss[loss=0.1758, simple_loss=0.2534, pruned_loss=0.04909, over 23795.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2376, pruned_loss=0.03895, over 4715858.54 frames. ], batch size: 195, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:13:38,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:13:40,641 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 16:13:43,399 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 16:13:44,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:13:44,723 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 16:13:46,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 16:13:46,105 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 16:13:48,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:13:50,461 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 16:13:50,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 16:13:51,993 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 16:13:54,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:13:56,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 16:13:56,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 16:13:58,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:13:58,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 16:14:02,131 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 16:14:02,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 16:14:08,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:09,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:14:09,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:14:09,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 16:14:09,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1327793.3333333333, ans=0.1 2023-10-03 16:14:12,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:14:15,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1327793.3333333333, ans=0.0 2023-10-03 16:14:16,999 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 16:14:17,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1327793.3333333333, ans=0.1 2023-10-03 16:14:20,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1327793.3333333333, ans=0.2 2023-10-03 16:14:20,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1327793.3333333333, ans=0.0 2023-10-03 16:14:21,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:14:23,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:24,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 16:14:24,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:14:24,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:14:25,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 16:14:28,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:14:28,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:14:28,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1327860.0, ans=0.2 2023-10-03 16:14:31,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:14:34,566 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 16:14:36,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:14:36,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:14:40,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:14:41,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:14:41,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 16:14:43,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:14:44,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:14:44,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:14:50,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1327926.6666666667, ans=0.125 2023-10-03 16:14:51,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 16:14:52,582 INFO [train.py:1046] (3/4) Epoch 38, batch 2650, loss[loss=0.1418, simple_loss=0.218, pruned_loss=0.03274, over 24415.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2384, pruned_loss=0.03924, over 4727233.37 frames. ], batch size: 58, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:14:52,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:54,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:14:56,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1327993.3333333333, ans=0.125 2023-10-03 16:14:58,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 16:14:58,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:14:59,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:14:59,871 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 16:14:59,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:15:02,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:15:04,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:15:06,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:15:08,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:15:09,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1328060.0, ans=0.0 2023-10-03 16:15:10,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 16:15:10,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:15:10,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:15:14,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 16:15:15,799 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 16:15:17,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:15:19,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1328060.0, ans=0.07 2023-10-03 16:15:20,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 16:15:20,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:20,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1328126.6666666667, ans=0.125 2023-10-03 16:15:22,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 16:15:24,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:15:26,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:15:26,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:15:26,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:15:29,040 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.957e+02 2.182e+02 2.477e+02 3.538e+02, threshold=4.363e+02, percent-clipped=0.0 2023-10-03 16:15:30,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.79 vs. limit=12.0 2023-10-03 16:15:30,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 16:15:30,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 16:15:33,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:15:35,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 16:15:36,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:15:36,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:15:38,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:15:38,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:15:38,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:15:41,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:15:41,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:15:41,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1328193.3333333333, ans=0.125 2023-10-03 16:15:44,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:15:45,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:15:45,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:15:47,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:47,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:15:47,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1328193.3333333333, ans=0.1 2023-10-03 16:15:49,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:50,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:15:50,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:15:53,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:15:54,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:15:54,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:15:54,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 16:15:58,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:16:00,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:16:00,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:16:01,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:03,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:16:03,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:06,463 INFO [train.py:1046] (3/4) Epoch 38, batch 2700, loss[loss=0.2141, simple_loss=0.2842, pruned_loss=0.07197, over 19421.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2393, pruned_loss=0.03948, over 4722650.26 frames. ], batch size: 389, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:16:06,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:16:06,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 16:16:09,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:16:11,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 16:16:11,757 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.15 vs. limit=6.0 2023-10-03 16:16:13,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:16:13,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:13,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:16,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:16:16,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:16:16,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:16:16,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:16:16,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 16:16:18,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:16:20,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:16:20,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:16:21,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:16:24,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:16:24,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 16:16:25,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:16:31,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:16:31,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:16:36,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:16:37,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:16:37,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:16:37,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:16:37,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1328460.0, ans=0.125 2023-10-03 16:16:40,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:16:43,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:16:43,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:16:43,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:16:47,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:16:47,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:16:57,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:16:58,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:17:01,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:17:01,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:04,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:17:06,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:17:06,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:17:07,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:08,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:17:09,497 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.34 vs. limit=15.0 2023-10-03 16:17:10,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:17:11,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:17:14,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:17:14,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:17:17,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 16:17:18,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:20,113 INFO [train.py:1046] (3/4) Epoch 38, batch 2750, loss[loss=0.1493, simple_loss=0.233, pruned_loss=0.03277, over 24462.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.239, pruned_loss=0.04037, over 4696028.82 frames. ], batch size: 63, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:17:21,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:17:21,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 16:17:24,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 16:17:24,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:26,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:26,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:17:29,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:29,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:17:29,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:31,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:17:31,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:17:33,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:17:33,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:33,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 16:17:33,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:17:34,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:17:34,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1328726.6666666667, ans=0.0 2023-10-03 16:17:38,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 16:17:39,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:17:40,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:41,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:17:41,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:17:42,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:17:44,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:17:44,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:44,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:17:45,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1328726.6666666667, ans=0.125 2023-10-03 16:17:48,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:17:50,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:17:50,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:17:52,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:17:52,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1328793.3333333333, ans=0.125 2023-10-03 16:17:52,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1328793.3333333333, ans=0.125 2023-10-03 16:17:54,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:17:58,232 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.929e+02 2.191e+02 2.513e+02 4.361e+02, threshold=4.383e+02, percent-clipped=0.0 2023-10-03 16:17:58,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:18:00,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:18:01,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:04,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:18:04,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:18:04,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:18:11,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:18:12,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:18:12,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 16:18:15,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:17,696 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.23 vs. limit=15.0 2023-10-03 16:18:18,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 16:18:19,584 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.99 vs. limit=12.0 2023-10-03 16:18:21,011 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.11 vs. limit=15.0 2023-10-03 16:18:25,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 16:18:27,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:18:27,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 16:18:29,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:18:31,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:18:31,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 16:18:31,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:18:34,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 16:18:34,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:18:34,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:18:35,424 INFO [train.py:1046] (3/4) Epoch 38, batch 2800, loss[loss=0.1529, simple_loss=0.2409, pruned_loss=0.03241, over 24656.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2366, pruned_loss=0.03977, over 4691254.06 frames. ], batch size: 65, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:18:35,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 16:18:36,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:18:36,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:37,545 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.82 vs. limit=10.0 2023-10-03 16:18:38,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:18:39,502 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 16:18:39,502 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 16:18:40,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:18:43,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:18:43,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:18:46,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:18:50,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 16:18:51,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 16:18:52,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 16:18:54,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:18:55,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:18:55,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:18:58,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:18:58,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:18:58,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:18:59,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:19:05,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.01 vs. limit=15.0 2023-10-03 16:19:08,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:19:10,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:19:12,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:12,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:19:14,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:19:18,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:19:18,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 16:19:20,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:19:22,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:19:22,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:19:26,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:19:26,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:30,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:19:31,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1329193.3333333333, ans=0.1 2023-10-03 16:19:32,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:19:33,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:33,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:19:33,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:19:34,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:19:35,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:19:35,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 16:19:35,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:19:35,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:19:35,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:19:37,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1329260.0, ans=0.2 2023-10-03 16:19:38,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 16:19:38,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1329260.0, ans=0.125 2023-10-03 16:19:39,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:19:39,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:19:40,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:19:41,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 16:19:47,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:19:47,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:19:47,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:19:48,414 INFO [train.py:1046] (3/4) Epoch 38, batch 2850, loss[loss=0.1527, simple_loss=0.2291, pruned_loss=0.03815, over 23645.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2365, pruned_loss=0.03944, over 4712187.11 frames. ], batch size: 256, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:19:48,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:19:48,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1329326.6666666667, ans=0.0 2023-10-03 16:19:52,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:19:54,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:19:54,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:19:55,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1329326.6666666667, ans=0.0 2023-10-03 16:19:58,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:19:58,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:19:59,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:20:00,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 16:20:06,495 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.19 vs. limit=15.0 2023-10-03 16:20:07,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 16:20:07,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:20:07,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1329393.3333333333, ans=0.125 2023-10-03 16:20:09,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 16:20:11,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:11,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1329393.3333333333, ans=0.04949747468305833 2023-10-03 16:20:12,014 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.97 vs. limit=10.0 2023-10-03 16:20:14,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 16:20:14,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 16:20:15,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:18,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1329460.0, ans=0.1 2023-10-03 16:20:26,108 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.929e+02 2.175e+02 2.437e+02 3.531e+02, threshold=4.350e+02, percent-clipped=0.0 2023-10-03 16:20:26,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:20:27,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:20:28,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:20:30,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:20:30,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:20:30,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:20:31,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:20:33,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 16:20:35,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:20:35,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:20:36,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:20:36,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:38,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:20:39,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:20:39,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1329526.6666666667, ans=0.0 2023-10-03 16:20:40,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:20:42,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:20:45,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:20:45,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:20:46,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:20:47,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:20:50,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:20:51,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 16:20:53,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 16:20:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:20:55,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:20:55,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 16:20:57,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:20:57,576 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.67 vs. limit=22.5 2023-10-03 16:20:58,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:20:58,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:20:58,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:20:58,607 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 16:20:58,644 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 16:20:58,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:20:59,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:21:02,753 INFO [train.py:1046] (3/4) Epoch 38, batch 2900, loss[loss=0.1689, simple_loss=0.2567, pruned_loss=0.04053, over 23648.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2369, pruned_loss=0.03922, over 4702248.30 frames. ], batch size: 85, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:21:04,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:21:04,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:21:06,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:21:06,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 16:21:07,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1329660.0, ans=0.0 2023-10-03 16:21:09,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:21:10,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 16:21:11,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 16:21:13,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:21:13,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:21:13,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1329660.0, ans=0.1 2023-10-03 16:21:14,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:21:14,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1329660.0, ans=0.125 2023-10-03 16:21:15,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:21:19,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:21:19,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:21:23,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:21:23,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 16:21:25,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:21:26,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:21:29,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 16:21:30,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 16:21:31,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1329793.3333333333, ans=0.0 2023-10-03 16:21:34,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:21:34,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 16:21:34,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:21:34,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1329793.3333333333, ans=0.1 2023-10-03 16:21:35,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:21:35,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 16:21:38,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:21:39,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:21:43,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:21:44,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:21:46,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 16:21:46,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 16:21:46,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:21:47,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1329860.0, ans=0.0 2023-10-03 16:21:50,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:21:51,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 16:21:53,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:21:58,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:22:07,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:22:07,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:22:08,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 16:22:10,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.55 vs. limit=15.0 2023-10-03 16:22:11,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:11,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 16:22:11,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:22:13,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:22:15,795 INFO [train.py:1046] (3/4) Epoch 38, batch 2950, loss[loss=0.1891, simple_loss=0.2562, pruned_loss=0.06101, over 19033.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03945, over 4693029.31 frames. ], batch size: 388, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:22:18,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:22:19,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 16:22:21,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:22:21,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:22,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:22:24,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:22:25,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 16:22:26,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 16:22:26,616 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.26 vs. limit=10.0 2023-10-03 16:22:27,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:22:27,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:22:34,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1330060.0, ans=0.0 2023-10-03 16:22:35,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:22:37,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:22:38,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:22:38,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:22:42,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:22:42,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:22:44,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:44,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:22:45,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:22:47,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 16:22:47,389 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:22:50,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 16:22:51,270 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 16:22:51,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:22:52,582 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.960e+02 2.141e+02 2.460e+02 3.177e+02, threshold=4.282e+02, percent-clipped=0.0 2023-10-03 16:22:53,884 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 16:22:53,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 16:22:55,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:22:57,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:22:57,167 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 16:22:57,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:22:59,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 16:23:01,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:23:01,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:23:03,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1330193.3333333333, ans=0.125 2023-10-03 16:23:04,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:23:04,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1330193.3333333333, ans=0.1 2023-10-03 16:23:05,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:23:05,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:23:07,834 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 16:23:07,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:23:09,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 16:23:14,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:23:14,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:23:14,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 16:23:14,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:23:16,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 16:23:18,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:23:20,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:23:20,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:23:21,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:23:21,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 16:23:23,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:23:24,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:23:24,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:23:26,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:23:27,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:23:28,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:23:29,356 INFO [train.py:1046] (3/4) Epoch 38, batch 3000, loss[loss=0.1554, simple_loss=0.2317, pruned_loss=0.03954, over 24431.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2382, pruned_loss=0.03949, over 4688827.94 frames. ], batch size: 58, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:23:29,356 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 16:23:35,311 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.8841, 4.4509, 4.2098, 4.0531], device='cuda:3') 2023-10-03 16:23:41,538 INFO [train.py:1078] (3/4) Epoch 38, validation: loss=0.3508, simple_loss=0.2758, pruned_loss=0.2129, over 1125622.00 frames. 2023-10-03 16:23:41,539 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 16:23:41,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:23:41,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 16:23:43,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:23:45,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:23:45,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:23:46,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1330326.6666666667, ans=0.125 2023-10-03 16:23:48,747 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 16:23:48,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 16:23:52,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:23:52,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:23:53,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1330326.6666666667, ans=0.125 2023-10-03 16:23:54,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 16:23:54,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:24:01,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:24:08,836 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.00 vs. limit=15.0 2023-10-03 16:24:09,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:24:14,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 16:24:16,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:24:17,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:24:17,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:24:17,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:24:17,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1330460.0, ans=0.0 2023-10-03 16:24:20,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:24:20,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 16:24:23,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 16:24:23,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:24:23,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1330460.0, ans=0.125 2023-10-03 16:24:25,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:24:27,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:24:27,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:24:27,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:27,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:24:31,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:24:31,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:24:31,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:24:33,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:24:35,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 16:24:37,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:24:37,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:24:38,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:24:41,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:41,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:42,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 16:24:42,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 16:24:42,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:24:42,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 16:24:44,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:24:45,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 16:24:49,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:24:49,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:24:50,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 16:24:50,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 16:24:50,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:24:52,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:24:54,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:24:54,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:24:54,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:24:54,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:24:56,286 INFO [train.py:1046] (3/4) Epoch 38, batch 3050, loss[loss=0.1585, simple_loss=0.2486, pruned_loss=0.03417, over 24307.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2393, pruned_loss=0.03989, over 4684500.04 frames. ], batch size: 74, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:24:57,055 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.33 vs. limit=6.0 2023-10-03 16:24:58,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 16:24:59,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:25:00,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:25:01,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1330660.0, ans=0.0 2023-10-03 16:25:02,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:25:04,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:07,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 16:25:14,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 16:25:14,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 16:25:14,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1330726.6666666667, ans=0.2 2023-10-03 16:25:15,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:18,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:25:20,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:20,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.96 vs. limit=6.0 2023-10-03 16:25:21,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:25:21,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:25:23,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:25:23,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1330726.6666666667, ans=0.125 2023-10-03 16:25:23,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1330726.6666666667, ans=0.0 2023-10-03 16:25:25,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:25:25,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:25:26,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:25:26,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:25:27,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:28,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1330793.3333333333, ans=0.0 2023-10-03 16:25:30,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:30,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1330793.3333333333, ans=0.125 2023-10-03 16:25:33,291 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.918e+02 2.115e+02 2.470e+02 3.368e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-03 16:25:33,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:25:33,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 16:25:34,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:25:34,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:25:38,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:25:38,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:25:38,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:25:40,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:25:43,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1330860.0, ans=0.04949747468305833 2023-10-03 16:25:45,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:25:45,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:25:45,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1330860.0, ans=0.2 2023-10-03 16:25:52,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:25:54,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:25:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:25:55,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:25:55,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:25:55,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:25:57,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 16:25:59,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:25:59,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:00,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 16:26:02,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1330926.6666666667, ans=0.0 2023-10-03 16:26:03,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:26:07,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:26:10,304 INFO [train.py:1046] (3/4) Epoch 38, batch 3100, loss[loss=0.1556, simple_loss=0.2284, pruned_loss=0.04142, over 23657.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2387, pruned_loss=0.04008, over 4694981.29 frames. ], batch size: 149, lr: 2.68e-03, grad_scale: 16.0 2023-10-03 16:26:10,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:26:12,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:26:13,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 16:26:16,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 16:26:17,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 16:26:19,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:26:20,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:26:20,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:23,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 16:26:23,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1331060.0, ans=0.07 2023-10-03 16:26:26,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:31,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 16:26:34,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1331060.0, ans=0.125 2023-10-03 16:26:35,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:26:35,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:26:35,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:26:35,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:26:37,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 16:26:40,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:26:40,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 16:26:40,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:26:43,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:44,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 16:26:44,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:26:49,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:26:50,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 16:26:50,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 16:26:53,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:26:53,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:26:56,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:26:56,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:26:57,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:26:59,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:26:59,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:27:00,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:27:00,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:27:00,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:27:00,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 16:27:06,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:27:06,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 16:27:09,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:27:09,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 16:27:11,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:11,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:27:11,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 16:27:15,330 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.84 vs. limit=15.0 2023-10-03 16:27:20,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 16:27:23,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:23,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:27:24,356 INFO [train.py:1046] (3/4) Epoch 38, batch 3150, loss[loss=0.1562, simple_loss=0.2539, pruned_loss=0.0292, over 24649.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2377, pruned_loss=0.03964, over 4697658.24 frames. ], batch size: 73, lr: 2.68e-03, grad_scale: 8.0 2023-10-03 16:27:25,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:27:25,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:27:27,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 16:27:28,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:28,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 16:27:29,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 16:27:33,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:34,610 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 16:27:38,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 16:27:38,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:27:38,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1331393.3333333333, ans=0.125 2023-10-03 16:27:40,034 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 16:27:41,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 16:27:41,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 16:27:43,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 16:27:43,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 16:27:43,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:43,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:27:44,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:27:44,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 16:27:48,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:48,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:27:48,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:27:49,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:27:50,537 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.64 vs. limit=15.0 2023-10-03 16:27:54,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 16:27:54,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:27:57,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:27:58,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:27:58,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 16:28:01,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 16:28:02,635 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 1.924e+02 2.139e+02 2.464e+02 3.251e+02, threshold=4.278e+02, percent-clipped=0.0 2023-10-03 16:28:02,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:28:02,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 16:28:02,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:28:04,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:28:04,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:28:06,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:28:06,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:28:07,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 16:28:08,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:28:08,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:09,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1331526.6666666667, ans=0.2 2023-10-03 16:28:10,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:28:10,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:28:11,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 16:28:12,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:28:14,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 16:28:14,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:16,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 16:28:17,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 16:28:19,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:28:19,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:28:20,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 16:28:22,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 16:28:23,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:28:25,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:28:25,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:25,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:28:31,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:28:32,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:33,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 16:28:38,473 INFO [train.py:1046] (3/4) Epoch 38, batch 3200, loss[loss=0.1463, simple_loss=0.2328, pruned_loss=0.02994, over 24559.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2367, pruned_loss=0.0392, over 4716357.00 frames. ], batch size: 71, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:28:39,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:28:39,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 16:28:45,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:46,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:28:46,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 16:28:48,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:28:50,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:28:51,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1331660.0, ans=0.125 2023-10-03 16:28:55,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:28:55,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1331726.6666666667, ans=0.0 2023-10-03 16:28:56,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1331726.6666666667, ans=0.125 2023-10-03 16:28:57,667 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.55 vs. limit=15.0 2023-10-03 16:29:02,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:29:11,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 16:29:11,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:29:17,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 16:29:18,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:29:21,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:29:21,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:29:21,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:29:26,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 16:29:26,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 16:29:28,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 16:29:28,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1331860.0, ans=0.0 2023-10-03 16:29:30,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 16:29:33,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:29:39,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:29:39,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:29:39,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:29:40,808 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 16:29:40,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:29:44,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:29:46,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 16:29:48,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 16:29:48,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 16:29:50,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 16:29:53,179 INFO [train.py:1046] (3/4) Epoch 38, batch 3250, loss[loss=0.1472, simple_loss=0.2291, pruned_loss=0.03269, over 24440.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2372, pruned_loss=0.03891, over 4734265.95 frames. ], batch size: 58, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:29:53,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:29:54,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:29:54,991 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 16:29:56,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:29:56,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:29:56,405 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 16:30:00,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:30:02,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1331993.3333333333, ans=0.125 2023-10-03 16:30:03,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:30:09,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1332060.0, ans=0.025 2023-10-03 16:30:12,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:30:12,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 16:30:13,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:30:13,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:30:13,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:30:15,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:30:15,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:30:18,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:19,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:30:19,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:30:19,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:19,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:19,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:30:22,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:30:24,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:30:26,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:30:26,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:30:27,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:30:29,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:30:29,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:30:33,138 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 2.003e+02 2.220e+02 2.605e+02 4.440e+02, threshold=4.440e+02, percent-clipped=1.0 2023-10-03 16:30:34,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 16:30:34,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:30:34,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:30:35,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:30:37,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:30:41,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:30:50,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:30:50,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:30:50,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 16:30:50,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:30:50,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 16:30:51,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:30:53,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 16:30:53,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 16:30:53,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:30:55,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:30:56,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:30:58,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 16:30:58,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:31:00,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:31:00,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:31:00,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1332260.0, ans=0.2 2023-10-03 16:31:00,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1332260.0, ans=0.0 2023-10-03 16:31:01,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 16:31:01,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:03,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:31:03,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 16:31:05,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1332326.6666666667, ans=0.07 2023-10-03 16:31:06,746 INFO [train.py:1046] (3/4) Epoch 38, batch 3300, loss[loss=0.154, simple_loss=0.24, pruned_loss=0.034, over 24491.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2373, pruned_loss=0.0393, over 4723395.39 frames. ], batch size: 63, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:31:06,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:31:06,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 16:31:08,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 16:31:10,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 16:31:10,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:31:12,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1332326.6666666667, ans=0.2 2023-10-03 16:31:15,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:31:15,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:31:17,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:19,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 16:31:19,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:31:20,499 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.72 vs. limit=10.0 2023-10-03 16:31:21,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:24,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:31:26,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 16:31:28,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:31:28,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:31,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:32,063 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 16:31:33,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:31:33,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:31:34,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:31:34,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:31:34,902 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 16:31:37,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:31:38,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:31:41,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:41,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 16:31:43,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 16:31:43,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:31:44,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:31:46,356 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 16:31:47,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 16:31:49,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:31:52,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 16:31:53,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:31:55,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:31:55,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1332526.6666666667, ans=0.0 2023-10-03 16:31:56,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:31:58,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:31:59,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:31:59,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:31:59,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:32:01,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:32:01,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:32:01,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:32:03,191 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 16:32:04,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 16:32:06,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:32:07,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:32:07,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:32:08,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:32:08,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:32:10,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:32:11,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:11,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:32:12,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:32:14,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:32:16,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 16:32:18,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:18,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:20,876 INFO [train.py:1046] (3/4) Epoch 38, batch 3350, loss[loss=0.1663, simple_loss=0.243, pruned_loss=0.04481, over 23460.00 frames. ], tot_loss[loss=0.1595, simple_loss=0.2389, pruned_loss=0.04003, over 4712619.56 frames. ], batch size: 93, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:32:20,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:32:22,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:32:23,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:32:23,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:32:23,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:28,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:32:29,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:29,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:32:33,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:35,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:32:37,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:32:37,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:32:38,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 16:32:39,283 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.31 vs. limit=15.0 2023-10-03 16:32:39,956 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 16:32:39,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:32:42,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 16:32:42,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 16:32:44,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:32:44,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:32:44,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:32:44,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 16:32:45,014 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.96 vs. limit=10.0 2023-10-03 16:32:46,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:46,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:32:47,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:49,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:50,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:32:50,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:32:51,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1332793.3333333333, ans=0.1 2023-10-03 16:32:53,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:32:54,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:32:54,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:32:58,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:32:58,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:33:00,141 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.721e+02 1.992e+02 2.171e+02 2.466e+02 3.497e+02, threshold=4.342e+02, percent-clipped=0.0 2023-10-03 16:33:00,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:33:00,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:03,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:05,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 16:33:06,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:33:06,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 16:33:06,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:33:07,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 16:33:09,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:33:10,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:33:16,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:18,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 16:33:18,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:33:19,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=1332926.6666666667, ans=10.0 2023-10-03 16:33:19,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1332926.6666666667, ans=0.1 2023-10-03 16:33:20,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:33:22,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:33:27,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:33:29,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 16:33:29,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:33:30,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:33:31,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:33:33,598 INFO [train.py:1046] (3/4) Epoch 38, batch 3400, loss[loss=0.1483, simple_loss=0.2292, pruned_loss=0.0337, over 24425.00 frames. ], tot_loss[loss=0.1607, simple_loss=0.2405, pruned_loss=0.04043, over 4707283.17 frames. ], batch size: 63, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:33:33,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 16:33:33,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:33:35,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 16:33:36,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:33:36,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:33:38,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:33:38,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:33:39,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 16:33:43,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 16:33:43,940 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 16:33:43,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:33:48,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:33:48,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:33:48,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:33:48,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:33:55,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:33:56,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 16:34:01,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:34:01,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:34:02,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:34:04,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 16:34:07,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1333126.6666666667, ans=0.125 2023-10-03 16:34:10,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:34:12,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1333126.6666666667, ans=0.2 2023-10-03 16:34:13,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 16:34:19,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:34:21,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:34:21,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 16:34:21,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:34:21,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:34:22,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:34:22,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:34:23,150 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.13 vs. limit=12.0 2023-10-03 16:34:24,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1333193.3333333333, ans=0.0 2023-10-03 16:34:25,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:34:29,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:34:29,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:34:34,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:34:35,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 16:34:41,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:34:45,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 16:34:47,549 INFO [train.py:1046] (3/4) Epoch 38, batch 3450, loss[loss=0.1725, simple_loss=0.2515, pruned_loss=0.0468, over 23379.00 frames. ], tot_loss[loss=0.1606, simple_loss=0.2404, pruned_loss=0.04041, over 4697023.27 frames. ], batch size: 93, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:34:47,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1333326.6666666667, ans=0.0 2023-10-03 16:34:51,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 16:34:52,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:34:54,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:34:54,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 16:34:54,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:34:55,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1333326.6666666667, ans=0.0 2023-10-03 16:34:58,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:35:03,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:35:03,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:05,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:35:05,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:35:06,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.70 vs. limit=6.0 2023-10-03 16:35:07,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:35:08,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1333393.3333333333, ans=0.2 2023-10-03 16:35:11,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.81 vs. limit=6.0 2023-10-03 16:35:12,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=1333393.3333333333, ans=15.0 2023-10-03 16:35:13,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 16:35:17,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 16:35:17,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 16:35:17,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:35:18,469 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.74 vs. limit=12.0 2023-10-03 16:35:19,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:19,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1333460.0, ans=0.0 2023-10-03 16:35:24,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 16:35:26,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:35:28,780 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.894e+02 2.065e+02 2.341e+02 2.920e+02, threshold=4.130e+02, percent-clipped=0.0 2023-10-03 16:35:28,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:35:28,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:35:30,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:35:31,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:35:33,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 16:35:33,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:35:34,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:35:37,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:35:40,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 16:35:43,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:35:48,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:35:48,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1333593.3333333333, ans=0.0 2023-10-03 16:35:51,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:51,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1333593.3333333333, ans=0.125 2023-10-03 16:35:54,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:35:58,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:35:58,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:35:59,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:35:59,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:36:00,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.68 vs. limit=22.5 2023-10-03 16:36:02,549 INFO [train.py:1046] (3/4) Epoch 38, batch 3500, loss[loss=0.1337, simple_loss=0.2118, pruned_loss=0.02782, over 24322.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2382, pruned_loss=0.04, over 4688094.65 frames. ], batch size: 56, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:36:04,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:36:06,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:36:07,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 16:36:08,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 16:36:11,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 16:36:15,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:36:15,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 16:36:17,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1333726.6666666667, ans=0.0 2023-10-03 16:36:20,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:36:21,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:36:23,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:36:23,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:36:24,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:36:24,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:24,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:36:25,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 16:36:28,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:29,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:36:29,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:36:34,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:36,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 16:36:36,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:36:38,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:36:41,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:36:42,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:44,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:36:44,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:36:45,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 16:36:45,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 16:36:47,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 16:36:48,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:36:49,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:36:50,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:36:50,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:36:53,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:36:53,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:36:59,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:37:00,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 16:37:00,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 16:37:00,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:37:03,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:37:05,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:37:06,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:37:09,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 16:37:09,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:37:10,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:37:12,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 16:37:13,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 16:37:15,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1333993.3333333333, ans=0.2 2023-10-03 16:37:16,912 INFO [train.py:1046] (3/4) Epoch 38, batch 3550, loss[loss=0.1534, simple_loss=0.2278, pruned_loss=0.0395, over 23679.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2368, pruned_loss=0.03929, over 4702265.27 frames. ], batch size: 232, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:37:17,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:37:17,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:37:18,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:18,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:22,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:37:24,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1333993.3333333333, ans=0.0 2023-10-03 16:37:30,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:32,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 16:37:36,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:37:36,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:37:37,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:37,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:37:37,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:37:40,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:37:41,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:37:41,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:43,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:37:43,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:37:47,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:37:47,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:37:48,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1334126.6666666667, ans=0.1 2023-10-03 16:37:49,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:37:49,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:37:50,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:37:50,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 16:37:50,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:50,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:37:52,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 16:37:53,736 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.22 vs. limit=15.0 2023-10-03 16:37:54,876 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.64 vs. limit=22.5 2023-10-03 16:37:57,017 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.030e+02 2.255e+02 2.585e+02 3.418e+02, threshold=4.510e+02, percent-clipped=0.0 2023-10-03 16:37:58,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:37:58,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:37:59,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:00,131 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:38:01,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 16:38:02,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:38:02,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1334193.3333333333, ans=0.2 2023-10-03 16:38:04,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 16:38:04,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:38:04,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1334193.3333333333, ans=0.125 2023-10-03 16:38:06,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:38:07,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:38:10,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 16:38:11,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:38:17,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:38:17,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 16:38:17,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:38:22,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:38:22,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 16:38:25,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1334260.0, ans=0.1 2023-10-03 16:38:28,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1334260.0, ans=0.125 2023-10-03 16:38:29,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 16:38:29,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:38:29,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:38:29,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1334326.6666666667, ans=0.125 2023-10-03 16:38:30,613 INFO [train.py:1046] (3/4) Epoch 38, batch 3600, loss[loss=0.1883, simple_loss=0.2425, pruned_loss=0.067, over 19479.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2374, pruned_loss=0.03941, over 4698190.56 frames. ], batch size: 388, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:38:30,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:38:30,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1334326.6666666667, ans=0.0 2023-10-03 16:38:32,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:38:32,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:38:35,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:38:36,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:38,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:38:39,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:38:39,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:39,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 16:38:39,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1334326.6666666667, ans=0.125 2023-10-03 16:38:42,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:38:42,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:45,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:38:48,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:38:48,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:38:48,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:38:50,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 16:38:50,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:38:53,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:38:54,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:38:56,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:38:57,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:38:57,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:38:57,896 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:38:58,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 16:38:59,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1334460.0, ans=0.2 2023-10-03 16:39:07,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:39:07,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:39:08,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 16:39:13,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:39:18,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:39:21,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:39:27,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:39:28,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:39:28,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 16:39:29,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 16:39:31,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 16:39:32,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:39:32,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:39:34,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 16:39:34,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:39:34,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:39:34,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:39:35,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1334593.3333333333, ans=0.125 2023-10-03 16:39:36,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 16:39:36,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 16:39:40,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:39:40,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 16:39:41,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1334593.3333333333, ans=0.0 2023-10-03 16:39:44,419 INFO [train.py:1046] (3/4) Epoch 38, batch 3650, loss[loss=0.1586, simple_loss=0.2467, pruned_loss=0.0352, over 24618.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.238, pruned_loss=0.03951, over 4703361.39 frames. ], batch size: 68, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:39:46,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 16:39:48,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:39:54,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 16:39:54,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 16:39:57,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:39:57,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:39:57,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:40:00,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 16:40:00,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:40:01,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 16:40:01,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:40:01,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:40:02,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1334726.6666666667, ans=0.1 2023-10-03 16:40:03,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 16:40:04,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:40:04,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:40:04,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:40:05,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1334726.6666666667, ans=0.125 2023-10-03 16:40:06,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1334726.6666666667, ans=0.0 2023-10-03 16:40:07,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:40:09,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 16:40:11,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 16:40:12,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:40:14,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 16:40:15,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:40:15,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:40:22,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:40:25,394 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.902e+02 2.048e+02 2.259e+02 3.256e+02, threshold=4.095e+02, percent-clipped=0.0 2023-10-03 16:40:25,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:40:25,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:40:26,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:40:26,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:40:27,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1334793.3333333333, ans=0.125 2023-10-03 16:40:28,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:40:28,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1334860.0, ans=10.0 2023-10-03 16:40:28,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.46 vs. limit=12.0 2023-10-03 16:40:31,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:40:33,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:40:33,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:40:34,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:40:34,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1334860.0, ans=0.1 2023-10-03 16:40:35,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:40:37,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:40:41,365 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 16:40:41,992 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=10.62 vs. limit=22.5 2023-10-03 16:40:44,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:40:44,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:40:46,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:40:46,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:40:47,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:40:49,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:40:52,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 16:40:52,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:40:56,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:40:58,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:40:59,337 INFO [train.py:1046] (3/4) Epoch 38, batch 3700, loss[loss=0.1621, simple_loss=0.2498, pruned_loss=0.03721, over 24569.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2386, pruned_loss=0.03952, over 4711057.34 frames. ], batch size: 71, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:40:59,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:41:01,042 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:41:02,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:41:02,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 16:41:02,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:41:03,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 16:41:03,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 16:41:07,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 16:41:10,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:41:10,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:41:12,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:41:13,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:41:13,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:41:14,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:41:16,345 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 16:41:23,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:41:23,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:41:26,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:41:27,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 16:41:27,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:41:27,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1335126.6666666667, ans=0.125 2023-10-03 16:41:29,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1335126.6666666667, ans=0.125 2023-10-03 16:41:31,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:41:31,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 16:41:32,560 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=6.50 vs. limit=12.0 2023-10-03 16:41:33,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:41:34,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:41:37,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1335126.6666666667, ans=0.0 2023-10-03 16:41:39,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:41:39,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:41:41,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:41:45,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:41:45,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 16:41:46,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:41:46,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 16:41:50,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:41:50,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:41:53,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:41:55,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 16:41:56,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:41:56,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:41:56,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:41:56,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:42:01,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:42:02,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 16:42:04,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 16:42:05,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:42:05,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:42:07,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:42:08,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:42:11,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:42:11,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:42:12,977 INFO [train.py:1046] (3/4) Epoch 38, batch 3750, loss[loss=0.1522, simple_loss=0.2412, pruned_loss=0.03165, over 24509.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2397, pruned_loss=0.04006, over 4688744.00 frames. ], batch size: 66, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:42:13,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:42:14,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 16:42:15,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 16:42:18,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:42:18,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 16:42:18,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:42:20,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:42:20,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:42:21,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:42:22,321 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.66 vs. limit=15.0 2023-10-03 16:42:24,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:42:30,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1335393.3333333333, ans=0.0 2023-10-03 16:42:31,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 16:42:31,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:42:32,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:42:36,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:42:36,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 16:42:38,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:42:38,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:42:39,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:42:41,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 16:42:44,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1335460.0, ans=0.125 2023-10-03 16:42:45,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 16:42:45,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:42:45,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:42:47,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:42:48,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1335460.0, ans=0.0 2023-10-03 16:42:53,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:42:54,485 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.979e+02 2.269e+02 2.767e+02 4.342e+02, threshold=4.539e+02, percent-clipped=2.0 2023-10-03 16:42:54,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 16:42:57,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 16:43:01,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:43:04,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:43:04,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:43:08,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:43:08,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1335526.6666666667, ans=0.0 2023-10-03 16:43:11,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 16:43:13,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:43:13,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1335593.3333333333, ans=0.015 2023-10-03 16:43:14,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.98 vs. limit=15.0 2023-10-03 16:43:14,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:43:15,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:43:17,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 16:43:24,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:43:26,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1335660.0, ans=0.125 2023-10-03 16:43:27,557 INFO [train.py:1046] (3/4) Epoch 38, batch 3800, loss[loss=0.1491, simple_loss=0.223, pruned_loss=0.0376, over 23682.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2387, pruned_loss=0.03987, over 4696281.62 frames. ], batch size: 149, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:43:30,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:43:30,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 16:43:32,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 16:43:33,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:43:35,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:43:36,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:43:37,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 16:43:37,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:43:39,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:43:42,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:43:42,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:43:43,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:43:45,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 16:43:47,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 16:43:49,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:43:51,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:43:54,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:43:55,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:43:57,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:43:58,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:44:00,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:00,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:44:01,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1335793.3333333333, ans=0.1 2023-10-03 16:44:04,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:44:04,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 16:44:07,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:44:13,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:44:19,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:44:21,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 16:44:23,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 16:44:23,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:44:26,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:44:26,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:28,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1335926.6666666667, ans=0.0 2023-10-03 16:44:29,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 16:44:32,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 16:44:32,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 16:44:34,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:35,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:44:40,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:44:41,695 INFO [train.py:1046] (3/4) Epoch 38, batch 3850, loss[loss=0.1531, simple_loss=0.228, pruned_loss=0.03908, over 23636.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2372, pruned_loss=0.03936, over 4700493.79 frames. ], batch size: 135, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:44:41,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:44:46,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:44:46,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 16:44:47,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:44:48,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1335993.3333333333, ans=0.0 2023-10-03 16:44:49,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:44:51,049 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:44:52,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 16:44:53,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:44:56,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 16:44:57,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 16:45:03,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:04,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:45:06,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:45:07,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:45:10,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:10,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:45:10,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:10,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:45:13,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:15,621 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.88 vs. limit=10.0 2023-10-03 16:45:16,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:16,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:17,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:45:19,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 16:45:19,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 16:45:20,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:45:20,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:22,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:22,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:23,445 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.919e+02 2.123e+02 2.422e+02 3.840e+02, threshold=4.245e+02, percent-clipped=0.0 2023-10-03 16:45:23,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 16:45:26,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 16:45:27,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:29,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 16:45:33,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 16:45:37,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:39,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:45:43,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:45:43,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 16:45:46,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 16:45:47,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:49,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:50,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:45:50,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 16:45:52,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:53,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:53,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:45:53,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 16:45:53,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:45:55,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 16:45:55,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:56,391 INFO [train.py:1046] (3/4) Epoch 38, batch 3900, loss[loss=0.1531, simple_loss=0.241, pruned_loss=0.03257, over 24648.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2367, pruned_loss=0.03901, over 4683689.02 frames. ], batch size: 73, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:45:56,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:45:59,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:45:59,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:45:59,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:46:00,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:46:00,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:46:00,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:46:00,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 16:46:02,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:46:07,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:46:09,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:46:09,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:46:09,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:46:11,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1336393.3333333333, ans=0.0 2023-10-03 16:46:12,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:46:12,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:46:13,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:46:15,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 16:46:16,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:46:17,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 16:46:17,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:46:19,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 16:46:21,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 16:46:26,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:46:26,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:46:26,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:46:28,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:46:32,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:46:35,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:46:35,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1336460.0, ans=0.125 2023-10-03 16:46:37,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:46:37,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:46:39,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:46:45,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:46:45,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:46:49,639 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.80 vs. limit=15.0 2023-10-03 16:46:51,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 16:46:53,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:47:02,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:47:03,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:47:03,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 16:47:05,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 16:47:05,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 16:47:05,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 16:47:08,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:47:08,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 16:47:11,657 INFO [train.py:1046] (3/4) Epoch 38, batch 3950, loss[loss=0.1536, simple_loss=0.228, pruned_loss=0.03959, over 23431.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2363, pruned_loss=0.03842, over 4704248.63 frames. ], batch size: 285, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:47:11,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1336660.0, ans=0.1 2023-10-03 16:47:14,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1336660.0, ans=0.0 2023-10-03 16:47:17,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:47:17,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 16:47:19,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:47:20,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:47:20,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:47:26,824 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 16:47:26,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:47:26,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 16:47:28,353 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 16:47:28,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:47:28,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1336726.6666666667, ans=0.125 2023-10-03 16:47:32,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:47:32,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:47:32,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:47:33,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 16:47:37,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:47:37,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:47:37,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:47:38,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:47:38,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 16:47:49,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:47:49,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:47:53,692 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.916e+02 2.036e+02 2.336e+02 4.528e+02, threshold=4.072e+02, percent-clipped=1.0 2023-10-03 16:47:55,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 16:48:00,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 16:48:00,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 16:48:01,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:48:01,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:48:06,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1336860.0, ans=0.1 2023-10-03 16:48:07,488 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.71 vs. limit=12.0 2023-10-03 16:48:09,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:48:09,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 16:48:09,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:48:10,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:48:11,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 16:48:11,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1336926.6666666667, ans=0.125 2023-10-03 16:48:15,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:48:15,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:48:18,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 16:48:25,530 INFO [train.py:1046] (3/4) Epoch 38, batch 4000, loss[loss=0.1711, simple_loss=0.2387, pruned_loss=0.05177, over 23696.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2371, pruned_loss=0.03887, over 4710153.39 frames. ], batch size: 164, lr: 2.67e-03, grad_scale: 16.0 2023-10-03 16:48:27,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:48:34,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:48:34,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.98 vs. limit=15.0 2023-10-03 16:48:38,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:48:40,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:48:40,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:48:41,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 16:48:42,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:48:43,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 16:48:43,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:48:43,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 16:48:44,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1337060.0, ans=0.125 2023-10-03 16:48:46,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:48:49,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:48:49,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:48:49,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:48:50,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:48:50,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 16:48:52,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:48:52,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1337060.0, ans=0.0 2023-10-03 16:48:53,676 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 16:48:53,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:48:55,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:48:57,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1337126.6666666667, ans=0.0 2023-10-03 16:48:58,341 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 16:48:59,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:48:59,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:49:05,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1337126.6666666667, ans=0.0 2023-10-03 16:49:06,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 16:49:07,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:49:09,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1337193.3333333333, ans=0.2 2023-10-03 16:49:11,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:49:11,170 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 16:49:12,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:49:12,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 16:49:12,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:49:14,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:49:15,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:49:17,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:49:17,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:49:17,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:49:18,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 16:49:18,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:49:22,148 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 16:49:26,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1337260.0, ans=0.0 2023-10-03 16:49:28,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:49:28,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1337260.0, ans=0.125 2023-10-03 16:49:29,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 16:49:32,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:49:33,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:49:33,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:49:34,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:49:37,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:49:39,162 INFO [train.py:1046] (3/4) Epoch 38, batch 4050, loss[loss=0.1436, simple_loss=0.22, pruned_loss=0.03358, over 15077.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.238, pruned_loss=0.03911, over 4709839.62 frames. ], batch size: 32, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:49:39,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1337326.6666666667, ans=0.1 2023-10-03 16:49:41,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 16:49:41,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 16:49:41,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1337326.6666666667, ans=0.125 2023-10-03 16:49:44,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:49:44,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:49:44,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1337326.6666666667, ans=0.0 2023-10-03 16:49:45,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:49:46,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:49:47,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:49:51,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:49:54,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:49:54,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 16:49:56,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:49:57,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:49:59,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1337393.3333333333, ans=0.0 2023-10-03 16:50:01,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:50:04,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:50:07,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 16:50:08,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 16:50:08,848 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 16:50:09,489 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.59 vs. limit=22.5 2023-10-03 16:50:10,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:50:10,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1337460.0, ans=0.125 2023-10-03 16:50:16,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 16:50:18,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:50:20,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:50:22,143 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.926e+02 2.112e+02 2.341e+02 3.122e+02, threshold=4.223e+02, percent-clipped=0.0 2023-10-03 16:50:24,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:50:24,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:50:24,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:50:28,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:50:31,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 16:50:32,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:50:34,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:50:34,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 16:50:34,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1337526.6666666667, ans=0.2 2023-10-03 16:50:38,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:50:38,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1337593.3333333333, ans=0.125 2023-10-03 16:50:44,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 16:50:48,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:50:48,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:50:50,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 16:50:50,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 16:50:50,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:50:52,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:50:53,512 INFO [train.py:1046] (3/4) Epoch 38, batch 4100, loss[loss=0.1506, simple_loss=0.2335, pruned_loss=0.03388, over 23343.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2387, pruned_loss=0.03905, over 4722778.56 frames. ], batch size: 93, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:50:53,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:50:54,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:50:58,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1337660.0, ans=0.125 2023-10-03 16:51:01,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 16:51:02,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 16:51:03,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 16:51:05,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 16:51:05,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:51:05,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:06,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:06,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:51:06,791 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 16:51:09,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:51:10,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:51:10,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:51:12,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:51:16,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 16:51:18,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:51:18,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:51:20,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 16:51:20,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:20,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:51:20,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:51:20,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:51:21,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 16:51:21,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=1337793.3333333333, ans=10.0 2023-10-03 16:51:21,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1337793.3333333333, ans=0.125 2023-10-03 16:51:22,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:51:26,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 16:51:26,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:51:29,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:51:29,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 16:51:29,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:51:30,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:51:30,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 16:51:32,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 16:51:34,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:51:34,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:51:37,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1337860.0, ans=0.2 2023-10-03 16:51:39,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 16:51:39,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:51:39,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:51:39,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1337860.0, ans=0.1 2023-10-03 16:51:41,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:51:46,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:51:51,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:51:51,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:51:59,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:51:59,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:52:03,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 16:52:05,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:52:07,681 INFO [train.py:1046] (3/4) Epoch 38, batch 4150, loss[loss=0.1686, simple_loss=0.2593, pruned_loss=0.03893, over 24642.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2382, pruned_loss=0.03869, over 4734482.47 frames. ], batch size: 68, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:52:09,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:52:10,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:52:10,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:52:10,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:52:13,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 16:52:13,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:52:13,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 16:52:13,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 16:52:14,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 16:52:16,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:52:20,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:52:21,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:52:24,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:52:25,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:52:26,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 16:52:29,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:52:29,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:52:30,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 16:52:34,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:52:38,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:52:38,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 16:52:41,046 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.82 vs. limit=22.5 2023-10-03 16:52:43,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 16:52:43,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:52:44,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 16:52:44,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:52:44,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:52:47,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:52:48,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:52:50,422 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.901e+02 2.086e+02 2.272e+02 3.701e+02, threshold=4.173e+02, percent-clipped=0.0 2023-10-03 16:52:51,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 16:52:55,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:52:55,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1338193.3333333333, ans=0.125 2023-10-03 16:52:57,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:52:58,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 16:52:58,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:53:00,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 16:53:01,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:53:03,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:53:04,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:53:05,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 16:53:05,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:05,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 16:53:08,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 16:53:11,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 16:53:11,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:53:11,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 16:53:11,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 16:53:12,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 16:53:12,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:53:12,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 16:53:13,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:53:14,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:53:14,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 16:53:15,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 16:53:17,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1338260.0, ans=0.09899494936611666 2023-10-03 16:53:22,140 INFO [train.py:1046] (3/4) Epoch 38, batch 4200, loss[loss=0.1593, simple_loss=0.2541, pruned_loss=0.03227, over 24303.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2365, pruned_loss=0.0386, over 4722625.37 frames. ], batch size: 74, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:53:22,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:53:23,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 16:53:25,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:53:27,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:53:28,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1338326.6666666667, ans=0.1 2023-10-03 16:53:29,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:53:29,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:53:29,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:53:33,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 16:53:35,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 16:53:35,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:37,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:53:39,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:53:42,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 16:53:43,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:53:43,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:45,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 16:53:45,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 16:53:46,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:47,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:53:47,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 16:53:49,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 16:53:51,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 16:53:51,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:53:54,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 16:53:56,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:53:56,914 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.14 vs. limit=15.0 2023-10-03 16:53:58,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:53:59,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:54:02,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:54:02,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 16:54:03,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:54:05,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:54:08,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 16:54:10,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:54:15,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:54:18,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 16:54:21,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:54:24,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 16:54:25,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:54:28,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 16:54:31,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 16:54:35,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 16:54:35,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 16:54:36,550 INFO [train.py:1046] (3/4) Epoch 38, batch 4250, loss[loss=0.1483, simple_loss=0.2225, pruned_loss=0.03706, over 23829.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.235, pruned_loss=0.0384, over 4715300.98 frames. ], batch size: 212, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:54:37,233 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.40 vs. limit=15.0 2023-10-03 16:54:39,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:54:41,384 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-03 16:54:43,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1338660.0, ans=0.125 2023-10-03 16:54:45,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 16:54:46,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 16:54:46,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:54:48,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1338660.0, ans=0.0 2023-10-03 16:54:49,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:54:53,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:54:57,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:54:57,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:54:59,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:54:59,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:55:00,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:55:02,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:55:04,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:55:05,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:55:06,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:55:08,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 16:55:10,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 16:55:10,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:55:12,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:55:12,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:55:13,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 16:55:13,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:55:14,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:55:16,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1338793.3333333333, ans=0.1 2023-10-03 16:55:16,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1338793.3333333333, ans=0.125 2023-10-03 16:55:18,904 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.873e+02 2.084e+02 2.323e+02 3.046e+02, threshold=4.167e+02, percent-clipped=0.0 2023-10-03 16:55:19,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 16:55:19,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 16:55:24,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:55:25,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1338860.0, ans=0.04949747468305833 2023-10-03 16:55:26,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:55:26,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 16:55:26,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 16:55:28,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 16:55:28,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 16:55:30,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:55:31,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:55:31,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:55:33,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 16:55:36,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 16:55:36,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 16:55:37,352 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.74 vs. limit=12.0 2023-10-03 16:55:39,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:55:42,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:55:43,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 16:55:43,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:55:45,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:55:47,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:55:48,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:55:48,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 16:55:50,861 INFO [train.py:1046] (3/4) Epoch 38, batch 4300, loss[loss=0.1606, simple_loss=0.244, pruned_loss=0.03857, over 23278.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.235, pruned_loss=0.0382, over 4713325.36 frames. ], batch size: 93, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:55:52,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:55:55,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1338993.3333333333, ans=0.0 2023-10-03 16:55:57,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:55:57,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:56:01,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:56:07,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:56:07,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 16:56:08,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 16:56:09,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1339060.0, ans=0.09899494936611666 2023-10-03 16:56:11,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:56:11,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 16:56:11,618 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 16:56:15,820 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:56:16,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 16:56:17,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:56:19,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 16:56:21,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 16:56:21,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 16:56:21,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 16:56:23,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 16:56:26,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 16:56:26,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 16:56:26,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:56:27,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:56:29,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:56:29,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 16:56:30,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 16:56:32,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:56:35,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:35,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 16:56:36,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:36,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 16:56:36,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 16:56:36,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 16:56:36,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 16:56:38,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:56:38,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 16:56:38,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 16:56:42,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:56:44,915 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 16:56:44,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 16:56:46,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:56:46,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:56:46,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1339193.3333333333, ans=0.125 2023-10-03 16:56:49,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 16:56:50,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 16:56:50,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:50,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:56:52,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:56:52,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:56:52,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1339260.0, ans=0.125 2023-10-03 16:56:55,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:56:56,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:56:58,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:56:59,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:57:04,236 INFO [train.py:1046] (3/4) Epoch 38, batch 4350, loss[loss=0.1796, simple_loss=0.2538, pruned_loss=0.05268, over 23215.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2362, pruned_loss=0.03841, over 4714625.36 frames. ], batch size: 105, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:57:06,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 16:57:07,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 16:57:09,375 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:57:10,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:57:13,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:57:16,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 16:57:16,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:57:19,461 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.28 vs. limit=22.5 2023-10-03 16:57:20,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 16:57:23,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:57:26,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:57:26,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:57:29,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 16:57:31,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 16:57:31,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 16:57:32,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1339393.3333333333, ans=0.2 2023-10-03 16:57:37,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 16:57:37,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:57:37,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:57:41,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:57:42,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1339460.0, ans=0.1 2023-10-03 16:57:44,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 16:57:47,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:57:48,468 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.931e+02 2.144e+02 2.418e+02 3.398e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 16:57:48,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 16:57:54,512 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 16:57:54,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:57:56,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 16:57:57,367 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 16:57:58,791 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 16:57:58,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:57:58,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:00,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:58:00,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:58:02,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:58:02,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:58:04,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 16:58:04,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:58:04,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:04,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 16:58:06,247 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 16:58:06,251 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 16:58:08,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 16:58:09,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1339593.3333333333, ans=0.2 2023-10-03 16:58:10,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:58:10,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 16:58:10,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:12,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:58:12,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1339593.3333333333, ans=0.0 2023-10-03 16:58:14,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 16:58:16,389 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 16:58:16,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:17,637 INFO [train.py:1046] (3/4) Epoch 38, batch 4400, loss[loss=0.1655, simple_loss=0.2514, pruned_loss=0.03977, over 24437.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2373, pruned_loss=0.03868, over 4710911.08 frames. ], batch size: 69, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:58:20,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:58:20,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:21,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 16:58:23,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.20 vs. limit=15.0 2023-10-03 16:58:23,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 16:58:23,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 16:58:25,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 16:58:25,256 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 16:58:26,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 16:58:26,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:58:29,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 16:58:30,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:32,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:32,106 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 16:58:34,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1339726.6666666667, ans=0.125 2023-10-03 16:58:35,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:35,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 16:58:36,810 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 16:58:39,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 16:58:39,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 16:58:41,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 16:58:41,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:42,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:58:42,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 16:58:43,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:58:45,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 16:58:45,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 16:58:46,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:46,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 16:58:46,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:58:49,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:49,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 16:58:49,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 16:58:52,698 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 16:58:54,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:58:55,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1339793.3333333333, ans=0.0 2023-10-03 16:58:59,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 16:59:02,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 16:59:06,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 16:59:06,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1339860.0, ans=0.125 2023-10-03 16:59:08,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:59:11,336 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.48 vs. limit=15.0 2023-10-03 16:59:12,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 16:59:12,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 16:59:12,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 16:59:13,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:59:13,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 16:59:13,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 16:59:16,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 16:59:18,042 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 16:59:19,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 16:59:20,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 16:59:20,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:59:20,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 16:59:22,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 16:59:25,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 16:59:26,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1339926.6666666667, ans=0.0 2023-10-03 16:59:27,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 16:59:30,601 INFO [train.py:1046] (3/4) Epoch 38, batch 4450, loss[loss=0.1424, simple_loss=0.2198, pruned_loss=0.03253, over 24337.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2376, pruned_loss=0.03952, over 4705767.45 frames. ], batch size: 56, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 16:59:32,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 16:59:35,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:59:36,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 16:59:42,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 16:59:44,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 16:59:47,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:59:50,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 16:59:52,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 16:59:52,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 16:59:54,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 16:59:54,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 16:59:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 16:59:55,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 16:59:55,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 16:59:58,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:00:02,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:02,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:03,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:00:03,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:00:05,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:00:09,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 17:00:09,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1340126.6666666667, ans=0.2 2023-10-03 17:00:09,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.18 vs. limit=6.0 2023-10-03 17:00:10,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 17:00:10,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 17:00:10,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:00:12,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:00:12,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1340126.6666666667, ans=0.0 2023-10-03 17:00:14,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 17:00:15,718 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.913e+02 2.140e+02 2.381e+02 4.249e+02, threshold=4.280e+02, percent-clipped=0.0 2023-10-03 17:00:19,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:00:22,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:22,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 17:00:22,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:00:22,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:00:22,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:00:22,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:00:24,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:00:27,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:00:28,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 17:00:31,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:00:32,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:00:34,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:00:36,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:00:36,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 17:00:37,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:00:41,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1340260.0, ans=0.125 2023-10-03 17:00:42,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 17:00:42,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:00:45,000 INFO [train.py:1046] (3/4) Epoch 38, batch 4500, loss[loss=0.1633, simple_loss=0.2523, pruned_loss=0.03712, over 24622.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03941, over 4717136.39 frames. ], batch size: 73, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:00:47,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:00:48,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 17:00:48,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 17:00:51,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:00:55,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:00:55,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:00:56,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:00:56,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:00:58,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:00:58,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:01:09,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:01:09,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:01:12,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:01:12,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:01:12,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1340393.3333333333, ans=0.0 2023-10-03 17:01:12,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1340393.3333333333, ans=0.125 2023-10-03 17:01:13,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:01:22,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:01:26,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:01:30,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:01:33,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:01:33,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 17:01:35,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:01:35,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:01:37,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:01:37,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:01:38,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:01:38,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 17:01:38,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:01:38,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:01:44,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:01:44,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:01:47,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:01:50,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:01:50,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:01:51,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 17:01:53,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 17:01:53,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 17:01:55,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 17:01:58,551 INFO [train.py:1046] (3/4) Epoch 38, batch 4550, loss[loss=0.1359, simple_loss=0.2128, pruned_loss=0.02948, over 24422.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2374, pruned_loss=0.03893, over 4718240.94 frames. ], batch size: 58, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:01:58,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1340660.0, ans=0.1 2023-10-03 17:01:59,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 17:02:01,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:02:03,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:02:03,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:02:06,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:02:11,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:02:12,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:02:14,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:02:14,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:02:14,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:14,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1340726.6666666667, ans=0.0 2023-10-03 17:02:15,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:02:15,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:02:18,160 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.72 vs. limit=12.0 2023-10-03 17:02:21,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:02:23,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 17:02:24,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 17:02:24,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:02:26,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 17:02:30,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 17:02:30,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:02:32,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1340793.3333333333, ans=0.0 2023-10-03 17:02:33,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 17:02:34,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:02:35,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:35,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:36,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:02:38,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 17:02:40,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:02:42,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:42,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:02:43,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1340860.0, ans=0.2 2023-10-03 17:02:44,104 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 1.957e+02 2.138e+02 2.462e+02 4.431e+02, threshold=4.276e+02, percent-clipped=1.0 2023-10-03 17:02:44,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:02:47,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 17:02:47,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 17:02:47,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:02:48,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 17:02:50,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 17:02:50,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:02:51,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:02:51,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:02:51,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:02:53,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:02:54,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:02:54,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 17:02:55,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:02:56,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 17:02:57,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 17:02:57,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:02:59,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 17:03:01,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:03:01,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:03:03,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:03:05,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:03:05,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:03:07,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:03:10,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:03:10,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1340926.6666666667, ans=0.0 2023-10-03 17:03:12,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:12,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:03:14,316 INFO [train.py:1046] (3/4) Epoch 38, batch 4600, loss[loss=0.1559, simple_loss=0.2366, pruned_loss=0.03764, over 24473.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2353, pruned_loss=0.0388, over 4704920.01 frames. ], batch size: 63, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:03:15,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:03:17,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:03:17,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:03:18,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 17:03:20,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:03:23,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:03:24,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:03:26,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:26,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1340993.3333333333, ans=0.125 2023-10-03 17:03:34,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 17:03:36,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:40,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:42,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:03:42,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:03:47,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 17:03:47,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:03:48,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:03:52,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:03:53,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:03:54,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:04:00,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 17:04:02,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:04:05,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:05,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:04:05,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1341193.3333333333, ans=0.125 2023-10-03 17:04:07,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:07,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 17:04:09,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:04:09,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 17:04:10,570 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.93 vs. limit=12.0 2023-10-03 17:04:11,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:11,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:04:13,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:14,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:04:15,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:04:16,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 17:04:16,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 17:04:17,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 17:04:17,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:04:18,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:04:18,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:04:20,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:04:21,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1341260.0, ans=0.1 2023-10-03 17:04:28,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1341326.6666666667, ans=0.0 2023-10-03 17:04:28,953 INFO [train.py:1046] (3/4) Epoch 38, batch 4650, loss[loss=0.1601, simple_loss=0.229, pruned_loss=0.04566, over 23860.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2348, pruned_loss=0.03885, over 4695415.15 frames. ], batch size: 195, lr: 2.67e-03, grad_scale: 8.0 2023-10-03 17:04:29,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:04:32,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:04:32,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:04:32,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:04:32,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:04:32,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:04:33,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:04:36,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 17:04:40,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:04:41,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 17:04:41,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:04:43,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 17:04:43,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:04:43,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1341393.3333333333, ans=0.1 2023-10-03 17:04:44,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 17:04:44,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 17:04:44,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:45,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:04:48,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:04:49,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:04:49,896 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 17:04:51,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:04:52,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 17:04:57,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:04:57,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:04:58,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 17:05:00,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:05:00,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1341460.0, ans=0.2 2023-10-03 17:05:03,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:05:06,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:05:07,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1341460.0, ans=0.125 2023-10-03 17:05:10,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:05:11,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1341460.0, ans=0.125 2023-10-03 17:05:13,571 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.819e+02 2.027e+02 2.299e+02 3.451e+02, threshold=4.055e+02, percent-clipped=0.0 2023-10-03 17:05:13,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:05:13,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:05:13,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:05:15,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 17:05:15,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 17:05:15,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1341526.6666666667, ans=0.125 2023-10-03 17:05:16,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 17:05:16,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 17:05:18,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:05:25,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:05:25,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:05:25,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 17:05:25,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:05:26,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:05:26,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:05:28,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:05:31,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:05:31,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:05:32,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:05:33,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1341593.3333333333, ans=0.0 2023-10-03 17:05:36,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:05:36,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:05:36,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:05:38,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 17:05:40,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:05:40,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 17:05:40,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1341593.3333333333, ans=0.0 2023-10-03 17:05:43,356 INFO [train.py:1046] (3/4) Epoch 38, batch 4700, loss[loss=0.1586, simple_loss=0.2344, pruned_loss=0.04138, over 23676.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2362, pruned_loss=0.03872, over 4712256.47 frames. ], batch size: 149, lr: 2.66e-03, grad_scale: 8.0 2023-10-03 17:05:46,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=1341660.0, ans=0.05 2023-10-03 17:05:47,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:05:48,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:05:48,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:05:49,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:05:51,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:05:53,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1341660.0, ans=0.2 2023-10-03 17:05:56,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1341726.6666666667, ans=0.0 2023-10-03 17:05:57,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 17:05:57,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 17:05:59,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1341726.6666666667, ans=0.0 2023-10-03 17:06:00,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:06:00,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:06:00,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1341726.6666666667, ans=0.015 2023-10-03 17:06:01,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:06:02,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1341726.6666666667, ans=0.125 2023-10-03 17:06:03,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.76 vs. limit=6.0 2023-10-03 17:06:03,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:06:09,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:06:11,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 17:06:13,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:06:19,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 17:06:19,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:06:21,382 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.45 vs. limit=15.0 2023-10-03 17:06:22,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:26,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 17:06:29,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:06:32,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:06:32,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 17:06:34,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:34,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:06:36,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:06:37,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:06:37,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 17:06:38,851 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 17:06:40,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:06:42,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:42,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:42,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 17:06:43,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:06:46,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 17:06:49,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:06:49,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1341926.6666666667, ans=0.125 2023-10-03 17:06:50,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:06:53,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:06:55,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:06:55,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 17:06:56,621 INFO [train.py:1046] (3/4) Epoch 38, batch 4750, loss[loss=0.1642, simple_loss=0.2386, pruned_loss=0.04488, over 23711.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2369, pruned_loss=0.03902, over 4722489.68 frames. ], batch size: 212, lr: 2.66e-03, grad_scale: 8.0 2023-10-03 17:06:56,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:06:59,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 17:07:00,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:07:00,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:07:02,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:07:07,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1341993.3333333333, ans=0.2 2023-10-03 17:07:09,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 17:07:12,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:07:14,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 17:07:16,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:07:20,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:07:20,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:07:20,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:07:21,486 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 17:07:21,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 17:07:28,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 17:07:29,139 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.91 vs. limit=15.0 2023-10-03 17:07:30,696 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.31 vs. limit=22.5 2023-10-03 17:07:31,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:07:34,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:07:36,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:07:36,951 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 17:07:36,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:07:38,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:07:39,790 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.879e+02 2.097e+02 2.401e+02 3.213e+02, threshold=4.194e+02, percent-clipped=0.0 2023-10-03 17:07:41,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:07:43,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 17:07:43,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 17:07:45,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:07:45,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:07:45,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:07:46,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 17:07:46,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 17:07:49,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 17:07:51,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:07:53,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:07:53,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 17:07:55,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:07:55,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:07:58,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:07:59,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:07:59,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:08:01,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:08:01,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 17:08:02,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 17:08:04,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 17:08:06,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:08:06,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:08:07,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 17:08:10,186 INFO [train.py:1046] (3/4) Epoch 38, batch 4800, loss[loss=0.1667, simple_loss=0.2477, pruned_loss=0.04289, over 24496.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2378, pruned_loss=0.03909, over 4737975.04 frames. ], batch size: 66, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:08:13,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:13,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:13,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1342326.6666666667, ans=0.125 2023-10-03 17:08:18,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:08:19,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:08:21,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:21,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 17:08:22,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:08:22,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:08:22,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:08:24,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1342393.3333333333, ans=0.125 2023-10-03 17:08:26,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:08:26,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1342393.3333333333, ans=0.0 2023-10-03 17:08:28,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:08:29,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:08:30,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:08:30,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 17:08:30,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:31,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:08:34,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:08:35,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:36,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:08:38,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:08:39,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 17:08:41,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:43,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 17:08:43,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 17:08:43,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:08:43,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:08:43,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:08:43,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:08:44,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:08:47,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:08:47,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:08:50,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:08:53,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:08:54,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:08:56,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1342526.6666666667, ans=0.125 2023-10-03 17:09:00,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 17:09:02,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:09:02,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:02,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:09:03,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:09:06,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1342526.6666666667, ans=0.0 2023-10-03 17:09:07,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:09:09,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:09:09,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:09,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:09:10,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:09:10,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:09:15,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:15,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:15,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:09:18,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 17:09:19,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 17:09:19,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:09:19,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:09:21,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:09:21,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:24,434 INFO [train.py:1046] (3/4) Epoch 38, batch 4850, loss[loss=0.1229, simple_loss=0.199, pruned_loss=0.02341, over 24281.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2378, pruned_loss=0.03914, over 4740245.01 frames. ], batch size: 56, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:09:24,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:09:31,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 17:09:33,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:37,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:09:39,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:09:40,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:09:43,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:09:44,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:09:47,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:09:47,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 17:09:52,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:09:53,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:09:53,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:09:55,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:09:55,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 17:09:57,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:09:57,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:09:58,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1342793.3333333333, ans=0.2 2023-10-03 17:10:02,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:02,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 17:10:02,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 17:10:03,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:10:04,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1342793.3333333333, ans=0.125 2023-10-03 17:10:08,524 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.910e+02 2.157e+02 2.584e+02 3.262e+02, threshold=4.314e+02, percent-clipped=0.0 2023-10-03 17:10:11,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:10:11,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1342860.0, ans=0.0 2023-10-03 17:10:11,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1342860.0, ans=0.1 2023-10-03 17:10:12,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 17:10:14,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:10:14,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:10:15,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:10:17,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 17:10:17,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:20,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 17:10:20,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:10:20,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:10:21,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 17:10:29,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:10:29,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1342926.6666666667, ans=0.125 2023-10-03 17:10:34,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:10:35,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:10:38,181 INFO [train.py:1046] (3/4) Epoch 38, batch 4900, loss[loss=0.1431, simple_loss=0.2187, pruned_loss=0.03375, over 23765.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2375, pruned_loss=0.03868, over 4733423.07 frames. ], batch size: 179, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:10:41,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 17:10:41,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:10:45,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:10:47,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:10:47,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:10:49,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 17:10:53,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 17:10:57,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 17:10:58,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 17:10:59,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:10:59,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:10:59,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:10:59,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:10:59,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:10:59,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1343060.0, ans=0.125 2023-10-03 17:11:00,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 17:11:03,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 17:11:03,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:11:03,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1343060.0, ans=0.0 2023-10-03 17:11:04,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:11:05,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:11:10,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:11:10,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:11:12,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:11:12,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 17:11:13,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:11:15,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:11:15,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 17:11:15,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 17:11:18,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 17:11:19,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:11:21,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:11:21,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:11:21,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:11:23,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 17:11:24,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:11:24,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 17:11:27,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:11:27,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:11:29,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:11:33,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 17:11:34,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:11:34,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 17:11:34,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 17:11:43,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:11:43,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:11:44,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.30 vs. limit=22.5 2023-10-03 17:11:45,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 17:11:45,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:11:45,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:11:47,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:11:49,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:11:49,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:11:51,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:11:51,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 17:11:52,780 INFO [train.py:1046] (3/4) Epoch 38, batch 4950, loss[loss=0.1523, simple_loss=0.225, pruned_loss=0.03981, over 23684.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.236, pruned_loss=0.03835, over 4725267.68 frames. ], batch size: 256, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:11:52,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:11:56,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:11:56,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:11:59,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 17:12:00,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 17:12:00,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:12:01,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 17:12:01,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:01,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:12:01,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:12:03,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:04,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:12:04,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:12:05,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:12:08,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:12:10,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:10,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:12:13,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:12:15,372 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.00 vs. limit=22.5 2023-10-03 17:12:18,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:18,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:12:19,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:20,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:22,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:12:24,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 17:12:24,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 17:12:26,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:26,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1343460.0, ans=0.0 2023-10-03 17:12:28,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:12:28,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:12:30,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:12:30,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:12:30,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1343460.0, ans=0.0 2023-10-03 17:12:32,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:12:34,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:12:37,004 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.915e+02 2.064e+02 2.439e+02 4.072e+02, threshold=4.129e+02, percent-clipped=0.0 2023-10-03 17:12:37,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:12:38,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:12:38,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:12:38,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1343526.6666666667, ans=0.125 2023-10-03 17:12:40,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:41,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 17:12:41,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:12:41,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:12:45,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:12:46,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:12:46,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:12:46,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:12:48,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:12:48,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:12:51,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:12:51,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:12:53,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:12:54,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 17:12:57,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:13:03,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 17:13:03,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 17:13:07,167 INFO [train.py:1046] (3/4) Epoch 38, batch 5000, loss[loss=0.1488, simple_loss=0.2262, pruned_loss=0.03574, over 23356.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2354, pruned_loss=0.0384, over 4722331.02 frames. ], batch size: 119, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:13:11,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:13:11,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:13:11,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 17:13:13,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1343660.0, ans=0.1 2023-10-03 17:13:14,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 17:13:17,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:13:17,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1343660.0, ans=0.125 2023-10-03 17:13:18,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 17:13:18,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:13:18,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:13:20,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 17:13:22,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:13:22,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:13:23,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 17:13:23,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:13:24,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:13:25,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 17:13:26,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 17:13:28,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:13:28,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 17:13:28,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:13:28,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1343726.6666666667, ans=0.0 2023-10-03 17:13:29,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:29,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:13:29,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 17:13:29,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 17:13:29,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1343726.6666666667, ans=0.2 2023-10-03 17:13:32,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 17:13:32,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:13:32,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:33,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 17:13:33,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:13:35,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:36,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:13:36,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1343793.3333333333, ans=0.0 2023-10-03 17:13:38,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 17:13:39,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 17:13:39,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:13:41,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:13:46,163 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 17:13:48,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:13:49,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:13:49,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:13:52,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 17:13:52,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:13:53,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:13:53,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:13:55,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 17:13:55,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1343860.0, ans=0.125 2023-10-03 17:13:57,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:13:58,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:13:59,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:14:03,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 17:14:09,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:11,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=1343926.6666666667, ans=15.0 2023-10-03 17:14:17,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:14:19,519 INFO [train.py:1046] (3/4) Epoch 38, batch 5050, loss[loss=0.1592, simple_loss=0.2518, pruned_loss=0.03331, over 24654.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2362, pruned_loss=0.0383, over 4731949.60 frames. ], batch size: 73, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:14:19,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:19,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:14:19,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:14:19,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:14:21,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:14:21,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:26,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:14:26,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 17:14:26,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:14:28,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:14:30,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:14:31,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 17:14:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:14:31,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:14:34,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:14:35,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:14:35,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:14:41,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1344060.0, ans=0.2 2023-10-03 17:14:46,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 17:14:46,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:14:47,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:14:47,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 17:14:49,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:14:50,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:14:50,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:14:50,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:14:50,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 17:14:52,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 17:14:52,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:14:55,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:14:59,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:15:00,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 17:15:02,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:15:03,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 17:15:05,041 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.875e+02 2.025e+02 2.172e+02 3.153e+02, threshold=4.049e+02, percent-clipped=0.0 2023-10-03 17:15:05,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:15:05,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:15:06,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:15:07,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:15:08,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:15:10,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:15:10,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:12,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:15:12,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:15:12,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 17:15:13,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:15:16,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:15:19,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:15:20,374 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 17:15:20,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:15:20,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:15:22,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:22,532 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 17:15:26,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:15:26,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 17:15:26,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:27,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1344260.0, ans=0.125 2023-10-03 17:15:31,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:15:31,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:15:31,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 17:15:33,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 17:15:35,077 INFO [train.py:1046] (3/4) Epoch 38, batch 5100, loss[loss=0.1617, simple_loss=0.2494, pruned_loss=0.037, over 24646.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2371, pruned_loss=0.03862, over 4729456.02 frames. ], batch size: 68, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:15:36,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:15:36,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:15:37,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:15:41,749 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 17:15:43,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:15:44,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 17:15:44,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 17:15:46,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:15:47,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:15:47,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1344393.3333333333, ans=0.125 2023-10-03 17:15:48,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:15:50,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 17:15:50,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 17:15:53,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:15:53,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:15:59,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:16:02,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 17:16:02,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:16:06,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:16:06,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 17:16:09,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:10,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:10,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 17:16:12,335 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.83 vs. limit=22.5 2023-10-03 17:16:13,171 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 17:16:14,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:14,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 17:16:14,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 17:16:17,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:16:17,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1344526.6666666667, ans=0.125 2023-10-03 17:16:19,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1344526.6666666667, ans=0.025 2023-10-03 17:16:24,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:16:27,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 17:16:27,645 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 17:16:28,967 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 17:16:29,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1344526.6666666667, ans=0.0 2023-10-03 17:16:30,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 17:16:30,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:16:31,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 17:16:35,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 17:16:38,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 17:16:39,347 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.46 vs. limit=22.5 2023-10-03 17:16:39,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:16:42,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 17:16:43,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:16:43,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 17:16:48,155 INFO [train.py:1046] (3/4) Epoch 38, batch 5150, loss[loss=0.1621, simple_loss=0.2414, pruned_loss=0.0414, over 23352.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.238, pruned_loss=0.0386, over 4746298.66 frames. ], batch size: 119, lr: 2.66e-03, grad_scale: 16.0 2023-10-03 17:16:49,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:16:49,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:16:49,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:16:49,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:16:49,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:16:51,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:16:52,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 17:16:52,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 17:16:53,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 17:16:53,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:16:53,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 17:16:55,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:16:57,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 17:16:58,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:16:59,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:17:05,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 17:17:05,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 17:17:06,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:17:06,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:17:08,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1344726.6666666667, ans=0.0 2023-10-03 17:17:09,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:17:09,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:17:09,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:17:10,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:17:10,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:17:10,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 17:17:13,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:17:13,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:17:14,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:17:16,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 17:17:16,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:17:21,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:17:23,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 17:17:29,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:17:31,772 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.952e+02 2.144e+02 2.442e+02 5.340e+02, threshold=4.289e+02, percent-clipped=2.0 2023-10-03 17:17:33,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:17:33,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:17:35,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1344860.0, ans=0.125 2023-10-03 17:17:36,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:17:38,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:17:40,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 17:17:44,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:17:45,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:17:45,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:17:47,745 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.94 vs. limit=15.0 2023-10-03 17:17:48,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:17:49,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:17:51,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 17:17:55,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:17:57,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:17:58,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:17:58,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:17:59,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:17:59,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:17:59,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:17:59,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:18:01,109 INFO [train.py:1046] (3/4) Epoch 38, batch 5200, loss[loss=0.2209, simple_loss=0.2904, pruned_loss=0.07569, over 19674.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2393, pruned_loss=0.03945, over 4723000.90 frames. ], batch size: 388, lr: 2.66e-03, grad_scale: 32.0 2023-10-03 17:18:03,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:18:05,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:18:09,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:12,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 17:18:14,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:18:15,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:18:15,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1345060.0, ans=0.07 2023-10-03 17:18:16,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:19,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:18:19,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:18:21,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 17:18:22,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:18:23,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:18:27,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 17:18:28,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:18:28,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:18:30,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 17:18:30,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 17:18:31,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 17:18:33,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:18:33,487 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 17:18:33,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:18:34,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:18:34,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:18:36,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 17:18:37,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:18:39,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:45,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 17:18:45,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 17:18:45,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 17:18:48,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1345193.3333333333, ans=0.125 2023-10-03 17:18:49,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 17:18:50,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:18:54,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:18:54,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:18:56,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 17:18:58,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:18:58,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:18:58,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:18:59,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:19:00,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:19:02,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:19:04,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:19:06,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:19:06,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:19:10,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:19:11,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 17:19:11,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:19:11,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:19:14,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:19:15,790 INFO [train.py:1046] (3/4) Epoch 38, batch 5250, loss[loss=0.1435, simple_loss=0.2252, pruned_loss=0.03092, over 24350.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2389, pruned_loss=0.03935, over 4713447.31 frames. ], batch size: 56, lr: 2.66e-03, grad_scale: 32.0 2023-10-03 17:19:15,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:19:15,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:19:17,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1345326.6666666667, ans=0.125 2023-10-03 17:19:19,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:19:20,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:19:21,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:19:23,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:19:26,303 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.18 vs. limit=22.5 2023-10-03 17:19:27,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:19:30,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:19:32,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:19:34,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:19:35,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 17:19:35,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:19:37,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:19:57,844 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.916e+02 2.104e+02 2.391e+02 3.735e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 17:20:02,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1345526.6666666667, ans=0.1 2023-10-03 17:20:09,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1345593.3333333333, ans=0.125 2023-10-03 17:20:10,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1345593.3333333333, ans=0.125 2023-10-03 17:20:13,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1345593.3333333333, ans=0.0 2023-10-03 17:20:24,125 INFO [train.py:1046] (3/4) Epoch 38, batch 5300, loss[loss=0.1348, simple_loss=0.2153, pruned_loss=0.02711, over 24606.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2372, pruned_loss=0.03877, over 4700821.85 frames. ], batch size: 60, lr: 2.66e-03, grad_scale: 32.0 2023-10-03 17:20:25,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1345660.0, ans=0.09899494936611666 2023-10-03 17:20:26,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.45 vs. limit=22.5 2023-10-03 17:20:38,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:20:38,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 17:20:38,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 17:20:38,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:20:38,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:38,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:38,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:38,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:20:38,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:20:39,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:20:39,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:20:39,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:20:39,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 17:20:39,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 17:20:39,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 17:20:39,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:20:39,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 17:20:39,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 17:20:39,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:40,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:20:40,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:20:40,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:20:40,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:20:41,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:20:41,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:20:41,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:41,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:20:41,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:20:41,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:20:41,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:41,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:20:41,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 17:20:41,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:20:42,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:20:42,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 17:20:42,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 17:20:42,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:20:42,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:20:42,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 17:20:42,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 17:20:42,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:20:43,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:20:43,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:20:43,520 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 17:20:43,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 17:20:43,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:20:43,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:20:43,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 17:20:43,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 17:20:43,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 17:20:44,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:20:50,420 INFO [train.py:1046] (3/4) Epoch 39, batch 0, loss[loss=0.144, simple_loss=0.2234, pruned_loss=0.03229, over 24337.00 frames. ], tot_loss[loss=0.144, simple_loss=0.2234, pruned_loss=0.03229, over 24337.00 frames. ], batch size: 61, lr: 2.63e-03, grad_scale: 32.0 2023-10-03 17:20:50,421 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 17:21:02,123 INFO [train.py:1078] (3/4) Epoch 39, validation: loss=0.3329, simple_loss=0.2734, pruned_loss=0.1962, over 1125622.00 frames. 2023-10-03 17:21:02,123 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 17:21:02,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1345740.0, ans=0.125 2023-10-03 17:21:05,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 17:21:06,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:21:08,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:21:12,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:21:12,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:21:12,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:13,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 17:21:15,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 17:21:16,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:17,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:19,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1345806.6666666667, ans=0.2 2023-10-03 17:21:19,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1345806.6666666667, ans=0.0 2023-10-03 17:21:20,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:21:21,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:21:22,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:21:22,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:21:24,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 17:21:24,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:21:29,528 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.50 vs. limit=15.0 2023-10-03 17:21:32,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:21:32,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:21:35,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 17:21:38,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:21:38,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:21:40,815 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.97 vs. limit=15.0 2023-10-03 17:21:41,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:21:44,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:21:44,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1345940.0, ans=0.1 2023-10-03 17:21:48,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:21:54,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 17:21:54,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1345940.0, ans=0.1 2023-10-03 17:21:58,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 17:21:58,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:21:58,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:21:59,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:21:59,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:22:02,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 17:22:04,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:22:06,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:22:09,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:22:13,186 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 17:22:13,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:22:14,582 INFO [train.py:1046] (3/4) Epoch 39, batch 50, loss[loss=0.1522, simple_loss=0.2494, pruned_loss=0.02745, over 24317.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2374, pruned_loss=0.03635, over 1075996.59 frames. ], batch size: 74, lr: 2.63e-03, grad_scale: 32.0 2023-10-03 17:22:16,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:22:18,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:22:18,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 17:22:20,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:22:20,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:22:23,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:22:24,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:22:27,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:22:27,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1346140.0, ans=0.125 2023-10-03 17:22:29,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 17:22:29,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:22:35,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:22:37,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 17:22:38,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 17:22:41,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:22:41,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:22:41,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:22:42,530 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.895e+02 2.125e+02 2.422e+02 4.892e+02, threshold=4.250e+02, percent-clipped=3.0 2023-10-03 17:22:42,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:22:44,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:22:44,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:22:44,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:22:50,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:22:51,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff2.min_abs, batch_count=1346206.6666666667, ans=0.1 2023-10-03 17:22:52,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:22:52,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:22:53,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 17:22:55,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:22:56,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:22:56,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 17:22:56,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:22:57,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 17:23:05,303 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.67 vs. limit=15.0 2023-10-03 17:23:05,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:23:07,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:23:07,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:23:09,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:23:09,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:23:11,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 17:23:11,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 17:23:13,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:23:13,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:23:15,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:23:15,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:23:15,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 17:23:15,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 17:23:17,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 17:23:17,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:18,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:23:18,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 17:23:18,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 17:23:20,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:20,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:23:22,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:23:22,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:23:24,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:23:26,986 INFO [train.py:1046] (3/4) Epoch 39, batch 100, loss[loss=0.139, simple_loss=0.2184, pruned_loss=0.02974, over 24294.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2401, pruned_loss=0.03911, over 1890652.24 frames. ], batch size: 56, lr: 2.63e-03, grad_scale: 16.0 2023-10-03 17:23:27,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:23:29,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:23:31,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 17:23:31,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:23:35,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:23:36,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:23:36,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:23:36,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:23:36,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:23:37,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 17:23:40,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:23:40,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:41,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:23:41,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:23:44,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 17:23:46,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:23:47,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:23:48,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:23:50,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:23:53,228 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 17:23:54,512 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 17:23:55,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:23:55,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:23:58,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:24:00,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:24:00,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1346540.0, ans=0.035 2023-10-03 17:24:01,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:08,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:09,891 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 17:24:11,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 17:24:11,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1346606.6666666667, ans=0.125 2023-10-03 17:24:15,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:24:16,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:24:18,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:19,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:24,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:24:25,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:24:29,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:29,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:24:30,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:30,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:24:30,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:24:32,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 17:24:32,317 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 17:24:32,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:34,402 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.10 vs. limit=15.0 2023-10-03 17:24:34,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:24:34,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:34,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:24:35,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 17:24:35,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 17:24:36,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:24:36,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:36,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:24:37,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:24:37,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1346673.3333333333, ans=0.1 2023-10-03 17:24:38,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:24:40,688 INFO [train.py:1046] (3/4) Epoch 39, batch 150, loss[loss=0.1624, simple_loss=0.2422, pruned_loss=0.04129, over 23268.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.2399, pruned_loss=0.03939, over 2523621.65 frames. ], batch size: 105, lr: 2.63e-03, grad_scale: 16.0 2023-10-03 17:24:40,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:24:44,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:24:46,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:24:46,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:24:46,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1346740.0, ans=0.2 2023-10-03 17:24:46,983 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.52 vs. limit=6.0 2023-10-03 17:24:47,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:49,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1346740.0, ans=0.0 2023-10-03 17:24:50,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:24:50,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:50,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1346740.0, ans=0.125 2023-10-03 17:24:54,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:24:55,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:24:58,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 17:24:58,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 17:24:58,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 17:25:01,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:25:01,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:25:02,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:25:04,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:25:04,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:25:04,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:25:04,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:25:04,308 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 17:25:06,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:25:06,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.80 vs. limit=12.0 2023-10-03 17:25:07,384 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.926e+02 2.144e+02 2.350e+02 3.601e+02, threshold=4.288e+02, percent-clipped=0.0 2023-10-03 17:25:13,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:25:16,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:25:18,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 17:25:20,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:25:21,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:25:22,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:25:23,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:25:24,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:25:25,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:25:25,918 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.22 vs. limit=15.0 2023-10-03 17:25:26,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:25:26,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 17:25:31,191 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.96 vs. limit=15.0 2023-10-03 17:25:31,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:25:32,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1346940.0, ans=0.95 2023-10-03 17:25:33,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:25:33,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:25:33,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:25:34,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:25:36,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 17:25:39,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:25:41,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:25:43,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:25:44,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:25:44,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 17:25:44,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:25:45,819 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 17:25:48,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:25:51,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:25:51,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:25:51,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 17:25:52,248 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.67 vs. limit=15.0 2023-10-03 17:25:52,909 INFO [train.py:1046] (3/4) Epoch 39, batch 200, loss[loss=0.1398, simple_loss=0.2268, pruned_loss=0.0264, over 24430.00 frames. ], tot_loss[loss=0.1599, simple_loss=0.2402, pruned_loss=0.0398, over 3011374.20 frames. ], batch size: 69, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:25:53,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:25:53,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:25:55,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 17:25:57,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:25:59,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:25:59,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:26:04,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:26:04,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:26:04,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:26:10,679 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.32 vs. limit=15.0 2023-10-03 17:26:26,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:26:26,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:26:27,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:26:27,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:26:29,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 17:26:29,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:26:31,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:26:33,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:26:34,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:26:34,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:26:34,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 17:26:36,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 17:26:36,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:26:42,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:26:46,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:26:54,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:26:55,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:26:58,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1347340.0, ans=0.0 2023-10-03 17:27:01,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:03,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 17:27:05,266 INFO [train.py:1046] (3/4) Epoch 39, batch 250, loss[loss=0.1624, simple_loss=0.2529, pruned_loss=0.03595, over 24353.00 frames. ], tot_loss[loss=0.1593, simple_loss=0.24, pruned_loss=0.03932, over 3398303.83 frames. ], batch size: 77, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:27:05,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:27:05,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:27:05,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:27:05,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:27:05,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 17:27:06,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:27:06,885 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 17:27:09,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:10,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:27:12,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:12,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:27:14,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:27:14,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:27:17,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:27:20,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1347473.3333333333, ans=15.0 2023-10-03 17:27:20,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:27:23,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1347473.3333333333, ans=0.125 2023-10-03 17:27:30,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:27:31,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:27:32,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:27:33,343 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.891e+02 2.094e+02 2.550e+02 3.805e+02, threshold=4.188e+02, percent-clipped=0.0 2023-10-03 17:27:33,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1347540.0, ans=0.05 2023-10-03 17:27:38,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:27:40,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:27:40,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:27:40,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1347540.0, ans=0.125 2023-10-03 17:27:41,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:27:43,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:27:43,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:27:43,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:27:47,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:27:49,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 17:27:49,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:27:50,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:27:52,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:27:52,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:27:52,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:27:52,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:27:53,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:27:55,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:27:56,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:27:58,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:28:00,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:28:03,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:28:06,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:28:06,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1347673.3333333333, ans=0.125 2023-10-03 17:28:10,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:28:13,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:28:16,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 17:28:18,534 INFO [train.py:1046] (3/4) Epoch 39, batch 300, loss[loss=0.1587, simple_loss=0.2438, pruned_loss=0.03681, over 23988.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2378, pruned_loss=0.03918, over 3694980.96 frames. ], batch size: 86, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:28:18,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:28:18,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:28:19,830 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.86 vs. limit=6.0 2023-10-03 17:28:20,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 17:28:21,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:28:23,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:28:23,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 17:28:25,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.12 vs. limit=15.0 2023-10-03 17:28:27,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1347740.0, ans=0.125 2023-10-03 17:28:28,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:28:28,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:28:31,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:28:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 17:28:34,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:28:34,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:28:35,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 17:28:35,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:28:39,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:28:43,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:28:44,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 17:28:46,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1347873.3333333333, ans=0.125 2023-10-03 17:28:47,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 17:28:49,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:28:51,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:28:52,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:28:52,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 17:28:52,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:28:54,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:28:55,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:28:55,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:28:58,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 17:28:58,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 17:28:59,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:29:01,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:03,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 17:29:05,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:29:06,838 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:29:08,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:29:10,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1347940.0, ans=0.125 2023-10-03 17:29:11,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:29:11,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 17:29:16,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:16,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:29:19,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:20,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:29:20,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 17:29:20,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:29:22,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:29:23,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 17:29:25,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:29:25,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:26,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:29:28,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:29:28,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:31,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:29:31,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 17:29:32,489 INFO [train.py:1046] (3/4) Epoch 39, batch 350, loss[loss=0.1608, simple_loss=0.2492, pruned_loss=0.03621, over 24453.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2356, pruned_loss=0.03899, over 3910279.30 frames. ], batch size: 66, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:29:34,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:39,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:29:40,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1348073.3333333333, ans=0.0 2023-10-03 17:29:41,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1348073.3333333333, ans=0.025 2023-10-03 17:29:42,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:29:44,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:47,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 17:29:49,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:29:49,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 17:29:49,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1348140.0, ans=0.0 2023-10-03 17:29:50,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.whiten.whitening_limit, batch_count=1348140.0, ans=12.0 2023-10-03 17:29:52,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:29:52,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 17:29:52,819 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=15.0 2023-10-03 17:29:54,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:29:56,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 17:29:58,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:29:59,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:29:59,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:30:00,839 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.879e+02 2.125e+02 2.432e+02 3.754e+02, threshold=4.249e+02, percent-clipped=0.0 2023-10-03 17:30:02,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:02,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:02,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:30:02,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:30:02,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:30:03,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:30:04,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:30:05,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1348206.6666666667, ans=0.125 2023-10-03 17:30:12,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:30:13,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:30:13,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:30:13,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:30:19,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 17:30:19,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:30:22,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:30:22,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:30:24,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:30:26,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 17:30:28,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:29,846 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 17:30:31,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 17:30:31,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:32,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1348340.0, ans=0.125 2023-10-03 17:30:34,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:30:34,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 17:30:35,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:36,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:30:38,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:39,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:39,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:30:41,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:30:41,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1348340.0, ans=0.125 2023-10-03 17:30:45,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:30:47,672 INFO [train.py:1046] (3/4) Epoch 39, batch 400, loss[loss=0.1494, simple_loss=0.2378, pruned_loss=0.03052, over 24482.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2363, pruned_loss=0.03891, over 4082166.62 frames. ], batch size: 63, lr: 2.62e-03, grad_scale: 32.0 2023-10-03 17:30:49,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:30:49,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 17:30:49,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:30:50,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:30:51,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:30:53,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:30:56,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:30:56,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:30:56,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1348406.6666666667, ans=0.025 2023-10-03 17:30:59,012 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.82 vs. limit=15.0 2023-10-03 17:30:59,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 17:31:00,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 17:31:00,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:31:02,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 17:31:03,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:31:07,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:31:07,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:31:07,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 17:31:07,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:31:09,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:31:09,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:31:10,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:31:13,268 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 17:31:14,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 17:31:19,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:31:20,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:31:22,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 17:31:22,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 17:31:25,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:31:28,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:31:32,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 17:31:36,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:31:38,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 17:31:39,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:31:41,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:31:41,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 17:31:45,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:31:47,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:31:48,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:31:52,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:31:52,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 17:31:54,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:31:55,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 17:31:57,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1348673.3333333333, ans=10.0 2023-10-03 17:31:58,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:31:58,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:32:01,556 INFO [train.py:1046] (3/4) Epoch 39, batch 450, loss[loss=0.1545, simple_loss=0.2353, pruned_loss=0.03684, over 23512.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2371, pruned_loss=0.03919, over 4224486.76 frames. ], batch size: 93, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:32:01,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 17:32:01,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1348740.0, ans=0.0 2023-10-03 17:32:04,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:32:04,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:32:04,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:32:06,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 17:32:06,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:32:07,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:32:07,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:32:07,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 17:32:07,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:32:07,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1348740.0, ans=0.0 2023-10-03 17:32:10,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:32:12,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:32:17,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1348806.6666666667, ans=0.2 2023-10-03 17:32:21,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:32:23,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:32:25,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 17:32:27,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 17:32:29,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.43 vs. limit=15.0 2023-10-03 17:32:29,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:32:31,361 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.906e+02 2.093e+02 2.336e+02 3.263e+02, threshold=4.186e+02, percent-clipped=0.0 2023-10-03 17:32:31,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:32:34,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:32:37,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:32:37,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:32:40,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 17:32:40,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 17:32:43,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 17:32:43,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:32:44,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:32:44,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:32:46,000 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 17:32:46,009 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 17:32:46,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1348940.0, ans=0.125 2023-10-03 17:32:47,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:32:48,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:32:50,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 17:32:53,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:32:53,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:32:55,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 17:32:56,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 17:32:58,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:33:00,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:33:00,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:33:01,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 17:33:04,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:33:05,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 17:33:05,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 17:33:06,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1349006.6666666667, ans=0.125 2023-10-03 17:33:07,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:33:10,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:33:13,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:33:14,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:33:16,015 INFO [train.py:1046] (3/4) Epoch 39, batch 500, loss[loss=0.1599, simple_loss=0.2316, pruned_loss=0.0441, over 22958.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2372, pruned_loss=0.03918, over 4340865.95 frames. ], batch size: 322, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:33:16,060 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 17:33:18,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:33:20,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:33:20,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:33:20,827 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 17:33:24,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 17:33:24,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:33:26,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:33:30,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 17:33:32,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:33:35,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:33:35,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:33:35,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:33:46,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:33:46,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:33:47,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:33:47,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:33:47,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 17:33:47,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:33:50,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:33:51,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:33:51,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:33:51,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:33:53,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 17:33:55,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1349206.6666666667, ans=0.125 2023-10-03 17:33:58,121 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 17:33:58,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:34:00,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:01,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:01,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:01,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:34:04,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 17:34:07,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:34:08,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:08,697 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:34:11,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:34:14,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:34:17,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:34:19,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 17:34:19,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:19,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:34:21,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1349340.0, ans=0.0 2023-10-03 17:34:24,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 17:34:25,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:34:28,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:31,317 INFO [train.py:1046] (3/4) Epoch 39, batch 550, loss[loss=0.1555, simple_loss=0.2329, pruned_loss=0.03908, over 23433.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2376, pruned_loss=0.03927, over 4418135.89 frames. ], batch size: 93, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:34:34,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 17:34:35,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 17:34:35,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:34:35,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 17:34:36,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:34:36,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:34:38,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:38,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1349406.6666666667, ans=0.2 2023-10-03 17:34:39,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:39,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:34:41,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:34:43,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:34:44,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.19 vs. limit=12.0 2023-10-03 17:34:45,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 17:34:45,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:34:48,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:34:48,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:48,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1349473.3333333333, ans=0.125 2023-10-03 17:34:50,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1349473.3333333333, ans=0.0 2023-10-03 17:34:51,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:34:51,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:34:56,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 17:34:58,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 17:35:01,313 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.945e+02 2.178e+02 2.458e+02 4.129e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-03 17:35:01,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:35:01,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1349540.0, ans=0.0 2023-10-03 17:35:04,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:35:05,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:35:07,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:35:09,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:09,834 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 17:35:11,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:35:12,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 17:35:15,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:35:17,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 17:35:17,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:35:18,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:18,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 17:35:20,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 17:35:21,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:35:21,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:35:21,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:35:21,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:35:25,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:35:25,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:35:27,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1349606.6666666667, ans=0.2 2023-10-03 17:35:28,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:35:28,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:30,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 17:35:30,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:35:32,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:35:33,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:35:33,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:35:34,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:35:35,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 17:35:40,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 17:35:43,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 17:35:44,810 INFO [train.py:1046] (3/4) Epoch 39, batch 600, loss[loss=0.14, simple_loss=0.223, pruned_loss=0.02846, over 24334.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.238, pruned_loss=0.03929, over 4494663.52 frames. ], batch size: 56, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:35:44,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:35:44,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:35:46,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:35:50,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:35:53,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:35:55,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 17:35:56,698 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.78 vs. limit=22.5 2023-10-03 17:35:57,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:35:59,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:36:01,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:36:03,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 17:36:03,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1349806.6666666667, ans=0.1 2023-10-03 17:36:04,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:36:07,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 17:36:13,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:36:13,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:36:15,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:36:22,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:36:22,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:36:22,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:36:22,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1349873.3333333333, ans=0.0 2023-10-03 17:36:28,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:36:31,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:36:31,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:36:31,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:36:40,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 17:36:44,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:36:44,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:36:49,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 17:36:49,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:36:51,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 17:36:53,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:36:53,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:36:53,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.40 vs. limit=22.5 2023-10-03 17:36:56,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1350006.6666666667, ans=0.0 2023-10-03 17:36:59,594 INFO [train.py:1046] (3/4) Epoch 39, batch 650, loss[loss=0.1664, simple_loss=0.2403, pruned_loss=0.04619, over 23724.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2368, pruned_loss=0.0391, over 4535560.30 frames. ], batch size: 164, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:36:59,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 17:36:59,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 17:37:02,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:37:03,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:37:07,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:08,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 17:37:08,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:37:11,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:37:11,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:37:16,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:37:20,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 17:37:21,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:37:21,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:37:27,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:37:28,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 17:37:29,367 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.906e+02 2.088e+02 2.290e+02 3.410e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 17:37:30,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:37:30,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:31,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1350206.6666666667, ans=0.0 2023-10-03 17:37:32,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:37:34,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:34,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:37:38,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:37:38,549 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 17:37:38,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:37:38,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:37:38,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1350206.6666666667, ans=0.125 2023-10-03 17:37:41,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:41,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1350206.6666666667, ans=0.125 2023-10-03 17:37:42,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:37:42,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:37:42,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:37:42,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 17:37:45,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:37:45,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:37:47,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:37:47,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:37:47,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:37:49,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 17:37:50,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 17:37:50,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:50,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:37:50,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1350273.3333333333, ans=0.125 2023-10-03 17:37:51,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:37:51,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:37:53,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:37:59,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:37:59,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:38:01,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:38:02,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:38:03,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 17:38:04,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:38:11,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:38:12,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:38:12,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:38:13,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:38:14,243 INFO [train.py:1046] (3/4) Epoch 39, batch 700, loss[loss=0.1745, simple_loss=0.2543, pruned_loss=0.04738, over 24048.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2364, pruned_loss=0.03873, over 4583507.54 frames. ], batch size: 86, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:38:14,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1350406.6666666667, ans=0.0 2023-10-03 17:38:19,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 17:38:19,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 17:38:21,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 17:38:21,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:38:24,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:38:24,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 17:38:30,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:38:32,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:38:34,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:38:35,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:38:36,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:38:38,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:38:41,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 17:38:41,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:38:43,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 17:38:43,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=1350540.0, ans=22.5 2023-10-03 17:38:47,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 17:38:49,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1350540.0, ans=0.0 2023-10-03 17:38:50,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:38:51,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:38:51,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:38:55,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:38:57,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 17:38:57,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1350606.6666666667, ans=0.125 2023-10-03 17:39:01,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:02,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:39:02,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 17:39:06,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:39:06,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1350606.6666666667, ans=0.2 2023-10-03 17:39:08,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:11,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:39:15,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:39:17,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 17:39:20,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 17:39:20,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 17:39:23,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:39:24,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:39:26,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:39:26,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1350673.3333333333, ans=0.1 2023-10-03 17:39:28,851 INFO [train.py:1046] (3/4) Epoch 39, batch 750, loss[loss=0.1604, simple_loss=0.2367, pruned_loss=0.04207, over 23859.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2356, pruned_loss=0.03858, over 4604959.47 frames. ], batch size: 179, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:39:28,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:39:28,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 17:39:32,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 17:39:34,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 17:39:34,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 17:39:35,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 17:39:36,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 17:39:36,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:39:36,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 17:39:38,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:39:39,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:39:41,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:39:43,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:43,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:39:44,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:39:47,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:39:47,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:39:48,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:39:50,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:39:51,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:39:51,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 17:39:53,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:39:53,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:39:56,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:39:57,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:39:58,635 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.002e+02 2.322e+02 2.663e+02 4.203e+02, threshold=4.644e+02, percent-clipped=1.0 2023-10-03 17:39:58,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 17:39:58,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:40:00,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 17:40:00,804 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 17:40:02,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 17:40:02,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:40:02,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 17:40:02,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1350873.3333333333, ans=0.2 2023-10-03 17:40:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:40:08,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1350873.3333333333, ans=0.07 2023-10-03 17:40:10,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:40:10,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:10,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:40:13,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:40:13,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:15,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 17:40:15,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:40:17,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 17:40:17,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:40:21,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:40:21,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 17:40:21,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:24,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:40:25,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:40:26,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:40:29,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:40:32,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 17:40:32,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:40:33,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:40:36,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:40:36,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1351006.6666666667, ans=0.125 2023-10-03 17:40:37,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:40:38,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:38,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:40:43,530 INFO [train.py:1046] (3/4) Epoch 39, batch 800, loss[loss=0.1392, simple_loss=0.2144, pruned_loss=0.03205, over 24313.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2365, pruned_loss=0.03872, over 4631802.53 frames. ], batch size: 56, lr: 2.62e-03, grad_scale: 32.0 2023-10-03 17:40:46,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:40:46,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:47,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:40:47,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:40:49,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1351073.3333333333, ans=0.2 2023-10-03 17:40:50,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:50,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:40:52,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:40:56,756 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.00 vs. limit=15.0 2023-10-03 17:40:57,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:40:57,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:40:59,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 17:41:01,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:01,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:41:02,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:41:02,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:41:02,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 17:41:02,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:41:02,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 17:41:03,193 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.48 vs. limit=12.0 2023-10-03 17:41:05,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:41:05,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1351140.0, ans=0.1 2023-10-03 17:41:07,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:41:09,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:41:09,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:41:12,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:13,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:16,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:41:17,289 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.62 vs. limit=22.5 2023-10-03 17:41:18,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:41:18,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 17:41:21,391 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 17:41:22,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 17:41:22,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:41:22,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:41:24,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:41:25,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:41:30,169 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 17:41:30,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 17:41:32,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:41:32,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1351273.3333333333, ans=0.1 2023-10-03 17:41:33,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 17:41:37,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:41:38,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1351273.3333333333, ans=0.0 2023-10-03 17:41:39,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:41:40,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 17:41:40,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:41:43,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 17:41:49,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:41:50,165 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.28 vs. limit=15.0 2023-10-03 17:41:52,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:41:52,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1351340.0, ans=0.2 2023-10-03 17:41:53,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 17:41:55,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:41:55,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:41:56,650 INFO [train.py:1046] (3/4) Epoch 39, batch 850, loss[loss=0.1583, simple_loss=0.2295, pruned_loss=0.0436, over 23805.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2375, pruned_loss=0.03913, over 4647843.10 frames. ], batch size: 195, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:41:56,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 17:41:56,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:41:58,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:41:59,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:01,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:42:02,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:42:04,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 17:42:04,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1351406.6666666667, ans=0.125 2023-10-03 17:42:05,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 17:42:05,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 17:42:08,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:42:08,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:42:09,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:09,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:42:09,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:42:14,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:42:14,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:42:14,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 17:42:18,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 17:42:22,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:42:24,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 17:42:26,738 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.921e+02 2.098e+02 2.491e+02 3.402e+02, threshold=4.195e+02, percent-clipped=0.0 2023-10-03 17:42:28,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 17:42:31,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 17:42:34,579 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 17:42:34,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:42:34,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:42:34,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 17:42:37,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:39,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:39,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 17:42:40,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:42:41,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:42:43,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:42:43,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:42:44,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:42:46,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 17:42:46,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 17:42:48,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:42:48,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:42:49,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:42:49,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:42:50,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:42:55,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:42:56,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:42:57,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:42:59,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:43:00,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:43:00,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1351673.3333333333, ans=0.125 2023-10-03 17:43:08,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 17:43:09,834 INFO [train.py:1046] (3/4) Epoch 39, batch 900, loss[loss=0.1526, simple_loss=0.2321, pruned_loss=0.0365, over 24314.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2386, pruned_loss=0.03946, over 4662007.69 frames. ], batch size: 56, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:43:09,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:43:11,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 17:43:11,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:43:11,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:43:13,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 17:43:19,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:43:20,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:43:21,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 17:43:23,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:43:25,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 17:43:26,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 17:43:27,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:43:27,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:43:27,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 17:43:29,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:43:32,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1351806.6666666667, ans=0.125 2023-10-03 17:43:39,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:43:39,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:43:39,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:43:43,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:43:43,929 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.02 vs. limit=15.0 2023-10-03 17:43:47,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 17:43:48,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.10 vs. limit=15.0 2023-10-03 17:43:50,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:43:53,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:43:54,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:43:54,830 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 17:43:54,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 17:43:56,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1351940.0, ans=0.0 2023-10-03 17:43:56,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1351940.0, ans=0.07 2023-10-03 17:44:01,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 17:44:01,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:44:01,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:44:03,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1351940.0, ans=0.0 2023-10-03 17:44:07,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=1351940.0, ans=10.0 2023-10-03 17:44:08,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:08,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:44:11,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 17:44:11,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:44:14,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 17:44:15,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:44:15,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:16,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1352006.6666666667, ans=0.0 2023-10-03 17:44:17,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:44:17,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:44:21,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 17:44:22,008 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 17:44:23,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 17:44:23,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 17:44:24,780 INFO [train.py:1046] (3/4) Epoch 39, batch 950, loss[loss=0.1665, simple_loss=0.2333, pruned_loss=0.04985, over 23698.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2386, pruned_loss=0.03942, over 4672129.53 frames. ], batch size: 164, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:44:25,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1352073.3333333333, ans=0.025 2023-10-03 17:44:26,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:30,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 17:44:33,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:44:36,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:44:36,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:44:36,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:44:39,237 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 17:44:40,121 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.74 vs. limit=12.0 2023-10-03 17:44:43,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:44:45,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:44:45,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:44:45,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:44:45,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 17:44:46,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:44:47,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:44:49,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 17:44:49,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:44:50,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1352140.0, ans=0.125 2023-10-03 17:44:53,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:44:53,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:44:53,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:44:55,211 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.980e+02 2.279e+02 2.726e+02 3.992e+02, threshold=4.557e+02, percent-clipped=0.0 2023-10-03 17:44:55,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 17:44:55,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1352206.6666666667, ans=0.125 2023-10-03 17:44:56,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 17:44:58,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:45:00,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:45:02,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:45:02,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:45:08,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 17:45:10,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 17:45:10,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 17:45:11,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:45:11,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:45:11,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:45:15,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 17:45:15,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:45:19,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:45:19,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:45:19,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 17:45:19,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:45:19,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:45:20,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 17:45:24,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:45:26,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:45:30,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:45:33,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 17:45:33,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 17:45:38,061 INFO [train.py:1046] (3/4) Epoch 39, batch 1000, loss[loss=0.14, simple_loss=0.2166, pruned_loss=0.03166, over 24309.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2376, pruned_loss=0.03939, over 4680010.57 frames. ], batch size: 56, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:45:38,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:45:38,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1352406.6666666667, ans=0.1 2023-10-03 17:45:41,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 17:45:41,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:45:43,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1352406.6666666667, ans=0.125 2023-10-03 17:45:44,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:45:47,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 17:45:47,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 17:45:51,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:45:51,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:45:53,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:45:55,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 17:46:00,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 17:46:02,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 17:46:02,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:46:04,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 17:46:07,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 17:46:07,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 17:46:09,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:46:10,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:15,804 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.28 vs. limit=10.0 2023-10-03 17:46:18,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:46:20,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:46:20,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:20,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:46:20,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 17:46:20,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:46:21,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:46:21,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:46:23,164 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 17:46:26,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 17:46:27,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 17:46:28,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 17:46:30,955 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.66 vs. limit=12.0 2023-10-03 17:46:31,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:46:33,231 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:46:37,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:37,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:46:39,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:46:40,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:46:42,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 17:46:43,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:46:43,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 17:46:43,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1352673.3333333333, ans=0.0 2023-10-03 17:46:44,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 17:46:46,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:46:46,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:46:48,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:46:50,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:46:52,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1352740.0, ans=0.05 2023-10-03 17:46:53,104 INFO [train.py:1046] (3/4) Epoch 39, batch 1050, loss[loss=0.1429, simple_loss=0.1948, pruned_loss=0.04554, over 19161.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2365, pruned_loss=0.03918, over 4687869.56 frames. ], batch size: 388, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:46:53,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:46:55,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:46:57,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:46:58,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:46:58,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:47:00,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:47:00,874 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=6.70 vs. limit=15.0 2023-10-03 17:47:01,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 17:47:04,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 17:47:06,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:47:07,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:47:08,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:47:10,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:47:10,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 17:47:12,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:47:13,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 17:47:13,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:47:13,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 17:47:13,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 17:47:15,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1352806.6666666667, ans=0.1 2023-10-03 17:47:20,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:47:21,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:47:21,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:47:24,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 17:47:24,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 17:47:25,572 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.919e+02 2.049e+02 2.517e+02 3.582e+02, threshold=4.099e+02, percent-clipped=0.0 2023-10-03 17:47:25,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:47:27,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 17:47:28,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 17:47:30,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:47:31,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1352873.3333333333, ans=0.1 2023-10-03 17:47:34,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 17:47:35,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 17:47:37,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:47:38,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:47:41,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:47:46,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.93 vs. limit=15.0 2023-10-03 17:47:46,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 17:47:48,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 17:47:48,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 17:47:48,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:47:50,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:47:51,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 17:47:55,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:47:57,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:47:57,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:47:57,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:47:57,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:00,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1353006.6666666667, ans=0.5 2023-10-03 17:48:01,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:01,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 17:48:03,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 17:48:03,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1353006.6666666667, ans=0.125 2023-10-03 17:48:04,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 17:48:04,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 17:48:04,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:48:06,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1353073.3333333333, ans=0.05 2023-10-03 17:48:07,647 INFO [train.py:1046] (3/4) Epoch 39, batch 1100, loss[loss=0.1513, simple_loss=0.2363, pruned_loss=0.03321, over 24488.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2368, pruned_loss=0.03887, over 4699685.78 frames. ], batch size: 63, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:48:09,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:48:09,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1353073.3333333333, ans=0.0 2023-10-03 17:48:14,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:48:15,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1353073.3333333333, ans=0.125 2023-10-03 17:48:18,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:48:18,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1353073.3333333333, ans=0.125 2023-10-03 17:48:18,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1353073.3333333333, ans=0.1 2023-10-03 17:48:20,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 17:48:21,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:48:21,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 17:48:21,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1353140.0, ans=0.125 2023-10-03 17:48:23,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:48:24,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 17:48:27,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:48:31,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:48:32,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 17:48:33,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 17:48:34,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:48:34,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:48:36,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:48:38,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 17:48:42,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:48:44,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1353206.6666666667, ans=0.0 2023-10-03 17:48:44,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1353206.6666666667, ans=0.125 2023-10-03 17:48:45,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 17:48:45,747 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 17:48:47,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:48,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:49,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1353206.6666666667, ans=0.07 2023-10-03 17:48:50,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:48:50,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:48:50,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1353273.3333333333, ans=0.125 2023-10-03 17:48:52,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 17:48:53,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:48:53,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 17:48:53,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:48:53,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:48:53,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 17:48:59,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 17:48:59,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 17:49:00,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:49:04,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1353273.3333333333, ans=0.125 2023-10-03 17:49:04,438 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-10-03 17:49:05,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:49:08,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 17:49:08,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 17:49:10,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:49:12,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:49:12,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:49:14,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 17:49:16,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:49:16,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:49:18,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 17:49:19,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:49:20,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 17:49:22,625 INFO [train.py:1046] (3/4) Epoch 39, batch 1150, loss[loss=0.1668, simple_loss=0.2455, pruned_loss=0.04401, over 23354.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2371, pruned_loss=0.03925, over 4690968.82 frames. ], batch size: 93, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:49:22,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:49:22,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:49:22,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:49:25,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1353406.6666666667, ans=0.125 2023-10-03 17:49:29,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:49:31,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:49:32,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:49:32,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:49:32,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 17:49:33,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:49:34,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1353406.6666666667, ans=0.125 2023-10-03 17:49:35,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 17:49:36,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:49:36,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:49:41,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1353473.3333333333, ans=0.125 2023-10-03 17:49:42,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 17:49:44,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:49:44,495 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.10 vs. limit=12.0 2023-10-03 17:49:47,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1353473.3333333333, ans=0.1 2023-10-03 17:49:48,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:49:49,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:49:49,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 17:49:49,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:49:50,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:49:55,047 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.879e+02 2.025e+02 2.261e+02 4.283e+02, threshold=4.051e+02, percent-clipped=1.0 2023-10-03 17:49:55,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 17:49:56,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:49:57,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:50:03,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:50:10,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:50:10,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 17:50:11,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:11,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:18,341 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 17:50:18,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1353606.6666666667, ans=0.04949747468305833 2023-10-03 17:50:21,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:27,756 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 17:50:31,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:50:33,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:50:33,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 17:50:34,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:50:35,216 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.30 vs. limit=22.5 2023-10-03 17:50:35,964 INFO [train.py:1046] (3/4) Epoch 39, batch 1200, loss[loss=0.1553, simple_loss=0.2311, pruned_loss=0.03977, over 23673.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2369, pruned_loss=0.039, over 4705412.31 frames. ], batch size: 135, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:50:37,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1353740.0, ans=0.125 2023-10-03 17:50:38,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:50:40,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1353740.0, ans=0.1 2023-10-03 17:50:41,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 17:50:41,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:50:43,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:50:43,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:50:43,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:50:44,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1353740.0, ans=0.125 2023-10-03 17:50:45,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:50:48,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:50:49,221 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:50:50,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:50:50,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:50:54,010 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 17:50:57,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 17:50:57,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1353806.6666666667, ans=0.0 2023-10-03 17:51:00,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:51:02,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:51:04,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:51:04,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:51:04,388 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 17:51:05,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:51:12,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 17:51:12,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:51:13,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 17:51:15,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:51:17,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.08 vs. limit=15.0 2023-10-03 17:51:18,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 17:51:23,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 17:51:23,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:51:25,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:51:25,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:51:26,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 17:51:28,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:51:28,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:51:28,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:51:28,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 17:51:29,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:51:29,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:51:29,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 17:51:32,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:51:32,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:51:35,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:51:37,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 17:51:39,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 17:51:42,375 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 17:51:43,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:51:46,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:51:48,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:51:49,622 INFO [train.py:1046] (3/4) Epoch 39, batch 1250, loss[loss=0.1588, simple_loss=0.2448, pruned_loss=0.03638, over 23797.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2375, pruned_loss=0.03918, over 4705246.58 frames. ], batch size: 85, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:51:49,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:51:55,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 17:51:58,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1354073.3333333333, ans=0.125 2023-10-03 17:51:59,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:51:59,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:52:00,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 17:52:03,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:52:04,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:52:09,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 17:52:09,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:52:10,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:52:10,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:52:12,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 17:52:12,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1354140.0, ans=0.125 2023-10-03 17:52:16,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 17:52:16,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:52:16,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:52:16,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1354140.0, ans=0.0 2023-10-03 17:52:17,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:52:17,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:19,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.38 vs. limit=6.0 2023-10-03 17:52:21,937 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.935e+02 2.085e+02 2.357e+02 3.026e+02, threshold=4.170e+02, percent-clipped=0.0 2023-10-03 17:52:23,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:52:23,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 17:52:28,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 17:52:29,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 17:52:31,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1354206.6666666667, ans=0.125 2023-10-03 17:52:31,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1354206.6666666667, ans=0.05 2023-10-03 17:52:32,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:52:34,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 17:52:34,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:52:34,100 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 17:52:34,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:34,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:34,425 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:52:36,131 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.45 vs. limit=12.0 2023-10-03 17:52:38,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:52:39,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:52:40,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:52:41,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 17:52:41,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 17:52:42,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 17:52:46,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:52:47,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 17:52:47,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:52:49,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 17:52:49,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:52:53,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 17:52:53,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 17:52:53,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 17:52:53,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 17:52:53,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1354340.0, ans=0.0 2023-10-03 17:52:54,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:52:54,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 17:52:59,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:53:00,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:53:01,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:53:03,235 INFO [train.py:1046] (3/4) Epoch 39, batch 1300, loss[loss=0.1591, simple_loss=0.2406, pruned_loss=0.03883, over 24439.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2382, pruned_loss=0.03952, over 4704507.65 frames. ], batch size: 77, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 17:53:04,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 17:53:06,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1354406.6666666667, ans=0.0 2023-10-03 17:53:07,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:53:07,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 17:53:11,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:53:12,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 17:53:14,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:53:15,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:53:17,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 17:53:17,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 17:53:17,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1354473.3333333333, ans=0.0 2023-10-03 17:53:21,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:53:22,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:53:23,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 17:53:27,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:53:29,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:53:31,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:53:32,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:53:34,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:53:34,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1354540.0, ans=0.125 2023-10-03 17:53:35,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 17:53:35,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1354540.0, ans=0.0 2023-10-03 17:53:36,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 17:53:36,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 17:53:40,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:53:40,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 17:53:42,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 17:53:43,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 17:53:44,433 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.61 vs. limit=6.0 2023-10-03 17:53:45,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:53:47,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:53:48,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1354606.6666666667, ans=0.125 2023-10-03 17:53:49,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 17:53:49,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:53:49,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 17:53:52,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:53:55,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:53:57,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:53:58,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 17:54:01,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 17:54:01,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 17:54:05,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:54:06,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1354673.3333333333, ans=0.125 2023-10-03 17:54:07,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 17:54:09,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:54:10,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1354673.3333333333, ans=0.125 2023-10-03 17:54:14,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1354740.0, ans=0.125 2023-10-03 17:54:15,388 INFO [train.py:1046] (3/4) Epoch 39, batch 1350, loss[loss=0.1613, simple_loss=0.2392, pruned_loss=0.04171, over 23466.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2377, pruned_loss=0.03969, over 4699382.95 frames. ], batch size: 119, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:54:15,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 17:54:17,512 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.51 vs. limit=6.0 2023-10-03 17:54:19,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:54:20,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:54:23,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:54:25,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:54:27,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:54:27,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:54:32,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 17:54:32,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 17:54:33,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:54:33,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:54:34,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1354806.6666666667, ans=0.125 2023-10-03 17:54:34,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1354806.6666666667, ans=0.125 2023-10-03 17:54:36,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 17:54:37,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:54:39,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:54:39,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 17:54:42,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 17:54:42,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1354806.6666666667, ans=0.1 2023-10-03 17:54:43,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 17:54:46,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:54:46,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 17:54:48,894 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.867e+02 2.085e+02 2.496e+02 4.197e+02, threshold=4.170e+02, percent-clipped=1.0 2023-10-03 17:54:54,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.75 vs. limit=15.0 2023-10-03 17:54:57,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:55:04,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1354940.0, ans=0.0 2023-10-03 17:55:05,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:55:05,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:55:05,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 17:55:08,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:55:09,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 17:55:09,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 17:55:10,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:55:12,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:55:14,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 17:55:15,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 17:55:17,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1355006.6666666667, ans=0.0 2023-10-03 17:55:20,326 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.50 vs. limit=6.0 2023-10-03 17:55:21,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 17:55:23,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 17:55:29,760 INFO [train.py:1046] (3/4) Epoch 39, batch 1400, loss[loss=0.1475, simple_loss=0.2249, pruned_loss=0.03502, over 23508.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2366, pruned_loss=0.03929, over 4696066.33 frames. ], batch size: 134, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:55:29,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 17:55:31,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:55:34,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1355073.3333333333, ans=0.125 2023-10-03 17:55:35,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:55:35,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:55:36,071 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.51 vs. limit=12.0 2023-10-03 17:55:37,334 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.37 vs. limit=15.0 2023-10-03 17:55:39,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 17:55:41,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 17:55:49,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:55:51,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:55:54,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:55:54,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 17:55:57,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 17:55:59,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 17:56:07,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:09,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:11,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 17:56:12,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1355273.3333333333, ans=0.125 2023-10-03 17:56:13,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:56:14,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 17:56:14,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:56:15,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:56:17,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 17:56:17,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:56:18,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:56:20,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 17:56:20,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:56:24,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:27,534 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=5.64 vs. limit=12.0 2023-10-03 17:56:29,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:56:34,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1355340.0, ans=0.125 2023-10-03 17:56:36,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 17:56:37,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 17:56:38,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 17:56:38,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1355340.0, ans=0.2 2023-10-03 17:56:41,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 17:56:42,695 INFO [train.py:1046] (3/4) Epoch 39, batch 1450, loss[loss=0.1625, simple_loss=0.2416, pruned_loss=0.04174, over 23315.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2351, pruned_loss=0.03905, over 4675588.61 frames. ], batch size: 93, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:56:42,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:56:42,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:56:46,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 17:56:49,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:56:49,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:56:49,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 17:56:55,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:56:57,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 17:56:58,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:56:58,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 17:56:58,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1355473.3333333333, ans=0.125 2023-10-03 17:57:00,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 17:57:01,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 17:57:01,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:03,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:03,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 17:57:05,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:57:05,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 17:57:05,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 17:57:05,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:06,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:57:08,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:09,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1355473.3333333333, ans=0.125 2023-10-03 17:57:10,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:13,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 17:57:14,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:57:16,183 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.879e+02 2.005e+02 2.239e+02 6.029e+02, threshold=4.010e+02, percent-clipped=2.0 2023-10-03 17:57:16,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:57:17,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:19,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 17:57:19,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 17:57:19,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:57:19,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:57:21,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 17:57:26,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:57:28,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1355606.6666666667, ans=0.0 2023-10-03 17:57:29,207 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 17:57:30,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:57:32,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 17:57:33,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:57:35,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 17:57:38,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:57:38,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 17:57:40,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 17:57:41,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:57:45,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:57:46,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:57:48,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 17:57:49,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 17:57:49,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 17:57:51,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:57:53,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 17:57:55,817 INFO [train.py:1046] (3/4) Epoch 39, batch 1500, loss[loss=0.1723, simple_loss=0.2329, pruned_loss=0.05587, over 19204.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2354, pruned_loss=0.03924, over 4682713.22 frames. ], batch size: 388, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:57:56,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1355740.0, ans=0.125 2023-10-03 17:58:01,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 17:58:01,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 17:58:01,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 17:58:05,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:58:05,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:58:06,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 17:58:08,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 17:58:08,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 17:58:09,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 17:58:09,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:58:10,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.75 vs. limit=10.0 2023-10-03 17:58:11,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 17:58:13,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:58:14,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:58:18,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:58:18,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 17:58:19,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:58:19,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 17:58:21,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:58:24,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 17:58:28,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 17:58:29,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:58:29,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 17:58:33,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 17:58:34,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:58:35,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 17:58:35,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:58:35,836 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 17:58:39,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 17:58:39,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:58:39,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:58:39,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 17:58:40,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 17:58:46,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 17:58:46,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 17:58:46,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.25 vs. limit=22.5 2023-10-03 17:58:50,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 17:58:51,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 17:58:55,553 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 17:58:56,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:58:56,911 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 17:58:58,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:58:59,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:58:59,799 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 17:58:59,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 17:59:03,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 17:59:05,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:59:10,662 INFO [train.py:1046] (3/4) Epoch 39, batch 1550, loss[loss=0.1619, simple_loss=0.2464, pruned_loss=0.03869, over 23154.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2369, pruned_loss=0.0394, over 4690524.83 frames. ], batch size: 93, lr: 2.62e-03, grad_scale: 8.0 2023-10-03 17:59:10,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:59:10,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:59:10,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 17:59:10,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 17:59:12,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 17:59:13,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 17:59:15,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 17:59:15,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 17:59:15,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 17:59:16,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 17:59:17,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:59:19,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:59:19,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:59:19,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 17:59:20,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:59:21,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 17:59:22,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1356073.3333333333, ans=0.125 2023-10-03 17:59:25,482 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 17:59:25,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:59:26,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 17:59:26,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 17:59:28,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 17:59:29,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 17:59:30,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 17:59:30,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 17:59:32,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 17:59:32,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 17:59:33,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 17:59:35,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:59:38,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 17:59:41,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 17:59:41,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 17:59:44,433 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.873e+02 2.058e+02 2.310e+02 4.421e+02, threshold=4.116e+02, percent-clipped=1.0 2023-10-03 17:59:47,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 17:59:51,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 17:59:51,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 17:59:51,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 17:59:52,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 17:59:57,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 17:59:58,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:00:00,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:00:02,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:00:03,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:00:03,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 18:00:03,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=1356273.3333333333, ans=15.0 2023-10-03 18:00:04,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:00:07,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:00:07,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:00:08,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 18:00:08,995 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 18:00:11,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:00:17,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 18:00:21,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:00:23,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:00:23,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 18:00:24,603 INFO [train.py:1046] (3/4) Epoch 39, batch 1600, loss[loss=0.1713, simple_loss=0.241, pruned_loss=0.05076, over 23743.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2378, pruned_loss=0.03979, over 4687244.33 frames. ], batch size: 179, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 18:00:24,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:00:26,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:00:26,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:00:26,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:00:26,668 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.47 vs. limit=15.0 2023-10-03 18:00:27,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:00:28,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:00:30,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 18:00:31,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 18:00:31,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 18:00:33,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1356406.6666666667, ans=0.125 2023-10-03 18:00:34,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:00:36,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 18:00:36,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:00:39,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:00:42,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:00:45,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 18:00:49,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:00:49,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 18:00:50,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:00:51,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 18:00:54,108 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.70 vs. limit=15.0 2023-10-03 18:00:57,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 18:01:04,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:01:05,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 18:01:05,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1356540.0, ans=0.0 2023-10-03 18:01:07,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:01:07,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:01:07,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:01:10,943 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=15.0 2023-10-03 18:01:11,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 18:01:16,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:01:17,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:01:19,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:19,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:20,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:01:22,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:01:22,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:01:24,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:01:30,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:32,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:01:33,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 18:01:33,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:01:35,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 18:01:37,827 INFO [train.py:1046] (3/4) Epoch 39, batch 1650, loss[loss=0.1689, simple_loss=0.2395, pruned_loss=0.04919, over 23802.00 frames. ], tot_loss[loss=0.1594, simple_loss=0.2383, pruned_loss=0.04027, over 4676451.40 frames. ], batch size: 212, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 18:01:41,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:01:42,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:01:42,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:01:42,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 18:01:42,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 18:01:42,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 18:01:42,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 18:01:42,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1356740.0, ans=0.0 2023-10-03 18:01:47,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:01:49,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:01:49,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:01:49,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:01:52,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:01:52,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 18:01:54,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:01:54,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:01:54,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:01:54,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:01:56,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 18:01:56,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 18:01:58,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1356806.6666666667, ans=0.125 2023-10-03 18:02:02,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:02:03,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:02:05,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1356806.6666666667, ans=0.0 2023-10-03 18:02:06,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1356873.3333333333, ans=0.125 2023-10-03 18:02:11,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 18:02:11,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1356873.3333333333, ans=0.0 2023-10-03 18:02:12,616 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.940e+02 2.128e+02 2.413e+02 3.924e+02, threshold=4.257e+02, percent-clipped=0.0 2023-10-03 18:02:12,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:14,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 18:02:16,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1356873.3333333333, ans=0.09899494936611666 2023-10-03 18:02:18,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:02:21,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:02:21,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:02:22,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:02:22,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:02:22,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:25,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:02:25,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:27,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:02:27,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:02:28,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:02:28,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:02:31,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:02:33,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 18:02:35,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:02:35,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 18:02:36,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 18:02:36,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 18:02:36,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:02:37,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:02:37,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:02:38,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:02:38,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 18:02:43,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:02:44,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:02:45,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:02:47,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 18:02:51,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:02:51,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:02:51,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 18:02:52,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:02:52,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:02:52,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:02:53,993 INFO [train.py:1046] (3/4) Epoch 39, batch 1700, loss[loss=0.1621, simple_loss=0.2383, pruned_loss=0.04296, over 23328.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2382, pruned_loss=0.0398, over 4697102.55 frames. ], batch size: 119, lr: 2.62e-03, grad_scale: 16.0 2023-10-03 18:02:55,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:02:55,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:02:55,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 18:02:56,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:02:58,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1357073.3333333333, ans=0.1 2023-10-03 18:03:02,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.16 vs. limit=22.5 2023-10-03 18:03:03,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:03:07,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:03:14,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:03:14,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:03:14,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:03:15,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:03:17,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 18:03:19,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:03:20,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:22,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:03:23,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:03:24,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 18:03:26,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 18:03:27,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:28,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 18:03:30,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:03:32,349 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.60 vs. limit=15.0 2023-10-03 18:03:37,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:03:38,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:03:38,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:03:40,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:03:40,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 18:03:42,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:03:44,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:44,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 18:03:46,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:03:46,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:03:47,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:03:47,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:03:49,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1357273.3333333333, ans=0.2 2023-10-03 18:03:51,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:03:51,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:03:52,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:03:52,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:03:54,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:03:58,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:03:59,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 18:03:59,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:04:02,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:04:02,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 18:04:06,629 INFO [train.py:1046] (3/4) Epoch 39, batch 1750, loss[loss=0.1454, simple_loss=0.2166, pruned_loss=0.03713, over 23606.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2369, pruned_loss=0.03956, over 4688073.32 frames. ], batch size: 256, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:04:09,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:12,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:04:12,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:04:14,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 18:04:14,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:04:17,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:04:17,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:20,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 18:04:23,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.06 vs. limit=15.0 2023-10-03 18:04:24,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:04:26,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 18:04:26,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:04:26,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:04:29,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 18:04:31,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 18:04:32,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:04:32,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 18:04:32,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1357473.3333333333, ans=0.0 2023-10-03 18:04:39,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:04:41,282 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.897e+02 2.069e+02 2.382e+02 4.230e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 18:04:41,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1357540.0, ans=0.0 2023-10-03 18:04:42,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1357540.0, ans=0.0 2023-10-03 18:04:44,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:04:44,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:04:47,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1357540.0, ans=0.0 2023-10-03 18:04:48,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:48,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:04:50,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:04:52,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:04:52,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1357606.6666666667, ans=0.0 2023-10-03 18:04:53,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:04:53,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:04:55,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 18:04:57,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:05:00,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 18:05:01,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:05:03,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:05:03,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:05:06,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:05:07,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 18:05:07,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:05:10,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:05:10,820 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.36 vs. limit=15.0 2023-10-03 18:05:13,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:05:16,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:05:18,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:05:18,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 18:05:18,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:05:21,284 INFO [train.py:1046] (3/4) Epoch 39, batch 1800, loss[loss=0.1771, simple_loss=0.2497, pruned_loss=0.05228, over 23804.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2361, pruned_loss=0.0388, over 4708922.82 frames. ], batch size: 164, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:05:21,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:05:21,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:21,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:05:21,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:05:21,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:05:23,881 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.38 vs. limit=10.0 2023-10-03 18:05:26,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:05:27,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:05:28,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:05:31,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:05:33,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:05:34,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:05:37,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:05:38,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:40,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:40,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:05:41,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:05:42,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 18:05:43,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:05:48,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:05:50,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 18:05:53,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 18:05:53,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 18:05:53,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:05:55,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:05:55,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:05:57,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:06:04,126 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 18:06:05,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:06:06,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:06:07,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 18:06:08,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 18:06:08,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:06:09,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:06:11,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:06:15,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 18:06:22,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:06:22,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1358006.6666666667, ans=0.125 2023-10-03 18:06:22,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1358006.6666666667, ans=0.0 2023-10-03 18:06:24,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 18:06:24,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:06:24,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:06:25,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:06:25,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 18:06:27,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:06:28,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:06:31,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 18:06:31,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:06:34,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:06:34,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:06:34,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:06:35,304 INFO [train.py:1046] (3/4) Epoch 39, batch 1850, loss[loss=0.1492, simple_loss=0.2307, pruned_loss=0.03388, over 24451.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2366, pruned_loss=0.0388, over 4709329.01 frames. ], batch size: 63, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:06:36,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:06:38,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:06:39,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1358073.3333333333, ans=0.0 2023-10-03 18:06:40,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:06:40,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:06:43,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:06:43,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:06:49,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:06:51,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 18:06:54,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 18:06:57,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 18:07:00,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:07:00,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 18:07:00,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 18:07:08,965 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.977e+02 2.198e+02 2.562e+02 3.885e+02, threshold=4.397e+02, percent-clipped=0.0 2023-10-03 18:07:11,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:07:13,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 18:07:16,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:07:16,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:07:19,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 18:07:20,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:20,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:07:23,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:07:24,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:07:25,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:07:28,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:07:30,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:30,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 18:07:30,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:07:31,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:07:33,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:07:36,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 18:07:37,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:07:40,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:07:40,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:07:40,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 18:07:40,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 18:07:43,165 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 18:07:43,244 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 18:07:44,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:07:44,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:07:44,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:07:44,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:46,031 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 18:07:46,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:07:47,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:47,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:07:48,762 INFO [train.py:1046] (3/4) Epoch 39, batch 1900, loss[loss=0.1541, simple_loss=0.2353, pruned_loss=0.03646, over 23263.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2375, pruned_loss=0.0386, over 4726098.69 frames. ], batch size: 119, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:07:48,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:07:50,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:07:51,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 18:07:52,177 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.40 vs. limit=22.5 2023-10-03 18:07:55,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:07:55,291 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 18:07:55,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:07:55,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:08:03,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:08:06,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:08:07,404 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 18:08:08,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 18:08:09,526 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.79 vs. limit=22.5 2023-10-03 18:08:10,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:08:10,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:08:10,268 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 18:08:10,293 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 18:08:13,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 18:08:14,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:08:17,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 18:08:18,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 18:08:22,235 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.92 vs. limit=12.0 2023-10-03 18:08:27,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1358540.0, ans=0.0 2023-10-03 18:08:28,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 18:08:30,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 18:08:30,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:08:32,580 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 18:08:32,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 18:08:33,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 18:08:35,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 18:08:35,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:08:39,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 18:08:41,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:08:44,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:08:44,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 18:08:46,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:08:49,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 18:08:49,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:08:50,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1358673.3333333333, ans=0.125 2023-10-03 18:08:51,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1358673.3333333333, ans=0.0 2023-10-03 18:08:55,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:08:55,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:08:55,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:08:56,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:08:58,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:08:59,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:09:00,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:09:02,734 INFO [train.py:1046] (3/4) Epoch 39, batch 1950, loss[loss=0.1738, simple_loss=0.2518, pruned_loss=0.04791, over 20150.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2385, pruned_loss=0.03895, over 4716841.34 frames. ], batch size: 44, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:09:04,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:09:04,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:09:06,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:09:06,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:09:06,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:09:07,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:09:07,937 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:09:11,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:09:13,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:09:14,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:14,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:09:16,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 18:09:16,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 18:09:17,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:18,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:20,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:09:21,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:09:21,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:23,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:09:26,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:09:26,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:09:26,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:09:28,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:30,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:34,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:09:34,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:09:34,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:09:34,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 18:09:35,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:09:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:09:35,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:09:37,505 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 1.953e+02 2.190e+02 2.399e+02 3.415e+02, threshold=4.380e+02, percent-clipped=0.0 2023-10-03 18:09:39,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:09:41,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:09:45,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:09:48,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:09:48,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:09:48,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 18:09:49,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:09:53,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:09:54,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:09:55,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:10:04,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:05,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:08,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:10,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:10:12,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:10:12,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:10:14,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 18:10:14,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:10:15,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:10:15,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 18:10:17,165 INFO [train.py:1046] (3/4) Epoch 39, batch 2000, loss[loss=0.1608, simple_loss=0.2482, pruned_loss=0.0367, over 24528.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2388, pruned_loss=0.03878, over 4723387.89 frames. ], batch size: 71, lr: 2.61e-03, grad_scale: 32.0 2023-10-03 18:10:17,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:10:18,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1359073.3333333333, ans=0.125 2023-10-03 18:10:20,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:10:20,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:10:21,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:10:22,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:10:24,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:10:28,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 18:10:28,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:10:32,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:10:33,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 18:10:35,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:10:35,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:10:37,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1359140.0, ans=0.1 2023-10-03 18:10:37,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=1359140.0, ans=15.0 2023-10-03 18:10:38,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:10:39,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 18:10:41,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:43,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:43,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:45,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 18:10:46,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:10:48,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 18:10:48,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:10:49,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1359206.6666666667, ans=0.125 2023-10-03 18:10:50,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:10:52,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 18:10:52,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:10:52,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:10:53,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:10:53,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 18:10:58,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 18:10:58,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:10:58,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:01,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:02,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:11:02,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:11:04,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:11:06,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:11:08,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:08,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:11:08,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:08,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:10,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:11:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 18:11:15,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:11:16,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:20,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:20,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:11:25,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:26,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:11:26,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:28,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:11:28,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:11:29,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:29,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:30,758 INFO [train.py:1046] (3/4) Epoch 39, batch 2050, loss[loss=0.1364, simple_loss=0.1956, pruned_loss=0.03857, over 22732.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2373, pruned_loss=0.03902, over 4718487.52 frames. ], batch size: 322, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:11:34,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:11:35,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:41,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:11:43,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:11:44,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:11:44,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:11:45,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 18:11:45,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:11:48,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:11:48,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:11:57,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:11:57,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:11:58,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 18:11:59,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:12:01,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 18:12:01,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:12:06,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:12:07,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:12:08,629 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.929e+02 2.107e+02 2.301e+02 3.275e+02, threshold=4.215e+02, percent-clipped=0.0 2023-10-03 18:12:08,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:12:10,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:12:12,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:12:12,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:12:12,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:12:14,663 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.21 vs. limit=15.0 2023-10-03 18:12:16,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:12:18,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:12:21,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:12:21,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1359606.6666666667, ans=0.125 2023-10-03 18:12:22,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:12:25,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:12:30,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1359673.3333333333, ans=0.1 2023-10-03 18:12:31,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:12:32,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 18:12:38,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:12:39,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:12:41,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:12:42,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 18:12:45,289 INFO [train.py:1046] (3/4) Epoch 39, batch 2100, loss[loss=0.1598, simple_loss=0.2358, pruned_loss=0.04189, over 23810.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2367, pruned_loss=0.03866, over 4724171.04 frames. ], batch size: 195, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:12:46,604 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 18:12:46,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:12:46,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:12:46,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:12:48,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:12:48,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 18:12:48,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 18:12:50,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:12:53,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:12:54,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:12:57,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:12:57,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:12:57,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 18:12:57,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1359740.0, ans=0.0 2023-10-03 18:12:58,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:12:59,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 18:12:59,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 18:13:01,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:13:02,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:13:02,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 18:13:02,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 18:13:04,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1359806.6666666667, ans=10.0 2023-10-03 18:13:09,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 18:13:09,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:13:11,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:13:11,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:13:14,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:13:15,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 18:13:15,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:13:16,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 18:13:16,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1359873.3333333333, ans=0.2 2023-10-03 18:13:18,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 18:13:19,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:19,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 18:13:19,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 18:13:20,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 18:13:22,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:13:23,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:13:26,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:13:28,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:13:29,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:13:31,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:13:31,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 18:13:31,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:31,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:13:32,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:13:32,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 18:13:34,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 18:13:35,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 18:13:36,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.30 vs. limit=15.0 2023-10-03 18:13:41,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:13:46,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:13:46,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 18:13:52,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:53,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:13:54,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:13:54,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:13:54,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 18:13:55,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:13:57,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:13:57,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:13:59,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:13:59,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:00,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 18:14:02,217 INFO [train.py:1046] (3/4) Epoch 39, batch 2150, loss[loss=0.1447, simple_loss=0.2228, pruned_loss=0.03329, over 23693.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2362, pruned_loss=0.03856, over 4724181.11 frames. ], batch size: 149, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:14:02,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 18:14:02,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:14:05,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:14:05,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:14:05,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:14:06,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:14:10,664 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.80 vs. limit=15.0 2023-10-03 18:14:11,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 18:14:13,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:14:13,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:16,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:14:16,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:16,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:14:18,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:19,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:14:19,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:14:20,084 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.73 vs. limit=15.0 2023-10-03 18:14:20,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.66 vs. limit=22.5 2023-10-03 18:14:21,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1360140.0, ans=0.0 2023-10-03 18:14:23,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:23,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 18:14:28,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:14:29,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:14:31,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:31,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:14:31,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:32,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:14:32,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:14:32,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:14:33,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:14:34,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 18:14:35,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:14:36,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1360206.6666666667, ans=0.125 2023-10-03 18:14:37,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:37,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:14:39,434 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.915e+02 2.126e+02 2.508e+02 4.642e+02, threshold=4.251e+02, percent-clipped=1.0 2023-10-03 18:14:39,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:14:42,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:14:44,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:14:46,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:14:46,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1360273.3333333333, ans=0.1 2023-10-03 18:14:47,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:14:47,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 18:14:47,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:14:50,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:14:50,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:53,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:14:53,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:14:53,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:14:54,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:14:54,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 18:14:56,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 18:14:56,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:14:57,498 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 18:14:57,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:14:58,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:14:58,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 18:14:58,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:14:58,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 18:15:00,190 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 18:15:00,191 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 18:15:00,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 18:15:02,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:15:02,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:15:02,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:15:03,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:04,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:15:06,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:15:06,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:14,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1360340.0, ans=0.0 2023-10-03 18:15:15,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:15:15,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 18:15:17,331 INFO [train.py:1046] (3/4) Epoch 39, batch 2200, loss[loss=0.1607, simple_loss=0.2381, pruned_loss=0.04161, over 23740.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2363, pruned_loss=0.03864, over 4712413.34 frames. ], batch size: 232, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:15:20,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:15:24,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:25,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:15:26,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:15:27,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:15:29,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:15:29,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:15:29,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 18:15:33,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 18:15:34,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:15:40,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 18:15:41,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1360473.3333333333, ans=0.2 2023-10-03 18:15:43,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:45,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:15:47,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:15:50,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:15:50,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 18:15:54,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:15:54,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:15:56,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 18:15:57,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1360540.0, ans=0.125 2023-10-03 18:15:58,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:16:00,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:16:00,728 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=15.02 vs. limit=15.0 2023-10-03 18:16:02,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:16:03,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:04,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 18:16:05,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.68 vs. limit=22.5 2023-10-03 18:16:06,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:16:06,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 18:16:08,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:08,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:16:08,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:10,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:16:12,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:16:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:16:12,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:16:12,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:16:14,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:16:15,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:16:20,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 18:16:20,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:16:22,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:16:22,845 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 18:16:24,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:16:25,611 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 18:16:26,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:16:27,002 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 18:16:29,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:16:29,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:16:31,506 INFO [train.py:1046] (3/4) Epoch 39, batch 2250, loss[loss=0.1524, simple_loss=0.2301, pruned_loss=0.03741, over 23709.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2367, pruned_loss=0.03827, over 4723157.93 frames. ], batch size: 135, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:16:31,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:16:32,978 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 18:16:34,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:16:35,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:16:35,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1360740.0, ans=0.1 2023-10-03 18:16:42,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:16:43,212 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:16:44,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:16:47,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:16:48,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:16:50,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:16:53,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 18:16:53,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:16:54,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:16:57,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 18:16:57,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:16:57,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:16:57,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1360806.6666666667, ans=0.125 2023-10-03 18:16:58,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:16:59,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1360873.3333333333, ans=0.125 2023-10-03 18:17:06,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:17:07,582 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.878e+02 2.042e+02 2.352e+02 3.262e+02, threshold=4.083e+02, percent-clipped=0.0 2023-10-03 18:17:07,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:17:07,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:17:09,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 18:17:10,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:17:11,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:17:15,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1360940.0, ans=0.125 2023-10-03 18:17:16,180 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.64 vs. limit=15.0 2023-10-03 18:17:16,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:17:16,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1360940.0, ans=0.125 2023-10-03 18:17:18,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:17:18,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1360940.0, ans=0.125 2023-10-03 18:17:19,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:17:19,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:17:21,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1360940.0, ans=0.125 2023-10-03 18:17:22,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:17:24,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:17:27,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:17:29,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:17:34,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:17:34,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:17:35,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:17:39,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:17:39,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1361006.6666666667, ans=0.2 2023-10-03 18:17:41,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:17:41,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 18:17:41,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:17:43,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:17:44,279 INFO [train.py:1046] (3/4) Epoch 39, batch 2300, loss[loss=0.1653, simple_loss=0.2395, pruned_loss=0.04553, over 23828.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2381, pruned_loss=0.03877, over 4724570.13 frames. ], batch size: 195, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:17:44,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 18:17:47,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:17:49,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:17:55,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:17:55,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:17:57,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1361073.3333333333, ans=0.0 2023-10-03 18:17:58,206 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 18:17:59,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:18:05,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:18:06,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 18:18:06,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:18:06,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:18:06,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 18:18:06,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1361140.0, ans=0.0 2023-10-03 18:18:06,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1361140.0, ans=0.2 2023-10-03 18:18:08,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:18:09,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:18:09,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:18:12,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1361206.6666666667, ans=0.125 2023-10-03 18:18:15,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:18:18,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:18:22,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:18:23,551 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=15.0 2023-10-03 18:18:24,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1361206.6666666667, ans=0.125 2023-10-03 18:18:26,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:18:28,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:18:30,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:18:30,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:18:35,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:18:36,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:18:36,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:18:36,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 18:18:40,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:18:40,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:18:42,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:18:42,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:18:42,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:18:42,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 18:18:42,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:18:43,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 18:18:43,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:18:43,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:18:43,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 18:18:49,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:18:53,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:18:53,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1361340.0, ans=0.0 2023-10-03 18:18:53,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1361340.0, ans=0.2 2023-10-03 18:18:56,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:18:56,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:18:57,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:18:59,028 INFO [train.py:1046] (3/4) Epoch 39, batch 2350, loss[loss=0.1738, simple_loss=0.2509, pruned_loss=0.04834, over 23889.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2385, pruned_loss=0.03923, over 4717430.38 frames. ], batch size: 212, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:19:00,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:19:00,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:19:00,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:19:00,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 18:19:00,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1361406.6666666667, ans=0.125 2023-10-03 18:19:04,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:19:04,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 18:19:09,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1361406.6666666667, ans=0.1 2023-10-03 18:19:10,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 18:19:12,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:19:15,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:19:15,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:19:16,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:19:16,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:19:18,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 18:19:20,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:19:26,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 18:19:27,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:19:29,509 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.15 vs. limit=10.0 2023-10-03 18:19:31,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:19:31,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:19:34,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:19:35,813 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.969e+02 2.119e+02 2.541e+02 4.388e+02, threshold=4.238e+02, percent-clipped=2.0 2023-10-03 18:19:35,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 18:19:37,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:19:37,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:19:39,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:19:39,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:19:42,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:19:43,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 18:19:43,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:19:46,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:19:46,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:19:49,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 18:19:49,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:19:54,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 18:19:54,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:19:58,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 18:20:01,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 18:20:02,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:20:02,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 18:20:02,783 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 18:20:02,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 18:20:04,686 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=15.0 2023-10-03 18:20:05,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 18:20:07,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:20:11,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:20:11,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1361740.0, ans=0.125 2023-10-03 18:20:13,066 INFO [train.py:1046] (3/4) Epoch 39, batch 2400, loss[loss=0.1545, simple_loss=0.2104, pruned_loss=0.04931, over 19314.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2369, pruned_loss=0.03907, over 4713258.91 frames. ], batch size: 388, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:20:14,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:20:16,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:20:17,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 18:20:17,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 18:20:19,899 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.28 vs. limit=15.0 2023-10-03 18:20:20,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1361740.0, ans=0.1 2023-10-03 18:20:22,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:20:22,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:20:26,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 18:20:26,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:20:29,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:20:30,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 18:20:33,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:20:33,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1361806.6666666667, ans=0.2 2023-10-03 18:20:37,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 18:20:42,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:20:46,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 18:20:48,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:20:49,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:20:56,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:20:57,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 18:20:58,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:21:01,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:21:03,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:21:05,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:06,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:21:06,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 18:21:06,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:21:06,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:21:07,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:21:07,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:21:12,839 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.11 vs. limit=12.0 2023-10-03 18:21:13,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:21:13,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:21:13,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 18:21:13,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1362006.6666666667, ans=0.0 2023-10-03 18:21:14,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 18:21:16,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:21:16,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:21:18,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 18:21:18,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 18:21:18,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 18:21:18,212 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 18:21:18,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1362006.6666666667, ans=0.0 2023-10-03 18:21:20,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 18:21:20,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1362006.6666666667, ans=0.2 2023-10-03 18:21:21,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:21:21,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:21:21,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:21:22,849 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 18:21:24,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:21:25,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 18:21:27,227 INFO [train.py:1046] (3/4) Epoch 39, batch 2450, loss[loss=0.1569, simple_loss=0.2182, pruned_loss=0.0478, over 22697.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2354, pruned_loss=0.03897, over 4709755.67 frames. ], batch size: 322, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:21:30,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:21:30,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:21:31,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1362073.3333333333, ans=0.125 2023-10-03 18:21:31,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1362073.3333333333, ans=0.0 2023-10-03 18:21:31,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1362073.3333333333, ans=0.04949747468305833 2023-10-03 18:21:34,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:34,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:21:35,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 18:21:41,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:21:41,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:44,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:21:44,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:21:44,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:21:45,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 18:21:50,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:21:53,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:21:53,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:21:58,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:21:58,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:22:00,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:22:01,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:22:03,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 18:22:03,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:22:04,682 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.889e+02 2.081e+02 2.399e+02 4.481e+02, threshold=4.162e+02, percent-clipped=1.0 2023-10-03 18:22:10,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:22:12,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:22:12,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:22:12,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:22:12,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:22:13,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:22:14,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 18:22:16,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:22:16,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:22:20,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:22:20,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:22:24,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:22:25,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 18:22:26,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:22:29,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:22:29,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 18:22:29,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:22:30,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:22:34,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:22:36,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:22:37,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:22:40,305 INFO [train.py:1046] (3/4) Epoch 39, batch 2500, loss[loss=0.1629, simple_loss=0.2369, pruned_loss=0.04448, over 19213.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2359, pruned_loss=0.03847, over 4708389.35 frames. ], batch size: 42, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:22:40,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 18:22:41,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:22:44,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1362406.6666666667, ans=0.125 2023-10-03 18:22:45,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1362406.6666666667, ans=0.07 2023-10-03 18:22:47,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:22:54,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1362473.3333333333, ans=0.125 2023-10-03 18:22:55,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:22:55,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:22:57,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:22:57,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 18:22:58,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1362473.3333333333, ans=0.2 2023-10-03 18:23:03,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1362473.3333333333, ans=0.125 2023-10-03 18:23:04,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:23:05,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:23:05,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 18:23:05,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:23:07,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 18:23:08,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:08,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:23:08,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1362540.0, ans=0.125 2023-10-03 18:23:09,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 18:23:09,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:09,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 18:23:09,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:23:14,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:23:14,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:23:16,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:23:16,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1362540.0, ans=0.125 2023-10-03 18:23:18,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 18:23:18,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:23:20,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:23,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:23:27,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:23:27,808 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.57 vs. limit=15.0 2023-10-03 18:23:30,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:23:31,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1362606.6666666667, ans=0.125 2023-10-03 18:23:35,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:23:36,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 18:23:36,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:23:36,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:23:38,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:23:38,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:23:41,800 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 18:23:41,801 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 18:23:41,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 18:23:43,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:23:43,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1362673.3333333333, ans=0.125 2023-10-03 18:23:45,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 18:23:45,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 18:23:46,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:23:46,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 18:23:50,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 18:23:55,075 INFO [train.py:1046] (3/4) Epoch 39, batch 2550, loss[loss=0.1552, simple_loss=0.2516, pruned_loss=0.02938, over 24630.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2368, pruned_loss=0.03879, over 4711310.72 frames. ], batch size: 68, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:23:55,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:23:55,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1362740.0, ans=0.125 2023-10-03 18:23:56,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:23:56,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:23:58,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:23:59,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 18:23:59,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:24:04,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 18:24:05,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:24:08,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:09,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:24:09,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 18:24:10,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:24:11,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:24:11,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:24:15,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:24:15,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 18:24:15,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:24:15,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:15,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 18:24:24,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1362873.3333333333, ans=0.125 2023-10-03 18:24:27,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:24:33,652 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.933e+02 2.149e+02 2.358e+02 3.387e+02, threshold=4.299e+02, percent-clipped=0.0 2023-10-03 18:24:33,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:24:33,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:33,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:24:35,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:24:42,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:24:45,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:24:46,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:24:46,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:24:46,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:24:46,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:24:48,578 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.86 vs. limit=15.0 2023-10-03 18:24:50,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:24:50,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:55,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:24:55,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 18:24:55,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:24:56,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:24:57,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1363006.6666666667, ans=0.2 2023-10-03 18:24:58,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:24:59,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:25:01,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:06,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:25:08,879 INFO [train.py:1046] (3/4) Epoch 39, batch 2600, loss[loss=0.1691, simple_loss=0.2565, pruned_loss=0.04086, over 24653.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2371, pruned_loss=0.03891, over 4704414.71 frames. ], batch size: 73, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:25:08,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:09,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1363073.3333333333, ans=0.0 2023-10-03 18:25:10,495 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 18:25:13,328 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 18:25:14,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:25:14,544 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 18:25:14,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 18:25:14,632 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 18:25:17,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:25:17,882 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 18:25:19,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 18:25:19,348 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 18:25:22,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:25:24,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 18:25:26,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 18:25:27,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:25:28,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 18:25:28,830 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.81 vs. limit=22.5 2023-10-03 18:25:29,421 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 18:25:29,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 18:25:38,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.28 vs. limit=15.0 2023-10-03 18:25:40,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:25:40,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:40,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:25:40,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 18:25:41,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1363206.6666666667, ans=0.125 2023-10-03 18:25:42,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:25:47,803 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 18:25:50,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1363206.6666666667, ans=0.125 2023-10-03 18:25:53,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:25:53,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:25:53,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 18:25:55,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:25:55,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:25:55,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 18:25:57,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:25:57,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:25:58,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:26:02,928 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 18:26:02,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:26:02,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:26:09,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:26:10,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:26:11,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 18:26:13,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:26:14,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:26:15,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:26:22,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 18:26:23,434 INFO [train.py:1046] (3/4) Epoch 39, batch 2650, loss[loss=0.2215, simple_loss=0.2839, pruned_loss=0.07954, over 19260.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2384, pruned_loss=0.03965, over 4700254.54 frames. ], batch size: 388, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:26:23,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:26:23,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:26:27,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 18:26:27,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:26:29,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:26:31,012 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 18:26:31,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:26:32,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:26:33,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:26:34,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:26:37,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:26:38,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 18:26:38,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:26:39,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:26:40,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 18:26:41,984 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 18:26:42,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1363473.3333333333, ans=0.125 2023-10-03 18:26:44,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:26:49,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 18:26:49,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:26:50,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 18:26:52,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:26:52,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:26:52,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:26:52,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1363540.0, ans=0.0 2023-10-03 18:26:53,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:26:55,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 18:26:55,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 18:26:58,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:27:01,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 18:27:01,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:27:02,907 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 2.015e+02 2.271e+02 2.609e+02 3.479e+02, threshold=4.541e+02, percent-clipped=0.0 2023-10-03 18:27:03,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:04,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:27:04,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:27:06,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:27:07,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:27:09,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:27:10,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:27:11,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:27:13,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:27:14,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:15,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:27:16,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:16,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1363606.6666666667, ans=0.125 2023-10-03 18:27:17,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:27:19,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 18:27:20,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:22,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:27:22,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:22,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 18:27:23,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1363673.3333333333, ans=0.125 2023-10-03 18:27:26,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:27:28,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:29,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:31,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:32,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:27:32,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:36,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:27:36,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 18:27:38,192 INFO [train.py:1046] (3/4) Epoch 39, batch 2700, loss[loss=0.1532, simple_loss=0.2352, pruned_loss=0.03559, over 24653.00 frames. ], tot_loss[loss=0.1596, simple_loss=0.2392, pruned_loss=0.03999, over 4706692.18 frames. ], batch size: 65, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:27:39,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:27:41,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 18:27:43,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:27:43,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:44,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:27:45,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:27:45,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:27:46,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:27:46,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 18:27:47,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 18:27:48,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:27:50,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:27:50,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:27:50,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:27:54,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:27:54,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 18:27:56,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:28:00,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:28:00,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:28:08,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:28:08,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:28:09,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:28:09,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:28:12,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:28:12,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1363873.3333333333, ans=0.2 2023-10-03 18:28:15,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:28:15,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:28:15,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:28:19,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:28:19,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:28:21,633 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.53 vs. limit=15.0 2023-10-03 18:28:23,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1363940.0, ans=0.0 2023-10-03 18:28:24,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1363940.0, ans=0.125 2023-10-03 18:28:28,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:28:28,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:28:30,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1363940.0, ans=0.0 2023-10-03 18:28:33,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:28:33,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:28:36,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:28:37,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:28:38,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:28:39,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:28:40,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:28:40,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:28:43,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:28:44,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:28:44,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:28:48,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 18:28:49,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:28:52,250 INFO [train.py:1046] (3/4) Epoch 39, batch 2750, loss[loss=0.1579, simple_loss=0.2483, pruned_loss=0.03377, over 24655.00 frames. ], tot_loss[loss=0.1592, simple_loss=0.2387, pruned_loss=0.0399, over 4721518.34 frames. ], batch size: 73, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:28:52,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:28:52,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 18:28:55,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 18:28:55,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:28:57,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:28:59,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:29:01,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:01,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:29:01,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:05,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:29:07,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:29:07,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:29:07,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:07,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 18:29:07,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:29:07,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:29:07,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1364140.0, ans=0.125 2023-10-03 18:29:13,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 18:29:14,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:29:14,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:14,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:29:16,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:29:16,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:29:19,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:29:20,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:29:20,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:29:22,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:29:22,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:29:22,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:29:22,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1364206.6666666667, ans=0.0 2023-10-03 18:29:23,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:23,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:29:24,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1364206.6666666667, ans=0.125 2023-10-03 18:29:24,606 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.85 vs. limit=15.0 2023-10-03 18:29:31,659 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.877e+02 2.030e+02 2.207e+02 3.015e+02, threshold=4.060e+02, percent-clipped=0.0 2023-10-03 18:29:31,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:29:33,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:29:33,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:29:34,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1364206.6666666667, ans=0.0 2023-10-03 18:29:38,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:29:38,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:29:39,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:29:43,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1364273.3333333333, ans=0.2 2023-10-03 18:29:45,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:29:45,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:29:45,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 18:29:50,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:29:52,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 18:29:53,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1364340.0, ans=0.07 2023-10-03 18:29:57,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 18:29:59,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:29:59,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 18:29:59,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:30:01,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:30:02,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 18:30:02,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:30:04,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1364340.0, ans=0.2 2023-10-03 18:30:05,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 18:30:05,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:30:06,863 INFO [train.py:1046] (3/4) Epoch 39, batch 2800, loss[loss=0.1455, simple_loss=0.2248, pruned_loss=0.0331, over 24613.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2368, pruned_loss=0.03939, over 4715237.38 frames. ], batch size: 60, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:30:06,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:30:08,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 18:30:08,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:30:08,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:30:11,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:30:11,559 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 18:30:11,560 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 18:30:14,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:30:17,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:30:17,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:30:20,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:30:21,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 18:30:23,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 18:30:24,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 18:30:25,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:30:25,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:30:25,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:30:30,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:30:30,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:30:30,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:30:32,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:30:39,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:30:41,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:30:42,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:30:44,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:30:45,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:30:51,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:30:51,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 18:30:51,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:30:52,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:30:52,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:30:56,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:30:58,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:31:02,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:31:04,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:31:04,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:31:04,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:31:04,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:31:05,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:31:06,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:31:06,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 18:31:06,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1364673.3333333333, ans=0.09899494936611666 2023-10-03 18:31:07,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:07,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:31:07,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:07,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1364673.3333333333, ans=0.125 2023-10-03 18:31:08,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1364673.3333333333, ans=0.0 2023-10-03 18:31:09,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 18:31:11,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:31:11,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:31:12,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:31:12,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 18:31:18,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:31:18,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:31:20,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:31:21,329 INFO [train.py:1046] (3/4) Epoch 39, batch 2850, loss[loss=0.1633, simple_loss=0.2376, pruned_loss=0.04456, over 23437.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2366, pruned_loss=0.03879, over 4714466.30 frames. ], batch size: 285, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:31:21,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:31:25,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:31:25,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:31:26,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=1364740.0, ans=15.0 2023-10-03 18:31:26,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:31:30,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:31:31,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:31:32,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:31:34,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 18:31:37,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1364806.6666666667, ans=0.0 2023-10-03 18:31:37,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1364806.6666666667, ans=0.125 2023-10-03 18:31:41,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 18:31:41,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:31:43,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 18:31:43,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:46,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 18:31:46,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 18:31:47,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:31:53,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1364873.3333333333, ans=0.1 2023-10-03 18:31:59,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:31:59,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1364873.3333333333, ans=0.125 2023-10-03 18:32:00,732 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.914e+02 2.254e+02 2.782e+02 3.876e+02, threshold=4.507e+02, percent-clipped=0.0 2023-10-03 18:32:00,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:32:00,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:32:02,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 18:32:02,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:32:02,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1364873.3333333333, ans=0.1 2023-10-03 18:32:03,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:32:05,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:32:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 18:32:06,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:32:06,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:32:08,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:32:08,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:09,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:32:11,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:32:11,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:32:12,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:32:15,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:32:15,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:17,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:32:18,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:32:24,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:32:25,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 18:32:26,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 18:32:27,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1365006.6666666667, ans=0.125 2023-10-03 18:32:28,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:32:28,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:32:28,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 18:32:29,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:32:29,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:32:29,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:32:31,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:32:31,388 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 18:32:31,426 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 18:32:31,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:32:33,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:32:34,616 INFO [train.py:1046] (3/4) Epoch 39, batch 2900, loss[loss=0.1533, simple_loss=0.2458, pruned_loss=0.03041, over 24660.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2368, pruned_loss=0.03839, over 4735775.20 frames. ], batch size: 73, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:32:39,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:32:39,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:32:40,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:32:40,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 18:32:44,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:44,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 18:32:44,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 18:32:47,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:32:47,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:32:49,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:32:51,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:32:54,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:32:54,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1365140.0, ans=0.125 2023-10-03 18:32:55,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:32:58,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:32:58,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 18:32:59,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:33:01,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:33:04,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 18:33:04,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 18:33:04,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1365206.6666666667, ans=0.1 2023-10-03 18:33:05,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1365206.6666666667, ans=0.2 2023-10-03 18:33:07,612 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.39 vs. limit=15.0 2023-10-03 18:33:08,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:33:08,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 18:33:08,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:33:11,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:33:11,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 18:33:13,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:33:14,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:33:17,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:33:17,652 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.87 vs. limit=15.0 2023-10-03 18:33:20,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:33:22,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 18:33:22,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 18:33:22,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:33:25,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:33:28,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 18:33:28,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:33:30,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1365273.3333333333, ans=0.125 2023-10-03 18:33:34,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:33:37,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1365340.0, ans=0.125 2023-10-03 18:33:40,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1365340.0, ans=0.125 2023-10-03 18:33:41,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:33:41,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:33:43,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 18:33:46,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:33:46,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 18:33:47,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:33:47,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:33:49,293 INFO [train.py:1046] (3/4) Epoch 39, batch 2950, loss[loss=0.1483, simple_loss=0.2253, pruned_loss=0.03567, over 24461.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2368, pruned_loss=0.0388, over 4729564.15 frames. ], batch size: 58, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:33:52,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:33:53,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 18:33:53,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1365406.6666666667, ans=0.0 2023-10-03 18:33:55,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:33:55,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:33:55,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:33:56,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:33:59,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 18:33:59,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 18:34:00,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:34:00,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:34:05,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:34:06,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:34:09,271 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.37 vs. limit=15.0 2023-10-03 18:34:10,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:34:10,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:34:13,670 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.01 vs. limit=15.0 2023-10-03 18:34:14,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:34:14,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:34:15,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:34:17,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:34:17,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:34:18,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 18:34:24,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 18:34:24,172 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 18:34:25,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:34:27,254 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 18:34:28,556 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.923e+02 2.142e+02 2.477e+02 3.548e+02, threshold=4.284e+02, percent-clipped=0.0 2023-10-03 18:34:28,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 18:34:28,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:34:28,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:34:28,715 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 18:34:28,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:34:31,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 18:34:32,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:34:32,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:34:32,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1365606.6666666667, ans=0.125 2023-10-03 18:34:35,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:34:35,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:34:37,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:34:37,558 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 18:34:38,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:34:38,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 18:34:43,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:34:45,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:34:46,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 18:34:46,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:34:47,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 18:34:51,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:34:51,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:34:52,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:34:54,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:34:54,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:34:56,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:34:56,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1365673.3333333333, ans=0.2 2023-10-03 18:34:57,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:34:57,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:34:57,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:34:58,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:35:00,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:35:01,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:35:02,760 INFO [train.py:1046] (3/4) Epoch 39, batch 3000, loss[loss=0.1532, simple_loss=0.2407, pruned_loss=0.03286, over 24646.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2373, pruned_loss=0.03881, over 4735036.80 frames. ], batch size: 68, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:35:02,761 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 18:35:14,731 INFO [train.py:1078] (3/4) Epoch 39, validation: loss=0.3532, simple_loss=0.2838, pruned_loss=0.2113, over 1125622.00 frames. 2023-10-03 18:35:14,731 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 18:35:14,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 18:35:14,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1365740.0, ans=0.0 2023-10-03 18:35:16,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:35:19,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:35:19,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:35:23,606 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 18:35:23,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 18:35:25,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:35:27,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:35:27,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 18:35:27,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:35:27,717 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.19 vs. limit=22.5 2023-10-03 18:35:29,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1365806.6666666667, ans=0.125 2023-10-03 18:35:32,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:35:42,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1365873.3333333333, ans=0.1 2023-10-03 18:35:42,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1365873.3333333333, ans=0.2 2023-10-03 18:35:44,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:35:44,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1365873.3333333333, ans=0.0 2023-10-03 18:35:50,212 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.62 vs. limit=15.0 2023-10-03 18:35:50,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 18:35:52,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:35:54,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:35:54,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:35:54,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:35:58,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:35:58,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 18:35:59,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 18:36:00,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:36:00,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:36:02,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:36:02,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:36:03,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:03,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:36:06,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:36:06,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:36:06,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:36:08,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:36:11,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 18:36:12,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:36:12,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:12,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:36:17,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:17,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:20,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 18:36:20,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 18:36:20,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:36:20,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 18:36:21,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:36:23,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 18:36:24,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:36:26,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:36:26,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 18:36:28,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 18:36:28,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:36:29,437 INFO [train.py:1046] (3/4) Epoch 39, batch 3050, loss[loss=0.1506, simple_loss=0.226, pruned_loss=0.03762, over 23231.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2379, pruned_loss=0.03902, over 4725680.59 frames. ], batch size: 51, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:36:29,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:36:30,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:36:30,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:36:30,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:32,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:36:33,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 18:36:35,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:36:38,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:36:38,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:36:41,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:42,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1366140.0, ans=0.2 2023-10-03 18:36:43,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 18:36:47,896 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.21 vs. limit=15.0 2023-10-03 18:36:48,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 18:36:49,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 18:36:49,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1366140.0, ans=0.2 2023-10-03 18:36:50,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:36:52,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:36:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:36:57,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:36:57,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:37:00,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:37:01,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:37:01,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:37:01,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:37:01,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:37:03,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:37:04,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:06,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1366206.6666666667, ans=0.0 2023-10-03 18:37:07,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:37:07,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 18:37:08,975 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.946e+02 2.164e+02 2.475e+02 3.663e+02, threshold=4.327e+02, percent-clipped=0.0 2023-10-03 18:37:09,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:37:09,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:37:11,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:37:11,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:37:13,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:37:13,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:13,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1366273.3333333333, ans=0.0 2023-10-03 18:37:17,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1366273.3333333333, ans=0.1 2023-10-03 18:37:20,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:37:20,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:26,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:28,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:37:28,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:37:28,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1366340.0, ans=0.1 2023-10-03 18:37:29,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:37:29,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1366340.0, ans=0.1 2023-10-03 18:37:31,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:37:31,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:37:31,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 18:37:33,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:37:33,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:35,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 18:37:36,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:42,840 INFO [train.py:1046] (3/4) Epoch 39, batch 3100, loss[loss=0.1595, simple_loss=0.2495, pruned_loss=0.03468, over 24645.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2376, pruned_loss=0.03868, over 4722190.72 frames. ], batch size: 68, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:37:42,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:37:44,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:37:44,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1366406.6666666667, ans=0.125 2023-10-03 18:37:45,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 18:37:47,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 18:37:50,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 18:37:51,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 18:37:53,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:37:57,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:37:57,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:37:59,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 18:38:03,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:38:05,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1366473.3333333333, ans=0.2 2023-10-03 18:38:08,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 18:38:12,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 18:38:12,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:14,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:38:14,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:38:15,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 18:38:16,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:38:18,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 18:38:18,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:38:18,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:38:21,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 18:38:23,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:38:25,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:38:26,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 18:38:28,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 18:38:28,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:30,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:38:31,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:38:31,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:31,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:38:33,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:38:33,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:38:36,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:38:37,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:38:37,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:37,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 18:38:40,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:38:40,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1366673.3333333333, ans=0.0 2023-10-03 18:38:42,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 18:38:43,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:38:44,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 18:38:45,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:38:45,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:46,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 18:38:47,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.43 vs. limit=8.0 2023-10-03 18:38:55,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 18:38:56,921 INFO [train.py:1046] (3/4) Epoch 39, batch 3150, loss[loss=0.1631, simple_loss=0.2314, pruned_loss=0.04741, over 23886.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2366, pruned_loss=0.03888, over 4714089.14 frames. ], batch size: 164, lr: 2.61e-03, grad_scale: 8.0 2023-10-03 18:38:58,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:38:58,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:38:59,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:38:59,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:39:01,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 18:39:01,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:39:01,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 18:39:04,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 18:39:05,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:39:07,495 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:39:08,619 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 18:39:10,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 18:39:12,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:39:13,380 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 18:39:13,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 18:39:14,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 18:39:16,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 18:39:16,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 18:39:16,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:39:16,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:39:18,994 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=15.15 vs. limit=15.0 2023-10-03 18:39:19,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:39:19,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 18:39:20,174 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=9.83 vs. limit=22.5 2023-10-03 18:39:22,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:39:22,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:39:24,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:39:24,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 18:39:29,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 18:39:29,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:39:30,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:39:30,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:39:31,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 18:39:34,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 18:39:34,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:39:35,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 18:39:37,011 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.857e+02 2.083e+02 2.385e+02 4.094e+02, threshold=4.165e+02, percent-clipped=0.0 2023-10-03 18:39:37,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 18:39:37,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:39:37,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:39:38,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:39:38,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 18:39:38,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 18:39:39,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:39:39,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:39:41,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:39:41,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:39:43,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 18:39:43,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:39:45,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 18:39:46,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:39:46,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1366940.0, ans=0.0 2023-10-03 18:39:47,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 18:39:47,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 18:39:49,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:39:49,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:39:50,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 18:39:52,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 18:39:53,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:39:56,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:39:58,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:39:58,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:40:02,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:40:03,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:40:06,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 18:40:07,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.14 vs. limit=22.5 2023-10-03 18:40:10,901 INFO [train.py:1046] (3/4) Epoch 39, batch 3200, loss[loss=0.1723, simple_loss=0.2463, pruned_loss=0.04918, over 23340.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2347, pruned_loss=0.03851, over 4699888.31 frames. ], batch size: 106, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:40:10,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:40:10,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 18:40:13,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:40:15,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:40:15,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 18:40:17,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1367073.3333333333, ans=0.1 2023-10-03 18:40:19,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:40:25,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:40:26,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:40:36,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:40:45,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 18:40:45,940 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.17 vs. limit=15.0 2023-10-03 18:40:46,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:40:50,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 18:40:51,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:40:54,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:40:54,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:40:56,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:40:58,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 18:41:00,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 18:41:01,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 18:41:04,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 18:41:05,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:41:06,405 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.97 vs. limit=15.0 2023-10-03 18:41:13,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:41:13,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 18:41:14,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:41:16,055 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 18:41:16,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:41:18,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:41:22,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 18:41:22,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 18:41:25,938 INFO [train.py:1046] (3/4) Epoch 39, batch 3250, loss[loss=0.1667, simple_loss=0.2382, pruned_loss=0.04762, over 23343.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2348, pruned_loss=0.03842, over 4712669.92 frames. ], batch size: 93, lr: 2.61e-03, grad_scale: 16.0 2023-10-03 18:41:26,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 18:41:27,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 18:41:28,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:41:30,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:41:31,611 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 18:41:31,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:41:31,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:32,980 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 18:41:34,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1367406.6666666667, ans=0.125 2023-10-03 18:41:37,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:41:37,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1367406.6666666667, ans=0.125 2023-10-03 18:41:39,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:41:42,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1367473.3333333333, ans=0.5 2023-10-03 18:41:47,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:41:48,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 18:41:48,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:41:49,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:41:49,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:41:49,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1367473.3333333333, ans=0.0 2023-10-03 18:41:51,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:41:51,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:41:54,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:54,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:41:54,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:41:54,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:54,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:41:54,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:41:56,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1367540.0, ans=0.05 2023-10-03 18:41:58,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:41:59,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:42:01,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:42:02,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:42:03,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:42:03,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:42:03,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:42:05,208 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.905e+02 2.104e+02 2.395e+02 3.285e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 18:42:09,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 18:42:09,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:42:09,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:42:10,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:42:12,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:42:18,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:42:23,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:42:25,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:42:25,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 18:42:25,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:42:25,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:42:25,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:42:26,983 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.62 vs. limit=15.0 2023-10-03 18:42:29,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 18:42:29,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 18:42:29,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:42:30,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:42:31,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:42:31,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 18:42:33,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:42:34,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1367673.3333333333, ans=0.125 2023-10-03 18:42:36,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:42:36,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:42:37,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 18:42:37,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:42:37,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1367740.0, ans=0.1 2023-10-03 18:42:38,906 INFO [train.py:1046] (3/4) Epoch 39, batch 3300, loss[loss=0.1536, simple_loss=0.2318, pruned_loss=0.03774, over 22854.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2354, pruned_loss=0.0386, over 4711988.22 frames. ], batch size: 322, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:42:40,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:42:40,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 18:42:41,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:42:41,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 18:42:43,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 18:42:45,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 18:42:46,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:42:48,686 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.76 vs. limit=22.5 2023-10-03 18:42:49,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:42:50,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:42:50,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:42:53,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 18:42:53,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:42:59,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:42:59,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:43:00,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1367806.6666666667, ans=0.125 2023-10-03 18:43:02,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 18:43:03,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:43:03,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:43:03,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:03,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1367806.6666666667, ans=0.1 2023-10-03 18:43:04,957 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 18:43:05,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:43:06,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:43:06,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:43:06,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:43:07,604 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 18:43:11,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:43:11,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:43:13,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:13,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 18:43:14,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 18:43:14,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:16,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:43:19,241 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 18:43:20,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 18:43:20,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:43:22,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 18:43:23,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:43:27,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 18:43:28,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:43:30,783 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.96 vs. limit=15.0 2023-10-03 18:43:31,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:43:31,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:43:31,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:43:32,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:43:34,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:43:34,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:35,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:43:35,778 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:43:38,546 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 18:43:38,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 18:43:40,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:43:40,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:43:40,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:43:41,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:43:41,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:43:42,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:43:44,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:43:44,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:43:44,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1368006.6666666667, ans=0.1 2023-10-03 18:43:45,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:43:46,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:43:48,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 18:43:50,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:43:51,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:43:52,790 INFO [train.py:1046] (3/4) Epoch 39, batch 3350, loss[loss=0.1692, simple_loss=0.2459, pruned_loss=0.04624, over 24078.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2365, pruned_loss=0.03884, over 4719755.56 frames. ], batch size: 86, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:43:52,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 18:43:52,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:43:54,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:43:56,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:43:56,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:01,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:44:01,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1368073.3333333333, ans=0.95 2023-10-03 18:44:02,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:03,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:44:06,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:06,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:44:08,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:44:09,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:44:09,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 18:44:12,145 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 18:44:12,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:44:13,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1368140.0, ans=0.125 2023-10-03 18:44:16,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 18:44:16,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 18:44:17,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:44:17,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:44:18,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:18,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 18:44:18,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:19,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:44:22,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:24,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:24,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:25,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1368206.6666666667, ans=0.2 2023-10-03 18:44:26,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:44:31,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:44:31,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1368206.6666666667, ans=0.125 2023-10-03 18:44:32,929 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.940e+02 2.156e+02 2.529e+02 3.650e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-03 18:44:33,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:34,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:44:36,599 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.65 vs. limit=15.0 2023-10-03 18:44:37,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:44:37,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:44:37,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1368273.3333333333, ans=0.125 2023-10-03 18:44:40,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:40,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:41,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:43,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 18:44:43,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:44:44,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 18:44:44,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:44:45,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 18:44:46,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:44:48,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:44:54,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:44:55,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 18:44:55,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:44:56,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:44:58,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:45:04,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:45:06,156 INFO [train.py:1046] (3/4) Epoch 39, batch 3400, loss[loss=0.1674, simple_loss=0.2511, pruned_loss=0.04191, over 23378.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2372, pruned_loss=0.0391, over 4718254.00 frames. ], batch size: 119, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:45:06,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1368406.6666666667, ans=0.0 2023-10-03 18:45:07,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 18:45:07,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:45:07,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:45:09,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:45:09,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 18:45:10,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:45:10,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 18:45:11,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:45:13,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:45:14,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 18:45:15,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:45:15,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 18:45:16,767 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.07 vs. limit=15.0 2023-10-03 18:45:18,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 18:45:18,887 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 18:45:20,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:45:21,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:45:21,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 18:45:22,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:45:23,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:45:28,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:45:30,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 18:45:35,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:45:37,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:45:38,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:45:38,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 18:45:42,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:45:46,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 18:45:47,544 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.43 vs. limit=15.0 2023-10-03 18:45:50,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:45:52,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:45:52,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 18:45:53,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:45:53,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:45:54,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:45:54,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:45:59,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:46:03,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:46:03,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:46:05,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:46:07,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 18:46:13,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:46:16,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 18:46:19,273 INFO [train.py:1046] (3/4) Epoch 39, batch 3450, loss[loss=0.137, simple_loss=0.2205, pruned_loss=0.0268, over 24659.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2369, pruned_loss=0.03921, over 4711882.99 frames. ], batch size: 60, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:46:20,403 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.76 vs. limit=6.0 2023-10-03 18:46:20,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 18:46:22,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:46:25,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:46:25,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 18:46:27,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:46:31,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:46:34,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1368806.6666666667, ans=0.0 2023-10-03 18:46:34,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1368806.6666666667, ans=0.2 2023-10-03 18:46:36,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:46:36,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:46:38,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:46:38,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:46:39,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:46:39,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1368806.6666666667, ans=0.0 2023-10-03 18:46:46,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 18:46:50,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 18:46:50,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 18:46:50,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:46:53,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:46:53,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1368873.3333333333, ans=0.2 2023-10-03 18:46:58,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1368873.3333333333, ans=0.125 2023-10-03 18:46:59,748 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.732e+02 1.946e+02 2.153e+02 2.434e+02 3.371e+02, threshold=4.307e+02, percent-clipped=0.0 2023-10-03 18:46:59,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 18:46:59,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:47:03,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:47:03,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:47:04,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 18:47:05,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:47:07,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 18:47:07,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:47:08,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:47:11,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:47:14,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 18:47:17,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:47:22,001 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.22 vs. limit=15.0 2023-10-03 18:47:22,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:47:23,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:25,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:47:30,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:30,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:47:31,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:47:33,226 INFO [train.py:1046] (3/4) Epoch 39, batch 3500, loss[loss=0.157, simple_loss=0.2189, pruned_loss=0.0476, over 22763.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2361, pruned_loss=0.03894, over 4706791.64 frames. ], batch size: 322, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:47:33,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:47:37,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:47:41,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:47:42,246 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.71 vs. limit=22.5 2023-10-03 18:47:43,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 18:47:44,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 18:47:45,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 18:47:48,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:47:48,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 18:47:50,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1369140.0, ans=0.1 2023-10-03 18:47:50,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1369140.0, ans=0.125 2023-10-03 18:47:51,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:47:53,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1369140.0, ans=0.05 2023-10-03 18:47:54,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:47:54,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:47:54,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:47:54,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:47:55,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:56,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:47:56,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 18:47:59,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:47:59,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:48:01,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:48:05,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:05,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 18:48:05,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:48:09,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:48:09,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:48:12,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:13,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:48:13,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:48:15,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 18:48:16,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 18:48:16,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 18:48:16,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:48:17,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:18,153 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:48:19,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:48:19,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 18:48:23,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:48:23,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1369273.3333333333, ans=0.125 2023-10-03 18:48:23,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.41 vs. limit=15.0 2023-10-03 18:48:24,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:48:31,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:48:33,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 18:48:33,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 18:48:33,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:48:33,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1369340.0, ans=0.125 2023-10-03 18:48:34,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:48:36,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:48:36,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:37,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1369340.0, ans=0.125 2023-10-03 18:48:39,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 18:48:39,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:48:40,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:48:43,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 18:48:44,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 18:48:45,732 INFO [train.py:1046] (3/4) Epoch 39, batch 3550, loss[loss=0.1538, simple_loss=0.2254, pruned_loss=0.0411, over 23529.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2353, pruned_loss=0.03877, over 4703969.18 frames. ], batch size: 256, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:48:45,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:48:47,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:48:47,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:48:47,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:48:50,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:48:58,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:48:59,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 18:49:01,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:49:03,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 18:49:04,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:06,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:49:06,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:49:09,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:49:09,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:49:09,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:49:09,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:49:11,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:49:16,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:49:16,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:49:18,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:49:18,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:49:19,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:49:19,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 18:49:19,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:19,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:21,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 18:49:25,660 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.892e+02 2.071e+02 2.363e+02 4.365e+02, threshold=4.143e+02, percent-clipped=1.0 2023-10-03 18:49:27,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:49:27,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:49:28,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:49:31,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 18:49:31,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:49:33,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 18:49:33,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 18:49:34,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:49:34,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:49:38,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 18:49:38,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:49:46,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:49:46,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 18:49:47,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:49:48,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1369673.3333333333, ans=0.125 2023-10-03 18:49:51,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:49:51,336 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 18:49:52,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 18:49:59,632 INFO [train.py:1046] (3/4) Epoch 39, batch 3600, loss[loss=0.1568, simple_loss=0.2305, pruned_loss=0.04153, over 23619.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2358, pruned_loss=0.03875, over 4721226.91 frames. ], batch size: 232, lr: 2.60e-03, grad_scale: 32.0 2023-10-03 18:49:59,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 18:49:59,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:50:01,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:50:02,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:50:02,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:50:04,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:50:08,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:50:09,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:09,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:50:11,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:50:11,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:11,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 18:50:15,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:50:16,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:19,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:50:22,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:50:23,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 18:50:24,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:50:24,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 18:50:25,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 18:50:28,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:50:28,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 18:50:31,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:50:31,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1369873.3333333333, ans=0.125 2023-10-03 18:50:32,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:50:34,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:50:36,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 18:50:39,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1369873.3333333333, ans=0.125 2023-10-03 18:50:40,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1369873.3333333333, ans=0.125 2023-10-03 18:50:42,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:50:44,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:50:44,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 18:50:49,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:50:53,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:50:56,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:50:57,487 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.35 vs. limit=15.0 2023-10-03 18:51:02,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 18:51:02,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:51:02,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 18:51:03,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 18:51:04,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 18:51:08,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:51:08,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:51:08,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 18:51:08,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1370006.6666666667, ans=0.125 2023-10-03 18:51:09,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:51:09,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:51:09,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:51:10,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 18:51:10,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 18:51:12,184 INFO [train.py:1046] (3/4) Epoch 39, batch 3650, loss[loss=0.145, simple_loss=0.2178, pruned_loss=0.03609, over 23643.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2364, pruned_loss=0.03877, over 4719502.44 frames. ], batch size: 149, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:51:14,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:51:16,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 18:51:17,034 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.23 vs. limit=15.0 2023-10-03 18:51:20,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 18:51:22,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:51:23,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1370073.3333333333, ans=0.0 2023-10-03 18:51:26,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 18:51:27,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 18:51:31,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:51:31,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 18:51:31,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:51:34,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 18:51:34,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:51:36,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 18:51:36,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 18:51:36,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:51:37,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 18:51:37,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:51:37,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:51:38,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:51:40,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:51:43,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 18:51:43,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 18:51:43,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:51:46,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 18:51:47,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:51:47,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:51:53,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:51:54,269 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.929e+02 2.156e+02 2.406e+02 3.410e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-03 18:51:55,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:51:55,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 18:51:57,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 18:51:59,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:52:01,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:52:03,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:52:05,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:05,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:52:07,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 18:52:08,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:52:09,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:52:15,627 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 18:52:18,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:52:19,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:52:21,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 18:52:21,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:52:22,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 18:52:23,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:25,220 INFO [train.py:1046] (3/4) Epoch 39, batch 3700, loss[loss=0.1408, simple_loss=0.2161, pruned_loss=0.03275, over 21094.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2368, pruned_loss=0.03893, over 4728594.44 frames. ], batch size: 46, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:52:25,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 18:52:25,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:52:28,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:52:29,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:52:30,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:52:33,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:33,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 18:52:33,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:52:35,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 18:52:35,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 18:52:39,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 18:52:42,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:52:42,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:52:45,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 18:52:45,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:52:46,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:52:47,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:52:49,297 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 18:52:49,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1370473.3333333333, ans=0.125 2023-10-03 18:52:53,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1370540.0, ans=0.2 2023-10-03 18:52:55,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:52:55,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 18:52:56,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 18:52:56,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 18:52:56,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:52:59,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:01,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 18:53:01,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:04,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:53:08,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:09,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:53:09,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1370606.6666666667, ans=0.1 2023-10-03 18:53:11,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 18:53:11,744 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.63 vs. limit=15.0 2023-10-03 18:53:13,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:53:15,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 18:53:15,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:53:15,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 18:53:18,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:53:19,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:53:21,616 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.25 vs. limit=22.5 2023-10-03 18:53:22,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:53:24,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 18:53:24,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:53:24,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 18:53:24,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:53:25,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:53:28,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:53:29,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 18:53:31,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 18:53:33,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:53:33,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:53:35,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:53:35,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:53:39,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:53:40,446 INFO [train.py:1046] (3/4) Epoch 39, batch 3750, loss[loss=0.1428, simple_loss=0.2281, pruned_loss=0.02877, over 24428.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2381, pruned_loss=0.03929, over 4714847.59 frames. ], batch size: 63, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:53:40,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:53:40,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1370740.0, ans=0.125 2023-10-03 18:53:42,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:53:43,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 18:53:45,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 18:53:45,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1370740.0, ans=0.2 2023-10-03 18:53:48,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 18:53:48,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 18:53:49,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:53:50,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:53:52,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:53:53,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:53:57,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:54:02,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 18:54:02,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:54:04,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:54:07,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:54:09,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 18:54:10,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:54:11,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:54:11,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:54:13,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1370873.3333333333, ans=0.1 2023-10-03 18:54:14,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 18:54:17,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 18:54:19,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:54:20,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:54:22,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:54:23,397 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.003e+02 2.174e+02 2.569e+02 4.236e+02, threshold=4.347e+02, percent-clipped=0.0 2023-10-03 18:54:27,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:54:29,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 18:54:31,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 18:54:33,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:54:37,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:54:38,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:54:41,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 18:54:44,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 18:54:47,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 18:54:49,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 18:54:49,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:54:50,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 18:54:54,792 INFO [train.py:1046] (3/4) Epoch 39, batch 3800, loss[loss=0.161, simple_loss=0.2272, pruned_loss=0.04741, over 19573.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2385, pruned_loss=0.03923, over 4712580.77 frames. ], batch size: 388, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:54:58,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:55:02,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:03,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 18:55:03,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 18:55:04,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1371073.3333333333, ans=0.04949747468305833 2023-10-03 18:55:05,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:55:07,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:55:08,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:55:11,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 18:55:11,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:11,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 18:55:12,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:55:12,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 18:55:14,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:55:14,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 18:55:19,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 18:55:19,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:55:20,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:55:23,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 18:55:23,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 18:55:25,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 18:55:25,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:55:27,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:30,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:55:33,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 18:55:33,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 18:55:36,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:55:42,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:55:49,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:55:51,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 18:55:53,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 18:55:53,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:55:55,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:55:56,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:55:58,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 18:56:01,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 18:56:02,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 18:56:02,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:56:03,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1371340.0, ans=0.0 2023-10-03 18:56:04,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:56:09,369 INFO [train.py:1046] (3/4) Epoch 39, batch 3850, loss[loss=0.1373, simple_loss=0.22, pruned_loss=0.02733, over 24612.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2377, pruned_loss=0.03913, over 4712012.65 frames. ], batch size: 60, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:56:09,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:56:10,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 18:56:16,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 18:56:17,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 18:56:17,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:56:19,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:56:22,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 18:56:23,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:56:24,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1371473.3333333333, ans=0.05 2023-10-03 18:56:26,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 18:56:27,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 18:56:30,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:34,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:56:35,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:56:36,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 18:56:39,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:40,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 18:56:40,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:56:40,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 18:56:42,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:56:43,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:56:44,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:44,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 18:56:46,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 18:56:46,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 18:56:46,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:56:48,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:49,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:56:50,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:56:50,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 18:56:52,338 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.999e+02 2.176e+02 2.496e+02 3.894e+02, threshold=4.352e+02, percent-clipped=0.0 2023-10-03 18:56:52,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 18:56:55,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:56:56,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 18:56:56,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 18:57:00,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:57:02,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:57:06,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:57:07,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 18:57:10,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 18:57:10,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1371673.3333333333, ans=0.125 2023-10-03 18:57:11,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1371673.3333333333, ans=0.0 2023-10-03 18:57:12,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:12,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:15,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 18:57:15,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 18:57:15,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:17,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:17,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:57:17,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 18:57:19,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:57:20,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 18:57:20,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:20,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:23,171 INFO [train.py:1046] (3/4) Epoch 39, batch 3900, loss[loss=0.1537, simple_loss=0.2476, pruned_loss=0.02988, over 24455.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2362, pruned_loss=0.03891, over 4696713.85 frames. ], batch size: 69, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:57:23,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 18:57:24,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:26,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:57:26,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:57:26,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:57:26,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1371740.0, ans=0.125 2023-10-03 18:57:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:57:27,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 18:57:27,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:27,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1371740.0, ans=0.125 2023-10-03 18:57:31,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:57:31,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:57:33,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:57:35,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:57:35,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1371740.0, ans=0.0 2023-10-03 18:57:37,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 18:57:37,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:40,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:57:40,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 18:57:40,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:57:43,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 18:57:43,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 18:57:44,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 18:57:45,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 18:57:49,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:57:51,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:57:51,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:57:53,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:57:57,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 18:58:00,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:58:01,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1371873.3333333333, ans=0.125 2023-10-03 18:58:02,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 18:58:02,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:58:04,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 18:58:10,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:58:11,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:58:14,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1371940.0, ans=0.125 2023-10-03 18:58:18,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 18:58:19,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 18:58:21,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1372006.6666666667, ans=0.0 2023-10-03 18:58:27,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:58:30,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:58:31,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 18:58:31,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 18:58:31,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 18:58:34,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 18:58:34,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:58:35,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 18:58:37,061 INFO [train.py:1046] (3/4) Epoch 39, batch 3950, loss[loss=0.1466, simple_loss=0.2282, pruned_loss=0.03246, over 24281.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.236, pruned_loss=0.03856, over 4708835.53 frames. ], batch size: 56, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 18:58:40,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 18:58:42,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 18:58:43,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 18:58:45,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 18:58:46,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 18:58:51,164 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 18:58:51,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:58:52,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 18:58:52,587 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 18:58:52,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:58:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:58:54,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 18:58:54,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 18:58:57,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 18:59:00,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 18:59:01,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 18:59:01,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 18:59:02,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 18:59:02,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 18:59:12,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 18:59:12,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 18:59:18,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 18:59:21,280 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.928e+02 2.107e+02 2.479e+02 4.773e+02, threshold=4.215e+02, percent-clipped=2.0 2023-10-03 18:59:22,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 18:59:22,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 18:59:23,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 18:59:24,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 18:59:31,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 18:59:31,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 18:59:31,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 18:59:31,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 18:59:31,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 18:59:35,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1372273.3333333333, ans=0.125 2023-10-03 18:59:37,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 18:59:39,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 18:59:43,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 18:59:43,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1372340.0, ans=0.125 2023-10-03 18:59:51,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1372406.6666666667, ans=0.2 2023-10-03 18:59:52,978 INFO [train.py:1046] (3/4) Epoch 39, batch 4000, loss[loss=0.1634, simple_loss=0.2562, pruned_loss=0.03535, over 24683.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2367, pruned_loss=0.0393, over 4690558.65 frames. ], batch size: 73, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 18:59:53,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 18:59:59,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1372406.6666666667, ans=0.125 2023-10-03 19:00:01,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:00:05,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:00:06,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:00:07,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:00:08,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 19:00:08,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1372473.3333333333, ans=0.125 2023-10-03 19:00:09,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:00:10,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 19:00:10,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:00:10,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 19:00:10,854 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:00:12,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:00:15,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:00:16,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:00:16,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:00:16,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:00:16,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:00:19,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:00:20,780 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 19:00:22,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:00:22,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:00:24,932 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 19:00:25,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:00:25,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:00:29,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1372540.0, ans=0.0 2023-10-03 19:00:32,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 19:00:32,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:00:33,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:00:35,211 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 19:00:36,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:00:36,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 19:00:36,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:00:38,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:00:39,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:00:41,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:00:41,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1372606.6666666667, ans=0.1 2023-10-03 19:00:42,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:00:42,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:00:43,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 19:00:44,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:00:46,121 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 19:00:50,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:00:53,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 19:00:54,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:00:55,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:00:57,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:00:57,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:01:00,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1372673.3333333333, ans=0.0 2023-10-03 19:01:02,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:01:03,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 19:01:04,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 19:01:06,068 INFO [train.py:1046] (3/4) Epoch 39, batch 4050, loss[loss=0.1516, simple_loss=0.2339, pruned_loss=0.03464, over 23177.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2374, pruned_loss=0.03919, over 4698117.95 frames. ], batch size: 105, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:01:06,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:01:06,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:01:06,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1372740.0, ans=0.0 2023-10-03 19:01:08,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:01:09,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:01:09,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:01:11,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1372740.0, ans=10.0 2023-10-03 19:01:14,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:01:18,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:01:18,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 19:01:21,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:01:21,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:01:25,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:01:27,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:01:29,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 19:01:32,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 19:01:32,569 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 19:01:33,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:01:41,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 19:01:43,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:01:46,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:01:49,653 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.853e+02 2.035e+02 2.339e+02 3.700e+02, threshold=4.069e+02, percent-clipped=0.0 2023-10-03 19:01:49,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:01:49,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:01:49,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:01:53,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:01:57,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 19:01:57,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:01:59,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:02:01,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 19:02:04,953 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:02:07,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:02:13,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 19:02:15,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:02:15,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:02:15,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 19:02:15,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 19:02:15,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:02:18,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:02:19,747 INFO [train.py:1046] (3/4) Epoch 39, batch 4100, loss[loss=0.139, simple_loss=0.2234, pruned_loss=0.02726, over 24437.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2389, pruned_loss=0.03957, over 4710013.61 frames. ], batch size: 58, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:02:19,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:21,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:02:27,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 19:02:28,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 19:02:30,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 19:02:31,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 19:02:31,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:02:31,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:32,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:32,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:02:34,365 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 19:02:34,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1373140.0, ans=0.125 2023-10-03 19:02:37,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:02:37,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:02:37,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:02:38,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:02:41,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:02:43,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:02:43,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:02:45,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 19:02:45,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:02:45,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:02:46,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:02:46,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:02:46,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 19:02:49,129 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=15.0 2023-10-03 19:02:50,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:02:51,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 19:02:53,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:02:54,047 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.74 vs. limit=6.0 2023-10-03 19:02:54,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:02:54,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 19:02:57,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:02:57,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:02:57,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:03:00,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 19:03:02,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:03:02,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:03:03,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1373273.3333333333, ans=0.0 2023-10-03 19:03:05,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 19:03:05,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:03:05,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:03:07,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:03:13,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:03:16,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:03:16,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:03:24,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1373340.0, ans=0.09899494936611666 2023-10-03 19:03:24,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1373340.0, ans=0.125 2023-10-03 19:03:26,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:03:26,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:03:29,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:03:30,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1373340.0, ans=0.0 2023-10-03 19:03:32,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:03:32,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1373406.6666666667, ans=0.125 2023-10-03 19:03:33,861 INFO [train.py:1046] (3/4) Epoch 39, batch 4150, loss[loss=0.1574, simple_loss=0.2303, pruned_loss=0.04227, over 23758.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2384, pruned_loss=0.03903, over 4719213.28 frames. ], batch size: 212, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:03:35,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:03:36,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:03:38,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:03:38,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:03:39,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 19:03:39,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1373406.6666666667, ans=0.0 2023-10-03 19:03:40,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:03:42,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 19:03:42,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 19:03:42,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 19:03:44,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:03:48,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:03:48,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:03:56,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:03:56,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:03:57,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:03:59,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:03:59,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:03:59,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.10 vs. limit=15.0 2023-10-03 19:04:00,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:04:03,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:04:06,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1373540.0, ans=0.0 2023-10-03 19:04:07,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:04:07,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 19:04:08,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 19:04:10,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:04:11,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 19:04:11,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:04:12,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:04:14,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:16,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:04:16,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1373606.6666666667, ans=0.125 2023-10-03 19:04:17,651 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.923e+02 2.108e+02 2.478e+02 3.681e+02, threshold=4.216e+02, percent-clipped=0.0 2023-10-03 19:04:18,224 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.75 vs. limit=15.0 2023-10-03 19:04:19,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 19:04:22,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:04:24,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:04:25,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 19:04:27,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:04:27,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 19:04:28,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:04:29,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:04:30,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:31,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 19:04:31,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:04:31,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 19:04:32,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:04:35,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 19:04:35,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:35,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:04:36,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 19:04:36,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 19:04:36,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:04:38,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 19:04:38,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:04:39,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:04:40,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 19:04:40,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:04:46,929 INFO [train.py:1046] (3/4) Epoch 39, batch 4200, loss[loss=0.1671, simple_loss=0.2514, pruned_loss=0.04144, over 23721.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2377, pruned_loss=0.03914, over 4718400.68 frames. ], batch size: 85, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:04:47,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:04:49,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 19:04:50,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:04:52,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:04:53,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1373740.0, ans=0.125 2023-10-03 19:04:55,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:04:55,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:04:55,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:04:58,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 19:05:01,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 19:05:01,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:01,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1373806.6666666667, ans=0.05 2023-10-03 19:05:02,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:05:06,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:05:09,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 19:05:11,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:05:11,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:11,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 19:05:11,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:05:13,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:14,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:05:14,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:05:16,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:05:18,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 19:05:18,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:05:23,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:05:25,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:05:26,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:05:28,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:05:29,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:05:31,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 19:05:31,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:05:31,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1373940.0, ans=0.2 2023-10-03 19:05:32,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:05:34,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1373940.0, ans=0.125 2023-10-03 19:05:36,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:05:39,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:05:43,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:05:45,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1374006.6666666667, ans=0.1 2023-10-03 19:05:46,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 19:05:48,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:05:53,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:05:53,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:05:55,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 19:05:59,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:05:59,658 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:06:01,970 INFO [train.py:1046] (3/4) Epoch 39, batch 4250, loss[loss=0.168, simple_loss=0.2465, pruned_loss=0.04474, over 23409.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2357, pruned_loss=0.03891, over 4699808.88 frames. ], batch size: 93, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:06:03,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:06:03,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:06:03,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1374073.3333333333, ans=0.125 2023-10-03 19:06:06,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:06:06,356 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:06:11,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:06:11,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 19:06:13,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:06:15,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:06:18,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:06:23,564 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.97 vs. limit=6.0 2023-10-03 19:06:26,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:26,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:27,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:06:27,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:06:28,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:30,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:31,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:32,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1374206.6666666667, ans=0.125 2023-10-03 19:06:33,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:06:33,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:06:36,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 19:06:38,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 19:06:39,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:39,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:06:40,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:06:41,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:06:41,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:06:41,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:06:42,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1374206.6666666667, ans=0.125 2023-10-03 19:06:44,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:06:45,937 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.890e+02 2.070e+02 2.259e+02 3.425e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-03 19:06:46,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:06:46,818 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.87 vs. limit=10.0 2023-10-03 19:06:49,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1374273.3333333333, ans=0.125 2023-10-03 19:06:52,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:06:53,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:06:54,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 19:06:54,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:06:56,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 19:06:56,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:06:58,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:07:01,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:07:01,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:07:02,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 19:07:02,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1374340.0, ans=0.125 2023-10-03 19:07:04,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:07:04,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:07:05,812 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:07:07,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=1374340.0, ans=0.05 2023-10-03 19:07:08,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:07:10,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:07:11,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1374340.0, ans=0.125 2023-10-03 19:07:12,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:07:13,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:07:13,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:07:14,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1374406.6666666667, ans=15.0 2023-10-03 19:07:15,101 INFO [train.py:1046] (3/4) Epoch 39, batch 4300, loss[loss=0.1524, simple_loss=0.2407, pruned_loss=0.03205, over 24434.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2348, pruned_loss=0.03858, over 4698917.31 frames. ], batch size: 69, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:07:15,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:07:15,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1374406.6666666667, ans=0.125 2023-10-03 19:07:16,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:07:16,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 19:07:17,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:07:22,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:07:22,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:07:26,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:07:33,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:07:33,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 19:07:36,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:07:37,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:07:37,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1374473.3333333333, ans=0.125 2023-10-03 19:07:38,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:07:38,863 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 19:07:39,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1374473.3333333333, ans=0.2 2023-10-03 19:07:41,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:07:44,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:07:45,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 19:07:45,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:07:46,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 19:07:48,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 19:07:50,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:07:53,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:07:53,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:07:53,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:07:56,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:07:56,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:07:56,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 19:07:58,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 19:08:01,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:08:04,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:04,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 19:08:04,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:04,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:08:04,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 19:08:04,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 19:08:05,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 19:08:05,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:08:05,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 19:08:05,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 19:08:09,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:08:11,007 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 19:08:12,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:08:15,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:15,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:08:16,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 19:08:18,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:08:18,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:20,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:08:20,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:08:21,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:08:23,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:08:23,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:25,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:08:26,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:08:26,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1374673.3333333333, ans=0.1 2023-10-03 19:08:29,496 INFO [train.py:1046] (3/4) Epoch 39, batch 4350, loss[loss=0.1995, simple_loss=0.258, pruned_loss=0.07049, over 19601.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2356, pruned_loss=0.03848, over 4700963.06 frames. ], batch size: 388, lr: 2.60e-03, grad_scale: 8.0 2023-10-03 19:08:29,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1374740.0, ans=0.1 2023-10-03 19:08:32,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 19:08:32,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:08:38,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:08:41,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:42,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:08:42,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:08:43,416 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.26 vs. limit=15.0 2023-10-03 19:08:45,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:08:47,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1374806.6666666667, ans=0.0 2023-10-03 19:08:48,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:08:51,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:08:51,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:08:54,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:08:54,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1374806.6666666667, ans=0.07 2023-10-03 19:08:56,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:08:57,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:09:05,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 19:09:05,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:09:06,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:10,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:12,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 19:09:13,505 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.954e+02 2.148e+02 2.453e+02 3.516e+02, threshold=4.296e+02, percent-clipped=0.0 2023-10-03 19:09:14,393 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.91 vs. limit=10.0 2023-10-03 19:09:15,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:09:17,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:09:21,029 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 19:09:22,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:09:22,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1374940.0, ans=0.0 2023-10-03 19:09:23,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:09:25,698 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 19:09:25,760 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 19:09:25,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:09:25,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:09:27,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:09:28,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:09:28,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:09:28,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:09:31,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 19:09:33,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:33,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:09:33,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:33,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 19:09:34,669 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 19:09:34,673 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 19:09:34,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 19:09:37,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:09:37,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:09:37,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:09:38,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:09:40,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 19:09:41,661 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 19:09:41,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:42,949 INFO [train.py:1046] (3/4) Epoch 39, batch 4400, loss[loss=0.1563, simple_loss=0.236, pruned_loss=0.03826, over 23732.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2364, pruned_loss=0.03858, over 4708158.86 frames. ], batch size: 149, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:09:45,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:09:45,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:46,385 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.39 vs. limit=15.0 2023-10-03 19:09:47,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:09:49,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 19:09:49,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 19:09:49,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 19:09:50,377 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 19:09:51,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:09:51,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:09:53,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 19:09:56,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:09:56,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:09:58,335 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 19:09:59,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:09:59,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 19:10:01,219 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 19:10:05,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 19:10:06,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 19:10:07,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 19:10:08,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:10:08,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1375140.0, ans=0.2 2023-10-03 19:10:09,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:10:09,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:10:11,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:10:12,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 19:10:12,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 19:10:13,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:10:16,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:10:16,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:10:18,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:10:18,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:10:18,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 19:10:19,417 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 19:10:22,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1375206.6666666667, ans=0.125 2023-10-03 19:10:24,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:10:30,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:10:32,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 19:10:36,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:10:37,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:10:41,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:10:42,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 19:10:42,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:10:42,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:10:42,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:10:43,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:10:47,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 19:10:50,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 19:10:52,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 19:10:52,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:10:52,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 19:10:53,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:10:56,643 INFO [train.py:1046] (3/4) Epoch 39, batch 4450, loss[loss=0.1546, simple_loss=0.2429, pruned_loss=0.03316, over 24477.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2371, pruned_loss=0.03884, over 4698641.47 frames. ], batch size: 69, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:10:56,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:10:58,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 19:11:01,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:11:03,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:03,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1375406.6666666667, ans=0.0 2023-10-03 19:11:04,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:11:10,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:11:10,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:11:15,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:15,685 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.92 vs. limit=22.5 2023-10-03 19:11:17,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:11:19,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:11:19,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:11:21,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 19:11:21,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:11:22,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:23,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:11:23,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:11:26,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 19:11:31,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:31,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:31,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1375540.0, ans=0.125 2023-10-03 19:11:32,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:11:32,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:11:32,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1375540.0, ans=0.0 2023-10-03 19:11:34,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:11:37,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 19:11:38,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 19:11:38,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 19:11:38,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:11:41,518 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.929e+02 2.104e+02 2.393e+02 3.740e+02, threshold=4.208e+02, percent-clipped=0.0 2023-10-03 19:11:41,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:11:41,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 19:11:44,867 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=12.0 2023-10-03 19:11:45,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:11:49,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:50,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 19:11:50,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:11:50,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:11:51,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:11:51,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:11:53,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:11:56,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:11:56,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 19:11:59,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:12:00,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:12:00,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:12:03,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:12:03,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 19:12:06,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:12:09,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 19:12:10,912 INFO [train.py:1046] (3/4) Epoch 39, batch 4500, loss[loss=0.1684, simple_loss=0.2519, pruned_loss=0.04246, over 24440.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2377, pruned_loss=0.03923, over 4696842.49 frames. ], batch size: 69, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:12:12,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:12:15,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:12:16,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 19:12:16,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 19:12:16,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1375740.0, ans=0.0 2023-10-03 19:12:18,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:12:24,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:12:24,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:12:24,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:12:26,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:12:26,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:12:27,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:12:39,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:12:39,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:12:39,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1375873.3333333333, ans=0.1 2023-10-03 19:12:42,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:12:43,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:12:43,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:12:49,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:12:54,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:12:56,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1375940.0, ans=0.0 2023-10-03 19:12:57,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:13:00,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:13:01,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 19:13:03,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:03,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:13:06,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:13:06,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:13:06,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:13:07,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 19:13:07,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:13:07,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:13,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:13:15,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:13:16,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:19,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:13:19,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:13:19,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 19:13:21,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 19:13:23,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 19:13:25,972 INFO [train.py:1046] (3/4) Epoch 39, batch 4550, loss[loss=0.1373, simple_loss=0.2193, pruned_loss=0.0276, over 24427.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2369, pruned_loss=0.03905, over 4708293.09 frames. ], batch size: 58, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:13:26,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 19:13:27,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 19:13:27,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1376073.3333333333, ans=0.125 2023-10-03 19:13:28,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:13:33,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:13:33,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:13:35,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:13:36,900 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=15.53 vs. limit=15.0 2023-10-03 19:13:38,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:13:40,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:13:43,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:13:43,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:13:43,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:13:43,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1376140.0, ans=0.125 2023-10-03 19:13:45,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:13:46,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.11 vs. limit=22.5 2023-10-03 19:13:46,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:13:47,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:13:51,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 19:13:51,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1376140.0, ans=0.1 2023-10-03 19:13:52,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 19:13:52,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:13:54,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 19:14:00,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 19:14:00,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:14:01,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 19:14:04,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:14:08,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:08,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:09,909 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.921e+02 2.129e+02 2.564e+02 3.877e+02, threshold=4.258e+02, percent-clipped=0.0 2023-10-03 19:14:09,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:14:11,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 19:14:11,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1376273.3333333333, ans=0.1 2023-10-03 19:14:13,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:14:16,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:16,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:14:16,913 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.73 vs. limit=22.5 2023-10-03 19:14:17,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:14:19,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 19:14:20,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 19:14:20,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:14:22,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 19:14:23,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 19:14:23,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:14:25,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:14:25,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:14:25,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:25,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:14:27,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:14:28,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 19:14:31,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:14:31,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 19:14:31,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 19:14:31,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:14:31,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 19:14:33,194 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:14:35,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:14:35,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:14:35,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1376340.0, ans=0.0 2023-10-03 19:14:37,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:14:37,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:14:38,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:14:40,457 INFO [train.py:1046] (3/4) Epoch 39, batch 4600, loss[loss=0.1527, simple_loss=0.2241, pruned_loss=0.04061, over 22778.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.236, pruned_loss=0.03872, over 4696555.24 frames. ], batch size: 322, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:14:40,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:14:41,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:14:44,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:14:46,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:14:47,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:14:47,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:14:47,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:14:49,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 19:14:49,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:14:49,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1376406.6666666667, ans=0.0 2023-10-03 19:14:53,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:14:54,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:14:57,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:04,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 19:15:04,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:07,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:11,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:15:11,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:15:17,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 19:15:17,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:15:17,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:15:23,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:23,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:15:24,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:15:24,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1376606.6666666667, ans=0.125 2023-10-03 19:15:28,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 19:15:28,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:15:36,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:36,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:15:38,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:38,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 19:15:38,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:15:38,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 19:15:39,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:40,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:15:42,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:15:42,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:15:43,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:15:44,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1376673.3333333333, ans=0.125 2023-10-03 19:15:45,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 19:15:45,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 19:15:45,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 19:15:45,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:15:46,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:15:46,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:15:48,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:15:53,991 INFO [train.py:1046] (3/4) Epoch 39, batch 4650, loss[loss=0.1565, simple_loss=0.244, pruned_loss=0.03449, over 24650.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2359, pruned_loss=0.03825, over 4713464.09 frames. ], batch size: 73, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:15:56,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:16:00,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:16:00,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:16:02,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:16:02,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:16:02,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:16:02,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:16:05,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 19:16:05,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1376740.0, ans=0.125 2023-10-03 19:16:10,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:16:12,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 19:16:13,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:16:15,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 19:16:15,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:16:15,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 19:16:16,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 19:16:16,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:16,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:16:19,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:16:19,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1376806.6666666667, ans=0.1 2023-10-03 19:16:20,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:16:20,724 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 19:16:22,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1376873.3333333333, ans=0.125 2023-10-03 19:16:24,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:16:25,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 19:16:26,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:26,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:16:28,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 19:16:29,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:16:33,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:16:36,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:16:38,877 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.924e+02 2.085e+02 2.377e+02 3.719e+02, threshold=4.171e+02, percent-clipped=0.0 2023-10-03 19:16:40,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:42,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:16:43,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:16:45,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:16:46,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 19:16:47,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 19:16:48,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 19:16:48,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 19:16:50,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:16:56,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:16:56,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:16:58,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 19:16:58,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:16:59,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:16:59,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:17:01,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:17:04,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:17:04,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:17:05,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:17:08,723 INFO [train.py:1046] (3/4) Epoch 39, batch 4700, loss[loss=0.1674, simple_loss=0.2496, pruned_loss=0.04264, over 24297.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2368, pruned_loss=0.0388, over 4714261.97 frames. ], batch size: 77, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:17:08,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:17:09,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1377073.3333333333, ans=0.0 2023-10-03 19:17:10,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:17:10,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:17:10,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1377073.3333333333, ans=0.0 2023-10-03 19:17:11,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 19:17:13,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:17:14,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 19:17:20,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1377073.3333333333, ans=0.125 2023-10-03 19:17:23,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:17:23,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:17:23,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:17:24,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:17:25,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:17:29,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 19:17:29,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 19:17:31,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:17:32,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1377140.0, ans=0.0 2023-10-03 19:17:33,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:17:33,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:17:35,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:17:42,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:17:43,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 19:17:46,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:17:51,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 19:17:52,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:17:55,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:17:55,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1377273.3333333333, ans=0.1 2023-10-03 19:18:01,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 19:18:02,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:18:05,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:18:06,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 19:18:07,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:07,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:18:11,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:18:11,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:18:11,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 19:18:14,499 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 19:18:14,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:18:17,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:17,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:17,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 19:18:19,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:18:21,978 INFO [train.py:1046] (3/4) Epoch 39, batch 4750, loss[loss=0.1502, simple_loss=0.2314, pruned_loss=0.03446, over 23295.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2372, pruned_loss=0.03859, over 4716874.73 frames. ], batch size: 119, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:18:22,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 19:18:26,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:18:27,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:18:30,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:18:30,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:18:33,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 19:18:33,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:18:36,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 19:18:36,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:18:37,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1377473.3333333333, ans=0.125 2023-10-03 19:18:38,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:18:38,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:18:44,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 19:18:48,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:18:50,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 19:18:51,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:18:55,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:18:55,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:18:55,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:18:57,297 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 19:18:57,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 19:19:02,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 19:19:04,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:19:05,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:06,317 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.868e+02 2.072e+02 2.356e+02 4.427e+02, threshold=4.144e+02, percent-clipped=1.0 2023-10-03 19:19:07,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1377606.6666666667, ans=0.125 2023-10-03 19:19:08,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:19:08,319 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 19:19:08,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:19:09,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:19:13,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:19:14,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 19:19:14,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 19:19:14,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:19:16,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:19:16,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:19:18,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 19:19:18,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 19:19:20,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 19:19:21,890 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.08 vs. limit=15.0 2023-10-03 19:19:23,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:19:26,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:19:26,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 19:19:27,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:19:27,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:19:31,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:19:32,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:32,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:19:36,598 INFO [train.py:1046] (3/4) Epoch 39, batch 4800, loss[loss=0.1603, simple_loss=0.2525, pruned_loss=0.03402, over 24660.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2386, pruned_loss=0.03903, over 4724132.18 frames. ], batch size: 73, lr: 2.60e-03, grad_scale: 32.0 2023-10-03 19:19:36,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:19:36,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 19:19:38,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 19:19:39,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 19:19:42,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:19:42,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:19:42,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 19:19:48,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:48,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:19:54,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:19:55,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:19:56,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:19:56,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 19:19:58,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:19:58,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:20:00,431 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.63 vs. limit=6.0 2023-10-03 19:20:01,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:20:04,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:05,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:05,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:20:07,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:07,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 19:20:07,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:20:09,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:20:11,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:13,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:20:16,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:20:16,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:20:16,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 19:20:17,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:20:19,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 19:20:19,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 19:20:21,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:20:21,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:20:21,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:20:21,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:20:21,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:20:21,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff3.min_abs, batch_count=1377940.0, ans=0.2 2023-10-03 19:20:23,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:20:23,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:20:23,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1377940.0, ans=0.0 2023-10-03 19:20:27,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:20:30,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:31,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:20:34,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1378006.6666666667, ans=0.125 2023-10-03 19:20:34,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1378006.6666666667, ans=0.125 2023-10-03 19:20:37,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 19:20:37,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:20:37,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:37,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:20:38,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:20:42,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:20:44,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:20:44,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:44,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:20:46,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:20:46,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:20:51,026 INFO [train.py:1046] (3/4) Epoch 39, batch 4850, loss[loss=0.1484, simple_loss=0.2204, pruned_loss=0.03818, over 23531.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2389, pruned_loss=0.03904, over 4727364.65 frames. ], batch size: 256, lr: 2.60e-03, grad_scale: 16.0 2023-10-03 19:20:51,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:20:51,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:51,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:20:52,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 19:20:55,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 19:20:55,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:55,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:20:55,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:20:55,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:20:59,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:21:04,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 19:21:05,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1378140.0, ans=0.0 2023-10-03 19:21:07,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:21:11,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:21:11,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:21:11,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:21:14,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1378140.0, ans=0.125 2023-10-03 19:21:17,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:21:19,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:21:20,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:21:20,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 19:21:22,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:21:24,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1378206.6666666667, ans=0.2 2023-10-03 19:21:25,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:21:25,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:21:25,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:21:25,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 19:21:27,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:21:27,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:31,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1378206.6666666667, ans=0.125 2023-10-03 19:21:33,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:33,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 19:21:34,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 19:21:34,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:21:36,941 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.898e+02 2.062e+02 2.421e+02 3.265e+02, threshold=4.123e+02, percent-clipped=0.0 2023-10-03 19:21:39,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:21:40,440 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.73 vs. limit=12.0 2023-10-03 19:21:41,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 19:21:41,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:21:42,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:21:44,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:21:45,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 19:21:45,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:47,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 19:21:47,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:21:49,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:21:49,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 19:21:57,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:21:59,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1378340.0, ans=0.125 2023-10-03 19:22:04,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:22:04,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:22:05,440 INFO [train.py:1046] (3/4) Epoch 39, batch 4900, loss[loss=0.1507, simple_loss=0.2204, pruned_loss=0.04048, over 23675.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2383, pruned_loss=0.03948, over 4709332.29 frames. ], batch size: 149, lr: 2.59e-03, grad_scale: 16.0 2023-10-03 19:22:06,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 19:22:06,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:22:08,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1378406.6666666667, ans=0.2 2023-10-03 19:22:12,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:22:14,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:22:14,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:22:18,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 19:22:24,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 19:22:28,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 19:22:28,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 19:22:28,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:22:29,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:22:29,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:22:30,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:22:30,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:22:30,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 19:22:34,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 19:22:35,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:22:35,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:22:35,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1378540.0, ans=0.125 2023-10-03 19:22:36,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:22:38,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:22:39,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:22:41,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:22:41,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 19:22:43,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:22:43,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:22:43,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 19:22:43,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 19:22:47,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 19:22:49,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:22:50,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:22:51,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:22:52,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:22:52,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 19:22:52,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:22:53,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 19:22:56,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:22:57,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:22:59,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:23:02,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 19:23:02,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:23:04,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 19:23:05,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 19:23:05,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=1378673.3333333333, ans=0.5 2023-10-03 19:23:07,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.68 vs. limit=15.0 2023-10-03 19:23:13,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:23:13,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:23:13,930 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.30 vs. limit=12.0 2023-10-03 19:23:15,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 19:23:15,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 19:23:15,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:23:17,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:23:20,012 INFO [train.py:1046] (3/4) Epoch 39, batch 4950, loss[loss=0.1603, simple_loss=0.2376, pruned_loss=0.04146, over 23407.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2363, pruned_loss=0.03913, over 4693438.83 frames. ], batch size: 119, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:23:20,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:23:20,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:23:20,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:23:20,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 19:23:20,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:23:23,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:23:23,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 19:23:27,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 19:23:27,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 19:23:28,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:23:29,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 19:23:29,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:29,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:23:30,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:23:30,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:23:32,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:23:33,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:23:36,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:23:36,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:23:39,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:39,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:23:42,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:23:46,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:47,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:23:49,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:23:50,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:23:52,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:23:53,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 19:23:55,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 19:23:57,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:01,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:24:01,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:24:03,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:24:03,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:24:03,651 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.12 vs. limit=15.0 2023-10-03 19:24:04,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:24:05,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:24:07,184 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.841e+02 1.990e+02 2.204e+02 4.323e+02, threshold=3.980e+02, percent-clipped=1.0 2023-10-03 19:24:07,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:24:08,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:24:10,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:24:10,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:11,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 19:24:12,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:24:15,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:24:19,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:24:20,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:24:20,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:24:20,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:21,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:24:22,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:24:24,079 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.54 vs. limit=22.5 2023-10-03 19:24:24,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:24:24,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:24:24,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:24:26,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 19:24:29,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:24:34,093 INFO [train.py:1046] (3/4) Epoch 39, batch 5000, loss[loss=0.1657, simple_loss=0.2537, pruned_loss=0.03884, over 24579.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2358, pruned_loss=0.03878, over 4691240.30 frames. ], batch size: 71, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:24:34,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 19:24:34,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 19:24:41,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:24:41,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:24:41,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 19:24:43,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 19:24:44,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:24:45,011 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.65 vs. limit=6.0 2023-10-03 19:24:45,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1379073.3333333333, ans=0.125 2023-10-03 19:24:47,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 19:24:47,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:24:47,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:24:48,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 19:24:48,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:24:49,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:24:49,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 19:24:49,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:24:51,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:24:52,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 19:24:53,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 19:24:54,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:24:54,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 19:24:54,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:24:55,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:24:55,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:24:55,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 19:24:55,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 19:24:57,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 19:24:57,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1379140.0, ans=0.125 2023-10-03 19:24:58,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:24:58,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:25:00,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 19:25:00,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:25:02,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:25:03,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:25:05,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 19:25:06,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 19:25:07,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:25:09,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:25:12,180 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 19:25:15,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:25:15,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1379206.6666666667, ans=0.125 2023-10-03 19:25:16,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:25:16,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:20,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 19:25:20,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:25:20,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:25:22,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:25:23,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 19:25:23,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:25:23,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1379273.3333333333, ans=0.0 2023-10-03 19:25:28,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:25:28,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:25:34,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 19:25:39,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:46,986 INFO [train.py:1046] (3/4) Epoch 39, batch 5050, loss[loss=0.1543, simple_loss=0.2342, pruned_loss=0.03726, over 23427.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2365, pruned_loss=0.0388, over 4708310.79 frames. ], batch size: 119, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:25:49,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1379406.6666666667, ans=0.0 2023-10-03 19:25:50,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:25:51,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:51,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:25:51,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:25:51,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:25:51,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:25:53,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:55,145 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.72 vs. limit=10.0 2023-10-03 19:25:55,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:25:55,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 19:25:57,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:26:00,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:26:02,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:26:02,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 19:26:03,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:26:05,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:26:06,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:26:08,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:26:08,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:26:18,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 19:26:18,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:26:19,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:26:19,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 19:26:21,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:26:23,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:26:23,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:26:24,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:26:24,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 19:26:25,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 19:26:25,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:26:27,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:26:31,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:26:31,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 19:26:32,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:26:34,003 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.934e+02 2.073e+02 2.452e+02 3.577e+02, threshold=4.146e+02, percent-clipped=0.0 2023-10-03 19:26:37,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 19:26:38,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:26:38,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:26:39,283 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.31 vs. limit=22.5 2023-10-03 19:26:40,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:26:40,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:26:42,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:26:43,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:26:43,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:26:43,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:26:43,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:26:45,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 19:26:46,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:26:46,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:26:50,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:26:50,619 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 19:26:50,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:26:52,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:26:53,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:26:53,937 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 19:26:55,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:26:55,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 19:26:55,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:27:00,718 INFO [train.py:1046] (3/4) Epoch 39, batch 5100, loss[loss=0.2022, simple_loss=0.2707, pruned_loss=0.06686, over 19572.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2377, pruned_loss=0.03888, over 4708996.53 frames. ], batch size: 388, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:27:00,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:27:00,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:27:00,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 19:27:02,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 19:27:06,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:27:06,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:27:06,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:27:08,882 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 19:27:10,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:27:13,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 19:27:15,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 19:27:16,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:27:17,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:27:19,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:27:19,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 19:27:20,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 19:27:24,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:27:25,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:27:29,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:27:29,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1379873.3333333333, ans=0.125 2023-10-03 19:27:30,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1379873.3333333333, ans=0.125 2023-10-03 19:27:33,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 19:27:33,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:27:34,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:27:35,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1379873.3333333333, ans=0.125 2023-10-03 19:27:36,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 19:27:38,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:27:39,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:27:39,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 19:27:41,027 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 19:27:41,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:27:42,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 19:27:42,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 19:27:45,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:27:52,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:27:53,657 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.16 vs. limit=22.5 2023-10-03 19:27:55,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 19:27:57,135 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 19:27:57,151 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 19:28:00,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 19:28:00,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:28:02,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 19:28:06,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 19:28:07,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 19:28:10,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:28:12,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 19:28:13,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:28:13,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 19:28:15,618 INFO [train.py:1046] (3/4) Epoch 39, batch 5150, loss[loss=0.1706, simple_loss=0.2447, pruned_loss=0.04823, over 23563.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2383, pruned_loss=0.03898, over 4708958.46 frames. ], batch size: 256, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:28:18,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:28:18,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:28:18,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:28:19,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:28:20,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:28:21,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:28:21,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 19:28:21,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 19:28:21,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 19:28:22,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:28:22,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 19:28:23,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:28:24,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 19:28:27,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:28:28,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:28:28,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1380140.0, ans=0.015 2023-10-03 19:28:32,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:28:32,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 19:28:34,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:28:34,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:28:36,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:28:36,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:28:36,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:28:37,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:28:37,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:28:37,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 19:28:40,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:28:40,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:28:42,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1380140.0, ans=0.125 2023-10-03 19:28:43,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:28:45,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 19:28:45,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:28:49,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:28:52,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 19:28:55,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:28:55,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1380206.6666666667, ans=0.2 2023-10-03 19:29:01,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:29:02,720 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.915e+02 2.051e+02 2.368e+02 4.802e+02, threshold=4.101e+02, percent-clipped=1.0 2023-10-03 19:29:02,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:29:05,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:29:05,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:29:06,020 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.71 vs. limit=15.0 2023-10-03 19:29:07,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 19:29:10,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:29:12,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:29:12,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:29:16,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:29:18,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:29:18,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 19:29:23,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:29:24,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:29:25,067 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.86 vs. limit=15.0 2023-10-03 19:29:26,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:29:27,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:29:27,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:29:27,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:29:27,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:29:28,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:29:30,178 INFO [train.py:1046] (3/4) Epoch 39, batch 5200, loss[loss=0.1469, simple_loss=0.2359, pruned_loss=0.02897, over 24489.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2391, pruned_loss=0.03934, over 4698245.72 frames. ], batch size: 63, lr: 2.59e-03, grad_scale: 16.0 2023-10-03 19:29:31,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:29:33,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:29:36,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:29:39,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 19:29:41,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:29:41,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:29:43,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:29:44,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1380473.3333333333, ans=0.0 2023-10-03 19:29:45,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:29:45,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:29:46,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 19:29:49,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:29:51,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:29:52,035 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.53 vs. limit=22.5 2023-10-03 19:29:53,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 19:29:55,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:29:57,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:29:58,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 19:29:58,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 19:30:01,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 19:30:02,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:30:02,517 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 19:30:02,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:30:03,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:04,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.35 vs. limit=15.0 2023-10-03 19:30:05,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:30:07,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 19:30:07,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:30:08,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:30:12,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 19:30:12,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 19:30:12,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 19:30:17,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 19:30:17,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:30:22,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:30:22,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:30:22,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1380606.6666666667, ans=0.0 2023-10-03 19:30:24,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 19:30:24,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:30:24,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 19:30:24,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:25,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:30:28,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1380673.3333333333, ans=0.125 2023-10-03 19:30:29,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:30:29,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1380673.3333333333, ans=0.125 2023-10-03 19:30:30,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:30:31,561 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.59 vs. limit=22.5 2023-10-03 19:30:33,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:30:35,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:30:35,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:40,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:30:42,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 19:30:43,599 INFO [train.py:1046] (3/4) Epoch 39, batch 5250, loss[loss=0.1588, simple_loss=0.2369, pruned_loss=0.04035, over 23487.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2378, pruned_loss=0.03925, over 4693041.01 frames. ], batch size: 134, lr: 2.59e-03, grad_scale: 4.0 2023-10-03 19:30:43,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:30:43,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:30:43,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:30:45,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:30:45,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:30:47,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:30:50,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:30:51,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1380740.0, ans=0.0 2023-10-03 19:30:52,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:30:52,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:30:59,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:31:00,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:31:02,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:31:04,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:31:04,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 19:31:06,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:31:06,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:31:31,833 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.831e+02 2.019e+02 2.223e+02 3.571e+02, threshold=4.037e+02, percent-clipped=0.0 2023-10-03 19:31:41,070 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.82 vs. limit=22.5 2023-10-03 19:31:41,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1381006.6666666667, ans=0.1 2023-10-03 19:31:47,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1381006.6666666667, ans=0.2 2023-10-03 19:31:53,166 INFO [train.py:1046] (3/4) Epoch 39, batch 5300, loss[loss=0.1508, simple_loss=0.2283, pruned_loss=0.03667, over 23635.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.237, pruned_loss=0.03933, over 4686618.73 frames. ], batch size: 149, lr: 2.59e-03, grad_scale: 8.0 2023-10-03 19:32:01,476 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.33 vs. limit=6.0 2023-10-03 19:32:07,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1381140.0, ans=0.1 2023-10-03 19:32:08,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:32:08,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 19:32:08,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 19:32:08,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:32:08,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:08,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:08,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:08,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:32:09,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:09,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:32:09,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:32:09,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:32:09,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 19:32:09,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 19:32:09,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 19:32:10,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:32:10,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 19:32:10,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 19:32:10,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:10,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:32:10,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:32:10,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:32:10,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:32:11,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:32:11,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:32:11,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:11,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:32:11,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:32:11,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:32:11,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:11,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:32:11,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 19:32:11,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:32:12,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:32:12,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 19:32:12,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 19:32:12,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:32:12,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:32:12,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 19:32:12,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 19:32:12,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:32:13,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:32:13,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:32:13,582 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 19:32:13,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 19:32:13,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:32:13,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:32:13,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 19:32:13,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 19:32:13,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 19:32:14,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:32:21,180 INFO [train.py:1046] (3/4) Epoch 40, batch 0, loss[loss=0.2045, simple_loss=0.2677, pruned_loss=0.07063, over 19765.00 frames. ], tot_loss[loss=0.2045, simple_loss=0.2677, pruned_loss=0.07063, over 19765.00 frames. ], batch size: 388, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:32:21,180 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 19:32:32,918 INFO [train.py:1078] (3/4) Epoch 40, validation: loss=0.3547, simple_loss=0.2733, pruned_loss=0.2181, over 1125622.00 frames. 2023-10-03 19:32:32,919 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 19:32:34,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 19:32:34,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:32:37,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:32:37,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1381160.0, ans=0.1 2023-10-03 19:32:41,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:32:41,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:32:41,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:42,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 19:32:44,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 19:32:48,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:48,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:53,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:32:53,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:32:55,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:32:55,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:32:57,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 19:32:58,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:33:05,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1381293.3333333333, ans=0.125 2023-10-03 19:33:07,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:33:07,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:33:09,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 19:33:13,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:33:13,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:33:14,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:33:16,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:33:20,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:33:25,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 19:33:29,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 19:33:29,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:33:29,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:33:30,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:33:30,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:33:30,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1381360.0, ans=0.1 2023-10-03 19:33:33,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 19:33:33,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1381426.6666666667, ans=0.0 2023-10-03 19:33:36,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:33:36,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:33:39,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:33:42,230 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 19:33:44,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:33:45,597 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.14 vs. limit=15.0 2023-10-03 19:33:46,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:33:47,709 INFO [train.py:1046] (3/4) Epoch 40, batch 50, loss[loss=0.2028, simple_loss=0.2718, pruned_loss=0.06692, over 19473.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2377, pruned_loss=0.03804, over 1067127.32 frames. ], batch size: 388, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:33:47,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:33:47,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 19:33:49,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:33:49,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:33:52,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:33:52,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1381493.3333333333, ans=0.125 2023-10-03 19:33:53,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:33:56,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:33:59,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 19:33:59,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:34:02,394 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.31 vs. limit=15.0 2023-10-03 19:34:05,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:34:06,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 19:34:07,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 19:34:09,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:34:10,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:34:10,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:34:11,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:34:13,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:34:13,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:34:13,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:34:19,955 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.937e+02 2.112e+02 2.333e+02 3.745e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 19:34:21,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:34:22,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:34:23,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:34:24,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 19:34:24,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1381626.6666666667, ans=0.05 2023-10-03 19:34:26,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:34:26,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:34:26,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 19:34:26,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:34:29,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 19:34:38,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:34:38,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:34:39,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:34:41,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:34:41,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:34:42,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 19:34:43,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 19:34:45,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:34:45,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:34:45,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.47 vs. limit=15.0 2023-10-03 19:34:46,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:34:48,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:34:48,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 19:34:48,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 19:34:49,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 19:34:50,149 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.16 vs. limit=15.0 2023-10-03 19:34:50,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:34:50,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:34:50,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 19:34:51,350 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.66 vs. limit=6.0 2023-10-03 19:34:52,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 19:34:52,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:34:54,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:34:55,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:34:55,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:34:57,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:35:01,779 INFO [train.py:1046] (3/4) Epoch 40, batch 100, loss[loss=0.1599, simple_loss=0.246, pruned_loss=0.03686, over 23263.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2397, pruned_loss=0.03924, over 1889075.89 frames. ], batch size: 105, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:35:01,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:35:03,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:35:06,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 19:35:06,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:35:12,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:35:12,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:35:12,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:35:12,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:35:12,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:35:13,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 19:35:14,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:35:14,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:35:16,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:35:16,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:35:19,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1381893.3333333333, ans=0.0 2023-10-03 19:35:20,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 19:35:20,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:35:21,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:35:23,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:35:25,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:35:29,836 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 19:35:29,858 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 19:35:31,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1381960.0, ans=0.0 2023-10-03 19:35:32,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:35:32,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:35:37,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:35:38,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:35:40,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:35:44,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:35:46,185 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 19:35:47,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 19:35:50,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:35:51,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:35:54,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:35:54,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1382026.6666666667, ans=0.125 2023-10-03 19:35:57,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:35:59,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:36:01,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:36:03,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:36:05,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:05,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:36:05,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:36:07,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:36:08,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 19:36:08,665 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 19:36:08,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:36:10,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:36:11,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:11,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:36:11,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 19:36:11,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:36:11,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:36:11,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:13,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:13,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:36:14,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:36:14,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:36:16,170 INFO [train.py:1046] (3/4) Epoch 40, batch 150, loss[loss=0.1699, simple_loss=0.2394, pruned_loss=0.05018, over 23771.00 frames. ], tot_loss[loss=0.1604, simple_loss=0.2406, pruned_loss=0.04013, over 2514488.09 frames. ], batch size: 179, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:36:17,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:36:20,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:36:20,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:36:20,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:22,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1382160.0, ans=0.125 2023-10-03 19:36:24,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:36:24,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:26,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:36:28,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:28,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1382160.0, ans=0.0 2023-10-03 19:36:32,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 19:36:32,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 19:36:34,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 19:36:35,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:36:35,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:36:37,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:36:38,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:36:38,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:38,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:40,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:36:41,088 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.41 vs. limit=15.0 2023-10-03 19:36:41,785 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 19:36:45,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:36:49,007 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.890e+02 2.037e+02 2.278e+02 3.667e+02, threshold=4.074e+02, percent-clipped=0.0 2023-10-03 19:36:49,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:36:53,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:36:53,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 19:36:56,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:36:56,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:36:56,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:36:59,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:36:59,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:37:01,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:37:01,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:02,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 19:37:02,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1382360.0, ans=0.0 2023-10-03 19:37:02,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1382360.0, ans=0.125 2023-10-03 19:37:07,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:07,842 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.45 vs. limit=22.5 2023-10-03 19:37:08,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:08,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:37:08,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:37:11,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:11,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1382360.0, ans=0.1 2023-10-03 19:37:13,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 19:37:15,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:37:16,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:37:17,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:37:20,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:37:20,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 19:37:20,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:37:20,301 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 19:37:23,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:37:26,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:37:26,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:37:30,422 INFO [train.py:1046] (3/4) Epoch 40, batch 200, loss[loss=0.1565, simple_loss=0.2476, pruned_loss=0.03274, over 24671.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2403, pruned_loss=0.03986, over 3008784.50 frames. ], batch size: 73, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:37:30,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 19:37:30,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:37:30,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:33,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 19:37:34,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1382493.3333333333, ans=0.0 2023-10-03 19:37:34,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 19:37:36,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:37,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:37:42,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1382493.3333333333, ans=0.125 2023-10-03 19:37:43,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:37:43,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:37:43,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:37:51,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1382560.0, ans=0.0 2023-10-03 19:38:02,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:38:02,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:38:03,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:38:05,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:38:05,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 19:38:05,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:38:08,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:08,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:38:09,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:38:09,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:38:11,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 19:38:12,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 19:38:13,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:17,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:38:23,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:38:26,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.03 vs. limit=15.0 2023-10-03 19:38:28,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:29,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:38:33,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1382760.0, ans=0.125 2023-10-03 19:38:34,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1382760.0, ans=0.125 2023-10-03 19:38:36,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:36,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1382760.0, ans=0.2 2023-10-03 19:38:38,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 19:38:38,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1382760.0, ans=0.125 2023-10-03 19:38:39,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:39,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:38:39,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:38:41,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:38:41,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 19:38:43,792 INFO [train.py:1046] (3/4) Epoch 40, batch 250, loss[loss=0.1371, simple_loss=0.2198, pruned_loss=0.02721, over 24627.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2383, pruned_loss=0.03944, over 3386655.84 frames. ], batch size: 65, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:38:43,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:38:43,861 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 19:38:47,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:48,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:38:48,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:48,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:38:51,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:38:53,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:38:54,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:38:57,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:38:57,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1382893.3333333333, ans=0.125 2023-10-03 19:39:05,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=1382893.3333333333, ans=0.1 2023-10-03 19:39:07,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:39:11,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:39:12,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:39:16,359 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.944e+02 2.127e+02 2.472e+02 3.844e+02, threshold=4.253e+02, percent-clipped=0.0 2023-10-03 19:39:16,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:39:16,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:39:18,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:39:19,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:39:21,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:39:21,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:39:21,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:39:24,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:39:27,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 19:39:27,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:39:28,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:39:28,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:39:28,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:39:29,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:39:30,760 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.86 vs. limit=15.0 2023-10-03 19:39:31,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:39:31,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:39:32,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:39:34,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:39:34,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:39:35,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1383026.6666666667, ans=0.0 2023-10-03 19:39:40,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:39:43,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:39:46,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:39:50,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:39:52,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:39:52,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1383093.3333333333, ans=0.125 2023-10-03 19:39:55,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 19:39:56,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:39:58,081 INFO [train.py:1046] (3/4) Epoch 40, batch 300, loss[loss=0.1415, simple_loss=0.2242, pruned_loss=0.02942, over 24646.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2367, pruned_loss=0.03906, over 3687367.99 frames. ], batch size: 65, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:39:58,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 19:39:59,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 19:39:59,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:40:00,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:40:00,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 19:40:03,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:40:05,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:40:06,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:40:06,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1383160.0, ans=0.125 2023-10-03 19:40:08,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 19:40:09,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:40:11,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:40:11,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 19:40:11,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:40:12,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1383226.6666666667, ans=0.2 2023-10-03 19:40:16,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:40:19,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:40:19,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 19:40:21,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1383226.6666666667, ans=0.0 2023-10-03 19:40:25,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 19:40:25,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:26,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:40:28,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:28,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 19:40:28,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:40:31,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:40:32,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:40:33,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:40:36,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 19:40:36,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 19:40:38,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:40:41,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:44,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 19:40:44,980 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.58 vs. limit=6.0 2023-10-03 19:40:45,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:40:49,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:40:53,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:40:53,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 19:40:57,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:40:57,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:41:00,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:41:01,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:41:01,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 19:41:01,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:41:01,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:41:04,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 19:41:05,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:41:05,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:07,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:41:08,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:08,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:11,610 INFO [train.py:1046] (3/4) Epoch 40, batch 350, loss[loss=0.1446, simple_loss=0.2308, pruned_loss=0.02917, over 24464.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2368, pruned_loss=0.03874, over 3924045.59 frames. ], batch size: 63, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:41:13,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:41:13,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 19:41:16,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:20,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:41:22,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:22,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:27,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 19:41:27,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1383560.0, ans=0.125 2023-10-03 19:41:28,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:41:29,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 19:41:32,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:32,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 19:41:32,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:41:35,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 19:41:37,727 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.25 vs. limit=15.0 2023-10-03 19:41:38,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:41:39,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:41:41,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:41:42,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:41:42,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:41:42,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1383626.6666666667, ans=0.125 2023-10-03 19:41:44,060 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 1.933e+02 2.130e+02 2.384e+02 3.625e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-03 19:41:44,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:41:44,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:45,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:41:46,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:41:47,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:41:54,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:41:54,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:41:54,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:41:54,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:41:54,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1383693.3333333333, ans=0.125 2023-10-03 19:42:00,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 19:42:00,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:42:03,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:03,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:42:04,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:42:06,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 19:42:07,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1383693.3333333333, ans=0.125 2023-10-03 19:42:08,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:10,183 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 19:42:12,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 19:42:12,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:14,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:42:14,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 19:42:16,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:19,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:42:21,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:23,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:23,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:42:23,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1383826.6666666667, ans=0.125 2023-10-03 19:42:24,936 INFO [train.py:1046] (3/4) Epoch 40, batch 400, loss[loss=0.1642, simple_loss=0.2559, pruned_loss=0.03625, over 24523.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2359, pruned_loss=0.03854, over 4098533.44 frames. ], batch size: 71, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:42:25,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:42:28,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:42:29,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:42:29,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 19:42:29,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:31,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:42:32,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:42:34,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:35,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:37,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:38,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 19:42:40,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 19:42:40,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:42:41,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 19:42:41,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1383893.3333333333, ans=0.0 2023-10-03 19:42:42,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:43,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1383893.3333333333, ans=0.09899494936611666 2023-10-03 19:42:45,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:42:45,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:45,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 19:42:47,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:42:47,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:42:47,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:42:47,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:42:50,519 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 19:42:50,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 19:42:53,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1383960.0, ans=0.2 2023-10-03 19:42:56,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:42:57,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:42:57,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 19:42:57,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 19:43:00,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:43:05,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:43:11,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 19:43:14,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:43:15,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 19:43:17,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:43:18,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:43:18,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 19:43:20,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:43:21,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1384026.6666666667, ans=0.1 2023-10-03 19:43:24,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:43:25,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:43:26,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:43:26,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 19:43:28,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 19:43:29,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 19:43:34,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:43:34,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:43:35,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 19:43:38,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:43:38,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:43:39,822 INFO [train.py:1046] (3/4) Epoch 40, batch 450, loss[loss=0.1586, simple_loss=0.2479, pruned_loss=0.03461, over 24447.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2367, pruned_loss=0.03842, over 4226402.47 frames. ], batch size: 69, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:43:39,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 19:43:39,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 19:43:41,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:43:41,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:43:43,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:43:43,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 19:43:43,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:43:43,696 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.46 vs. limit=15.0 2023-10-03 19:43:44,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 19:43:47,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:43:56,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:43:57,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:43:59,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 19:44:00,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 19:44:02,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:44:05,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:44:06,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:44:08,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:44:10,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:44:12,666 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.927e+02 2.083e+02 2.312e+02 3.254e+02, threshold=4.166e+02, percent-clipped=0.0 2023-10-03 19:44:12,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 19:44:14,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 19:44:15,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 19:44:16,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:44:16,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:44:18,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:44:19,844 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 19:44:19,859 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 19:44:19,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:44:20,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1384293.3333333333, ans=0.0 2023-10-03 19:44:22,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1384293.3333333333, ans=0.125 2023-10-03 19:44:23,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:44:24,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 19:44:27,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:44:27,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:44:29,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 19:44:29,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 19:44:29,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1384360.0, ans=0.125 2023-10-03 19:44:31,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:44:34,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:44:34,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:44:37,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 19:44:42,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:44:43,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 19:44:43,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 19:44:45,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 19:44:47,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1384426.6666666667, ans=0.07 2023-10-03 19:44:48,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1384426.6666666667, ans=0.0 2023-10-03 19:44:49,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:44:49,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1384426.6666666667, ans=0.2 2023-10-03 19:44:50,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:44:50,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:44:50,888 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 19:44:51,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1384426.6666666667, ans=0.2 2023-10-03 19:44:53,576 INFO [train.py:1046] (3/4) Epoch 40, batch 500, loss[loss=0.1404, simple_loss=0.2233, pruned_loss=0.02872, over 24676.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2375, pruned_loss=0.03846, over 4345256.72 frames. ], batch size: 65, lr: 2.56e-03, grad_scale: 16.0 2023-10-03 19:44:53,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:44:55,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:44:55,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:44:57,013 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 19:44:58,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 19:44:58,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:45:00,458 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.08 vs. limit=15.0 2023-10-03 19:45:01,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:45:03,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 19:45:05,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:45:08,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:45:08,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:45:08,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:18,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:45:18,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 19:45:19,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 19:45:20,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:45:20,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 19:45:20,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1384560.0, ans=0.05 2023-10-03 19:45:20,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1384560.0, ans=0.125 2023-10-03 19:45:21,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 19:45:24,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:45:26,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:45:26,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:45:28,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:45:28,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 19:45:30,953 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 19:45:33,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:45:35,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:35,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1384626.6666666667, ans=0.0 2023-10-03 19:45:36,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:36,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:36,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:45:37,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 19:45:41,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:45:42,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:45:45,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:45:46,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:45:49,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1384693.3333333333, ans=0.125 2023-10-03 19:45:52,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:45:56,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 19:45:56,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:45:56,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:45:58,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=1384760.0, ans=10.0 2023-10-03 19:46:00,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 19:46:00,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 19:46:02,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:46:06,878 INFO [train.py:1046] (3/4) Epoch 40, batch 550, loss[loss=0.1364, simple_loss=0.2164, pruned_loss=0.02823, over 24584.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2379, pruned_loss=0.03838, over 4430623.72 frames. ], batch size: 60, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:46:06,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 19:46:08,934 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.42 vs. limit=15.0 2023-10-03 19:46:10,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 19:46:10,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:46:10,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 19:46:10,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:46:11,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:46:11,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1384826.6666666667, ans=0.1 2023-10-03 19:46:13,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:14,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:14,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:46:15,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:46:17,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:46:19,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 19:46:19,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:46:24,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:46:24,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:27,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:46:28,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1384893.3333333333, ans=0.1 2023-10-03 19:46:29,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:33,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 19:46:33,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 19:46:35,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:46:35,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1384960.0, ans=0.1 2023-10-03 19:46:40,536 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.820e+02 1.987e+02 2.259e+02 2.913e+02, threshold=3.975e+02, percent-clipped=0.0 2023-10-03 19:46:42,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:46:42,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:46:42,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1384960.0, ans=0.025 2023-10-03 19:46:44,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:46:46,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:46:46,857 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 19:46:46,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:46:48,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 19:46:50,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1385026.6666666667, ans=0.1 2023-10-03 19:46:51,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:46:52,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 19:46:52,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:46:53,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:46:54,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 19:46:54,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 19:46:55,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:46:55,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:46:56,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1385026.6666666667, ans=0.125 2023-10-03 19:46:57,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:46:57,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:46:57,990 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:47:00,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:47:00,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:47:02,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1385026.6666666667, ans=0.125 2023-10-03 19:47:03,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:47:03,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1385026.6666666667, ans=0.95 2023-10-03 19:47:05,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:06,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 19:47:06,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:47:07,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:47:07,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:47:09,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:10,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 19:47:10,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 19:47:18,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 19:47:19,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 19:47:21,136 INFO [train.py:1046] (3/4) Epoch 40, batch 600, loss[loss=0.1529, simple_loss=0.2373, pruned_loss=0.03422, over 24475.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2375, pruned_loss=0.03794, over 4497433.01 frames. ], batch size: 63, lr: 2.56e-03, grad_scale: 8.0 2023-10-03 19:47:21,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:47:21,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:47:21,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:47:22,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1385160.0, ans=0.2 2023-10-03 19:47:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:47:31,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 19:47:32,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 19:47:33,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1385160.0, ans=0.0 2023-10-03 19:47:34,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:47:37,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:47:38,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:40,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 19:47:40,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:47:46,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 19:47:48,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1385226.6666666667, ans=0.1 2023-10-03 19:47:49,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:47:49,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:47:49,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:47:55,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:47:56,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:47:56,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:48:02,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:48:07,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:48:07,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:48:07,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:48:07,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1385360.0, ans=10.0 2023-10-03 19:48:14,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 19:48:20,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:48:20,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:48:23,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 19:48:25,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:48:26,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 19:48:26,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:48:27,666 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.53 vs. limit=22.5 2023-10-03 19:48:28,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:48:33,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 19:48:35,765 INFO [train.py:1046] (3/4) Epoch 40, batch 650, loss[loss=0.178, simple_loss=0.255, pruned_loss=0.05053, over 23170.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2368, pruned_loss=0.03789, over 4548182.00 frames. ], batch size: 105, lr: 2.55e-03, grad_scale: 4.0 2023-10-03 19:48:35,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 19:48:37,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:48:37,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1385493.3333333333, ans=0.125 2023-10-03 19:48:38,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:48:39,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:48:43,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 19:48:43,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1385493.3333333333, ans=0.0 2023-10-03 19:48:44,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:48:45,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1385493.3333333333, ans=0.2 2023-10-03 19:48:50,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:48:50,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:48:53,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:48:56,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 19:48:58,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:48:59,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:49:02,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:49:02,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 19:49:05,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:49:07,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:07,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:49:07,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:08,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 19:49:09,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 19:49:11,451 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.985e+02 2.192e+02 2.479e+02 3.880e+02, threshold=4.384e+02, percent-clipped=0.0 2023-10-03 19:49:11,544 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 19:49:11,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:49:11,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:49:14,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:14,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:49:16,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:49:16,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 19:49:16,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1385626.6666666667, ans=0.07 2023-10-03 19:49:17,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 19:49:19,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:49:20,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 19:49:20,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 19:49:21,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:49:23,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 19:49:24,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 19:49:24,956 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.20 vs. limit=12.0 2023-10-03 19:49:26,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 19:49:26,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:26,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:49:26,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:49:26,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:49:28,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:49:29,690 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.66 vs. limit=15.0 2023-10-03 19:49:33,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:33,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:49:34,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:49:37,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:49:37,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 19:49:38,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:49:47,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:49:47,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:49:47,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:49:48,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:49:49,723 INFO [train.py:1046] (3/4) Epoch 40, batch 700, loss[loss=0.146, simple_loss=0.2202, pruned_loss=0.03587, over 23714.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2355, pruned_loss=0.03791, over 4569113.37 frames. ], batch size: 164, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:49:52,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 19:49:52,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 19:49:55,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 19:49:55,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:49:57,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:49:58,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 19:50:02,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:50:04,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:50:07,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:50:08,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 19:50:08,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:50:10,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:50:13,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 19:50:13,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:50:16,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 19:50:18,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1385960.0, ans=0.0 2023-10-03 19:50:19,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 19:50:21,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 19:50:21,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:50:23,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:50:28,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:50:29,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 19:50:34,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:50:35,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:50:35,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 19:50:39,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:50:39,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:50:42,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:50:48,414 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.66 vs. limit=22.5 2023-10-03 19:50:49,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:50:50,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 19:50:53,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 19:50:53,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 19:50:57,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:50:59,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:50:59,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:51:00,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:51:00,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 19:51:03,469 INFO [train.py:1046] (3/4) Epoch 40, batch 750, loss[loss=0.1598, simple_loss=0.2329, pruned_loss=0.04335, over 23467.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2357, pruned_loss=0.0377, over 4606746.52 frames. ], batch size: 285, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:51:05,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 19:51:05,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 19:51:06,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 19:51:06,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 19:51:07,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 19:51:07,508 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.76 vs. limit=10.0 2023-10-03 19:51:08,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:51:08,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 19:51:09,125 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.46 vs. limit=22.5 2023-10-03 19:51:09,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:51:11,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:51:12,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:51:14,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:51:15,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 19:51:15,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:51:18,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:51:18,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:51:20,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:51:22,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1386226.6666666667, ans=0.125 2023-10-03 19:51:23,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:51:24,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:51:25,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 19:51:26,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 19:51:26,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:51:29,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:51:29,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 19:51:30,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 19:51:32,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:51:33,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 19:51:34,835 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 19:51:34,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 19:51:34,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:51:34,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 19:51:38,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 19:51:38,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1386293.3333333333, ans=0.05 2023-10-03 19:51:39,314 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.912e+02 2.088e+02 2.370e+02 3.919e+02, threshold=4.175e+02, percent-clipped=0.0 2023-10-03 19:51:43,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:51:44,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:51:44,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:51:47,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:51:48,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:51:49,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 19:51:49,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:51:50,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1386360.0, ans=0.0 2023-10-03 19:51:51,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 19:51:53,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:51:54,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:51:54,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 19:51:56,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:52:01,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:04,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:52:04,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:04,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1386426.6666666667, ans=0.1 2023-10-03 19:52:05,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:52:07,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.35 vs. limit=15.0 2023-10-03 19:52:07,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=1386426.6666666667, ans=15.0 2023-10-03 19:52:08,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 19:52:10,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:52:10,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:52:14,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:52:14,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1386426.6666666667, ans=0.1 2023-10-03 19:52:16,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:52:18,020 INFO [train.py:1046] (3/4) Epoch 40, batch 800, loss[loss=0.142, simple_loss=0.2188, pruned_loss=0.03259, over 24534.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2365, pruned_loss=0.03781, over 4629764.97 frames. ], batch size: 60, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:52:19,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:52:19,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 19:52:28,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:52:28,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:29,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:52:29,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:52:31,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:31,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:33,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:35,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:37,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:52:40,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 19:52:40,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:41,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:52:41,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:52:41,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:52:41,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 19:52:43,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:43,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 19:52:45,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:52:46,606 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.90 vs. limit=15.0 2023-10-03 19:52:47,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:52:49,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:52:49,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:52:52,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:52:52,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:53:00,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:53:00,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:53:00,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 19:53:04,150 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 19:53:04,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 19:53:04,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 19:53:04,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:53:06,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:53:08,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:53:11,083 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 19:53:11,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 19:53:12,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 19:53:15,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 19:53:18,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:53:20,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:53:21,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 19:53:23,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:53:24,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 19:53:30,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:53:35,555 INFO [train.py:1046] (3/4) Epoch 40, batch 850, loss[loss=0.1613, simple_loss=0.2469, pruned_loss=0.03782, over 24574.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2372, pruned_loss=0.0382, over 4646771.14 frames. ], batch size: 71, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:53:35,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:53:37,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 19:53:37,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:53:38,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:53:38,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 19:53:39,262 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.74 vs. limit=22.5 2023-10-03 19:53:39,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:53:40,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1386826.6666666667, ans=0.07 2023-10-03 19:53:41,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:53:42,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:53:43,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:53:45,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:53:46,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 19:53:46,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 19:53:46,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 19:53:48,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:53:50,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:53:51,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:53:51,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:53:51,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 19:53:55,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1386893.3333333333, ans=0.0 2023-10-03 19:53:56,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:53:56,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:53:57,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 19:54:01,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 19:54:05,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:54:07,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 19:54:09,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 19:54:11,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 19:54:12,668 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.964e+02 2.128e+02 2.466e+02 3.367e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-03 19:54:12,844 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 19:54:13,697 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.68 vs. limit=15.0 2023-10-03 19:54:14,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:54:14,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:54:14,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 19:54:15,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:54:17,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:54:18,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 19:54:20,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 19:54:20,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:54:21,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:54:23,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 19:54:24,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 19:54:26,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 19:54:26,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 19:54:29,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 19:54:29,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:54:30,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 19:54:30,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:54:32,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:54:33,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:54:35,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1387093.3333333333, ans=0.0 2023-10-03 19:54:36,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 19:54:36,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:54:36,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:54:38,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 19:54:45,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 19:54:46,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:54:47,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 19:54:47,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:54:48,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:54:50,062 INFO [train.py:1046] (3/4) Epoch 40, batch 900, loss[loss=0.1491, simple_loss=0.2282, pruned_loss=0.03499, over 23459.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2378, pruned_loss=0.03867, over 4657962.26 frames. ], batch size: 119, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:54:50,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 19:54:57,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:54:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:55:00,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 19:55:00,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1387160.0, ans=0.1 2023-10-03 19:55:03,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:55:03,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 19:55:03,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 19:55:03,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1387226.6666666667, ans=0.125 2023-10-03 19:55:05,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:55:05,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:55:05,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 19:55:06,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:55:11,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1387226.6666666667, ans=0.0 2023-10-03 19:55:13,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1387226.6666666667, ans=0.125 2023-10-03 19:55:16,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:55:16,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:55:16,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 19:55:19,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.23 vs. limit=22.5 2023-10-03 19:55:19,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:55:19,924 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 19:55:24,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 19:55:25,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:55:29,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 19:55:30,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 19:55:30,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1387293.3333333333, ans=0.125 2023-10-03 19:55:32,043 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 19:55:33,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 19:55:37,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 19:55:37,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:55:39,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 19:55:43,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:55:44,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:55:45,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 19:55:45,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:55:48,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 19:55:50,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:55:50,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:55:51,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:55:51,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:55:52,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.24 vs. limit=15.0 2023-10-03 19:55:56,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 19:55:56,231 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 19:55:57,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 19:55:57,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1387426.6666666667, ans=0.2 2023-10-03 19:55:58,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 19:56:00,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:56:03,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 19:56:05,007 INFO [train.py:1046] (3/4) Epoch 40, batch 950, loss[loss=0.1725, simple_loss=0.2579, pruned_loss=0.04352, over 24364.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2379, pruned_loss=0.03873, over 4659511.93 frames. ], batch size: 77, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:56:06,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1387493.3333333333, ans=0.125 2023-10-03 19:56:07,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:56:11,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:12,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:12,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 19:56:13,885 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 19:56:14,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1387493.3333333333, ans=0.1 2023-10-03 19:56:16,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:18,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:56:19,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:56:19,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:56:19,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 19:56:21,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 19:56:22,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:24,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 19:56:26,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:56:27,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1387560.0, ans=0.125 2023-10-03 19:56:30,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:30,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:56:30,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:56:31,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 19:56:34,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 19:56:34,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:56:35,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:56:41,687 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 1.970e+02 2.172e+02 2.506e+02 3.661e+02, threshold=4.343e+02, percent-clipped=0.0 2023-10-03 19:56:41,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:56:41,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:56:44,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 19:56:44,594 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.09 vs. limit=10.0 2023-10-03 19:56:47,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 19:56:47,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 19:56:48,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:56:48,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:48,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 19:56:52,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 19:56:54,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:56:55,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:56:57,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:56:57,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 19:56:57,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:56:57,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 19:56:58,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 19:57:01,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 19:57:05,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:57:10,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:57:11,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 19:57:11,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 19:57:13,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:57:15,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1387760.0, ans=0.5 2023-10-03 19:57:19,403 INFO [train.py:1046] (3/4) Epoch 40, batch 1000, loss[loss=0.1756, simple_loss=0.2445, pruned_loss=0.05335, over 23732.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2365, pruned_loss=0.03876, over 4662632.69 frames. ], batch size: 164, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:57:20,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 19:57:20,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:57:25,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:57:25,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 19:57:25,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 19:57:31,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:57:31,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 19:57:32,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:57:34,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 19:57:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 19:57:40,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 19:57:40,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:57:43,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 19:57:45,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 19:57:45,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 19:57:46,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:57:47,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.21 vs. limit=15.0 2023-10-03 19:57:47,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:57:48,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1387960.0, ans=0.0 2023-10-03 19:57:51,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1387960.0, ans=0.125 2023-10-03 19:57:57,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:57:57,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 19:57:58,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:57:58,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:57:58,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 19:57:58,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:57:59,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 19:57:59,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:58:01,251 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 19:58:05,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 19:58:06,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 19:58:08,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 19:58:09,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 19:58:14,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:58:14,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 19:58:14,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:58:16,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 19:58:17,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 19:58:20,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 19:58:20,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 19:58:22,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 19:58:23,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:58:23,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 19:58:26,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:58:28,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 19:58:30,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:58:32,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 19:58:32,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 19:58:33,948 INFO [train.py:1046] (3/4) Epoch 40, batch 1050, loss[loss=0.1481, simple_loss=0.2239, pruned_loss=0.03612, over 20756.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2347, pruned_loss=0.03831, over 4672457.97 frames. ], batch size: 45, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:58:35,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 19:58:36,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:58:39,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:58:42,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 19:58:43,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1388160.0, ans=0.0 2023-10-03 19:58:44,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 19:58:46,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 19:58:48,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 19:58:48,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 19:58:48,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 19:58:50,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 19:58:52,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:58:52,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 19:58:54,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:58:56,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 19:58:56,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 19:58:56,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1388226.6666666667, ans=0.0 2023-10-03 19:59:01,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:59:01,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 19:59:01,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 19:59:04,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 19:59:04,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 19:59:04,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 19:59:09,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 19:59:09,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1388293.3333333333, ans=0.1 2023-10-03 19:59:10,269 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.893e+02 2.073e+02 2.289e+02 3.386e+02, threshold=4.145e+02, percent-clipped=0.0 2023-10-03 19:59:10,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 19:59:11,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:59:15,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 19:59:18,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 19:59:18,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 19:59:19,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 19:59:23,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 19:59:24,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1388360.0, ans=0.0 2023-10-03 19:59:26,661 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.99 vs. limit=12.0 2023-10-03 19:59:27,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 19:59:28,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 19:59:28,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 19:59:28,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:59:30,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 19:59:31,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 19:59:34,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 19:59:36,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 19:59:36,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 19:59:37,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:59:37,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:59:40,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 19:59:40,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 19:59:43,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 19:59:43,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 19:59:43,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 19:59:44,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 19:59:46,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1388493.3333333333, ans=0.1 2023-10-03 19:59:47,896 INFO [train.py:1046] (3/4) Epoch 40, batch 1100, loss[loss=0.1529, simple_loss=0.2443, pruned_loss=0.03072, over 24500.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2345, pruned_loss=0.03801, over 4694533.89 frames. ], batch size: 66, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 19:59:48,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1388493.3333333333, ans=0.0 2023-10-03 19:59:49,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 19:59:54,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 19:59:57,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:00:00,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:00:00,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:00:00,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1388493.3333333333, ans=0.125 2023-10-03 20:00:01,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 20:00:01,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:00:04,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 20:00:06,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:00:06,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1388560.0, ans=0.1 2023-10-03 20:00:07,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:00:09,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 20:00:10,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:00:10,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:00:10,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:00:14,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:00:16,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:00:20,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:00:22,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1388626.6666666667, ans=0.125 2023-10-03 20:00:24,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 20:00:25,896 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 20:00:27,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:28,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:31,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:00:31,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:00:34,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 20:00:35,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:00:35,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:00:35,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:00:35,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:35,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 20:00:35,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1388693.3333333333, ans=0.1 2023-10-03 20:00:41,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:00:41,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 20:00:44,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:00:46,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:00:47,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1388760.0, ans=0.1 2023-10-03 20:00:50,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 20:00:50,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:00:51,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:00:53,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:00:53,774 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:00:55,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:00:55,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 20:00:55,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:00:55,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:00:58,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 20:00:58,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:00:58,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 20:00:59,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:00:59,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:01:01,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:01:02,415 INFO [train.py:1046] (3/4) Epoch 40, batch 1150, loss[loss=0.1456, simple_loss=0.225, pruned_loss=0.03311, over 24331.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.236, pruned_loss=0.03864, over 4703771.81 frames. ], batch size: 61, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:01:05,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1388826.6666666667, ans=0.0 2023-10-03 20:01:06,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:01:09,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:01:11,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:01:11,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:01:11,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 20:01:12,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:01:14,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 20:01:14,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1388826.6666666667, ans=0.95 2023-10-03 20:01:15,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:01:15,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:01:18,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1388893.3333333333, ans=0.0 2023-10-03 20:01:21,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 20:01:25,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:01:29,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:01:30,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:01:30,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 20:01:30,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:01:30,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:01:35,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 20:01:36,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:01:36,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:01:39,029 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.015e+02 2.225e+02 2.485e+02 5.014e+02, threshold=4.451e+02, percent-clipped=1.0 2023-10-03 20:01:45,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:01:46,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1389026.6666666667, ans=0.0 2023-10-03 20:01:51,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:01:51,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 20:01:52,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:01:52,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:01:57,344 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 20:01:59,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:02:01,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1389093.3333333333, ans=0.125 2023-10-03 20:02:04,191 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.50 vs. limit=10.0 2023-10-03 20:02:04,954 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 20:02:06,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1389093.3333333333, ans=0.1 2023-10-03 20:02:09,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:02:09,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:02:10,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:02:10,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:02:13,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:02:14,024 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:02:16,509 INFO [train.py:1046] (3/4) Epoch 40, batch 1200, loss[loss=0.1591, simple_loss=0.2333, pruned_loss=0.04245, over 23830.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2368, pruned_loss=0.03885, over 4703680.97 frames. ], batch size: 164, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:02:18,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:02:18,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:02:21,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:02:21,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:02:22,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:02:25,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:02:27,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:02:28,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:02:28,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:02:31,281 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.75 vs. limit=15.0 2023-10-03 20:02:31,861 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 20:02:34,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 20:02:36,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:02:36,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1389226.6666666667, ans=0.0 2023-10-03 20:02:36,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1389226.6666666667, ans=0.2 2023-10-03 20:02:37,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:02:39,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1389226.6666666667, ans=0.0 2023-10-03 20:02:40,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:02:41,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:02:41,808 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 20:02:43,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:02:45,537 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:02:46,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1389293.3333333333, ans=0.05 2023-10-03 20:02:49,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:02:49,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:02:49,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 20:02:50,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:02:54,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 20:03:00,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 20:03:00,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:03:02,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:03:03,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:03:04,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:03:06,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:03:06,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:03:06,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:03:07,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 20:03:07,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:03:08,205 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.22 vs. limit=15.0 2023-10-03 20:03:09,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:03:09,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:03:10,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1389360.0, ans=0.0 2023-10-03 20:03:11,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:03:11,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:03:16,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:03:17,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:03:20,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 20:03:25,513 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 20:03:26,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:03:28,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:03:30,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:03:31,679 INFO [train.py:1046] (3/4) Epoch 40, batch 1250, loss[loss=0.1519, simple_loss=0.2241, pruned_loss=0.03986, over 23761.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2374, pruned_loss=0.03907, over 4707233.16 frames. ], batch size: 135, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:03:31,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:03:35,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 20:03:37,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:03:39,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:03:40,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 20:03:42,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:03:42,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:03:48,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:03:48,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:03:49,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:03:49,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:03:52,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:03:55,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:03:55,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:03:55,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:03:59,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:03:59,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:01,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:04:02,207 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.31 vs. limit=15.0 2023-10-03 20:04:03,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:04:09,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 20:04:09,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:04:10,318 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.900e+02 2.073e+02 2.356e+02 3.253e+02, threshold=4.146e+02, percent-clipped=0.0 2023-10-03 20:04:13,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:04:13,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 20:04:14,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:04:14,646 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 20:04:14,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:14,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:18,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:04:22,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:04:23,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:04:23,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 20:04:23,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 20:04:24,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 20:04:27,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:04:29,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 20:04:29,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:04:31,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 20:04:32,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:04:34,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 20:04:34,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:04:34,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:04:34,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:04:35,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:04:38,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 20:04:40,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:04:40,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1389760.0, ans=0.0 2023-10-03 20:04:41,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:04:43,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:04:45,735 INFO [train.py:1046] (3/4) Epoch 40, batch 1300, loss[loss=0.1512, simple_loss=0.235, pruned_loss=0.03368, over 24283.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2379, pruned_loss=0.03914, over 4698465.19 frames. ], batch size: 61, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:04:45,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:04:47,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:04:49,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 20:04:53,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1389826.6666666667, ans=0.2 2023-10-03 20:04:54,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:04:55,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:04:57,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:04:58,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:05:00,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:05:00,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 20:05:01,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1389893.3333333333, ans=0.125 2023-10-03 20:05:06,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:05:06,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:05:09,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 20:05:12,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:05:16,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:05:16,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:05:18,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:05:19,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:05:19,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:05:21,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:05:22,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 20:05:27,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:05:27,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:05:29,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 20:05:29,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 20:05:31,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:05:34,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:05:34,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 20:05:35,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:05:35,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 20:05:37,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:05:41,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:05:41,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:05:42,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 20:05:44,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 20:05:44,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 20:05:50,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:05:53,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 20:05:53,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1390093.3333333333, ans=0.1 2023-10-03 20:05:54,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:05:59,088 INFO [train.py:1046] (3/4) Epoch 40, batch 1350, loss[loss=0.1761, simple_loss=0.2607, pruned_loss=0.04582, over 24377.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2373, pruned_loss=0.03903, over 4707606.30 frames. ], batch size: 77, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:06:00,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 20:06:04,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:06:05,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:07,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:06:08,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:06:10,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:06:11,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:06:16,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:06:17,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 20:06:17,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:06:17,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:06:20,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 20:06:22,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:06:23,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:06:23,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 20:06:24,406 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.50 vs. limit=15.0 2023-10-03 20:06:26,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 20:06:26,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 20:06:27,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:27,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 20:06:27,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1390293.3333333333, ans=0.125 2023-10-03 20:06:38,029 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.449e+02 1.913e+02 2.157e+02 2.390e+02 3.072e+02, threshold=4.314e+02, percent-clipped=0.0 2023-10-03 20:06:38,367 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:06:39,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:45,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1390360.0, ans=0.1 2023-10-03 20:06:49,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:06:49,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:06:51,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 20:06:54,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:06:54,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 20:06:54,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:06:55,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:06:58,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:07:02,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 20:07:04,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:07:07,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 20:07:10,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 20:07:13,511 INFO [train.py:1046] (3/4) Epoch 40, batch 1400, loss[loss=0.1629, simple_loss=0.2327, pruned_loss=0.04653, over 23762.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2356, pruned_loss=0.03848, over 4699935.99 frames. ], batch size: 164, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:07:16,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 20:07:18,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:07:21,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:07:22,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:07:26,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 20:07:28,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 20:07:36,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:07:39,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:07:40,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:07:40,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:07:44,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:07:46,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 20:07:54,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:07:55,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:07:56,197 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.76 vs. limit=15.0 2023-10-03 20:07:58,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1390693.3333333333, ans=0.025 2023-10-03 20:07:59,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 20:08:00,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:08:01,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:08:02,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:08:02,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:08:04,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:08:04,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:08:04,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:08:06,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 20:08:06,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:08:06,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1390693.3333333333, ans=0.125 2023-10-03 20:08:10,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:10,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1390693.3333333333, ans=0.0 2023-10-03 20:08:13,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:08:15,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1390760.0, ans=0.1 2023-10-03 20:08:22,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 20:08:22,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 20:08:23,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:08:23,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1390760.0, ans=0.1 2023-10-03 20:08:26,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 20:08:26,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:08:27,594 INFO [train.py:1046] (3/4) Epoch 40, batch 1450, loss[loss=0.1367, simple_loss=0.2175, pruned_loss=0.02791, over 21947.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2348, pruned_loss=0.03822, over 4696069.67 frames. ], batch size: 48, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:08:29,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:08:33,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:08:35,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:08:35,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:36,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 20:08:41,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:08:41,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:08:42,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:08:42,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 20:08:43,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:08:45,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 20:08:45,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:46,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:08:46,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 20:08:47,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:08:47,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:08:49,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 20:08:49,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:08:50,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:08:53,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:08:55,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:08:59,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:08:59,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:09:01,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:09:01,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:09:04,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:09:04,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:09:04,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:09:05,923 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.904e+02 2.065e+02 2.334e+02 4.319e+02, threshold=4.131e+02, percent-clipped=1.0 2023-10-03 20:09:05,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:09:06,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1390960.0, ans=10.0 2023-10-03 20:09:09,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 20:09:09,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1390960.0, ans=0.2 2023-10-03 20:09:10,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:09:12,177 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 20:09:14,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:09:16,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:09:17,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:09:18,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 20:09:18,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1391026.6666666667, ans=0.1 2023-10-03 20:09:22,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:09:23,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 20:09:24,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 20:09:26,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:09:28,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:09:28,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:09:29,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 20:09:32,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 20:09:34,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 20:09:36,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:09:37,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:09:41,642 INFO [train.py:1046] (3/4) Epoch 40, batch 1500, loss[loss=0.1591, simple_loss=0.2472, pruned_loss=0.03546, over 24654.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2354, pruned_loss=0.03847, over 4706620.59 frames. ], batch size: 68, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:09:41,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1391160.0, ans=0.125 2023-10-03 20:09:46,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.43 vs. limit=22.5 2023-10-03 20:09:47,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 20:09:47,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:09:47,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:09:48,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:09:48,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:09:49,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:09:51,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 20:09:51,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:09:53,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:09:53,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:09:53,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:09:53,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1391160.0, ans=0.0 2023-10-03 20:09:53,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1391160.0, ans=0.1 2023-10-03 20:09:56,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:09:57,006 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.42 vs. limit=15.0 2023-10-03 20:09:57,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:03,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:03,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 20:10:05,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:10:05,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:10:07,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:10:09,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.70 vs. limit=10.0 2023-10-03 20:10:09,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 20:10:14,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 20:10:15,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:10:16,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 20:10:18,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:10:20,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:10:22,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:10:22,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:10:23,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 20:10:23,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:10:23,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:10:25,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 20:10:25,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:10:31,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:10:31,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 20:10:35,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:10:36,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:10:39,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=1391426.6666666667, ans=0.1 2023-10-03 20:10:40,942 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 20:10:41,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:41,009 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 20:10:42,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:10:43,078 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.80 vs. limit=15.0 2023-10-03 20:10:43,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:10:43,872 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 20:10:45,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:10:45,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1391426.6666666667, ans=0.125 2023-10-03 20:10:47,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 20:10:49,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:51,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:51,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:51,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:10:53,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:10:53,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:10:54,972 INFO [train.py:1046] (3/4) Epoch 40, batch 1550, loss[loss=0.16, simple_loss=0.2509, pruned_loss=0.03461, over 24301.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2361, pruned_loss=0.03858, over 4723575.94 frames. ], batch size: 74, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:10:55,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 20:10:56,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 20:10:56,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:10:56,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1391493.3333333333, ans=0.1 2023-10-03 20:10:58,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 20:10:58,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 20:10:59,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:11:01,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:11:02,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:11:02,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:11:04,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1391493.3333333333, ans=0.125 2023-10-03 20:11:05,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:11:05,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:11:09,136 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 20:11:10,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:10,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:11:10,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:11:11,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:11:11,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 20:11:14,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:11:14,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 20:11:17,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 20:11:17,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 20:11:17,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:18,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:11:19,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.30 vs. limit=15.0 2023-10-03 20:11:22,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:11:24,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 20:11:24,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 20:11:33,600 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.912e+02 2.147e+02 2.431e+02 4.744e+02, threshold=4.295e+02, percent-clipped=1.0 2023-10-03 20:11:33,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:11:39,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:11:39,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:11:39,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:11:39,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 20:11:45,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:11:46,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:49,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:11:52,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:11:52,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:11:52,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 20:11:52,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:11:54,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:11:54,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:11:54,706 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.21 vs. limit=15.0 2023-10-03 20:11:55,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 20:11:55,615 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 20:11:55,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1391760.0, ans=0.1 2023-10-03 20:11:57,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:12:03,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 20:12:09,091 INFO [train.py:1046] (3/4) Epoch 40, batch 1600, loss[loss=0.2126, simple_loss=0.276, pruned_loss=0.07457, over 19247.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2375, pruned_loss=0.03892, over 4719189.02 frames. ], batch size: 388, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:12:09,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:12:09,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:12:09,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 20:12:11,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:12:11,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1391826.6666666667, ans=0.07 2023-10-03 20:12:12,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:12:12,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:12:12,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:12:12,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1391826.6666666667, ans=0.125 2023-10-03 20:12:12,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1391826.6666666667, ans=0.2 2023-10-03 20:12:13,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:12:14,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1391826.6666666667, ans=0.1 2023-10-03 20:12:17,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:12:17,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 20:12:17,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 20:12:19,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 20:12:20,456 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.19 vs. limit=15.0 2023-10-03 20:12:21,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:12:22,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 20:12:22,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:12:25,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:12:29,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:12:34,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 20:12:36,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:12:38,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 20:12:38,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:12:40,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 20:12:43,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1391960.0, ans=0.125 2023-10-03 20:12:45,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 20:12:48,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1391960.0, ans=0.0 2023-10-03 20:12:51,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:12:51,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 20:12:51,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1391960.0, ans=0.0 2023-10-03 20:12:52,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:12:52,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:12:52,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:12:55,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 20:12:55,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1392026.6666666667, ans=0.125 2023-10-03 20:12:56,125 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.46 vs. limit=22.5 2023-10-03 20:12:59,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 20:13:02,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:13:02,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:02,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1392026.6666666667, ans=0.0 2023-10-03 20:13:04,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:04,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:13:06,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:13:09,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:13:10,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:13:14,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1392093.3333333333, ans=0.0 2023-10-03 20:13:16,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:16,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:13:18,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 20:13:18,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:13:20,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 20:13:23,584 INFO [train.py:1046] (3/4) Epoch 40, batch 1650, loss[loss=0.1445, simple_loss=0.2133, pruned_loss=0.03787, over 23618.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.2387, pruned_loss=0.03979, over 4703671.93 frames. ], batch size: 256, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:13:23,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:13:25,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1392160.0, ans=0.1 2023-10-03 20:13:26,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:13:26,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:13:27,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 20:13:27,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 20:13:27,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 20:13:27,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 20:13:33,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:13:33,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:13:33,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:13:33,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:13:36,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:13:37,140 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.63 vs. limit=15.0 2023-10-03 20:13:38,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 20:13:41,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:13:41,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:13:41,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:13:41,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:13:41,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 20:13:43,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 20:13:48,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:13:50,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:13:58,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 20:13:58,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:13:59,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 20:14:01,150 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.943e+02 2.116e+02 2.383e+02 3.563e+02, threshold=4.232e+02, percent-clipped=0.0 2023-10-03 20:14:01,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:14:04,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:14:04,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:14:06,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:14:08,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:14:08,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:14:08,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1392360.0, ans=0.125 2023-10-03 20:14:10,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:14:11,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:14:11,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:14:13,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:14:14,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:14:15,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:14:17,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:14:17,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 20:14:18,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:14:18,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 20:14:19,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 20:14:21,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 20:14:21,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:14:21,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:14:22,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:14:22,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:14:22,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 20:14:24,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1392426.6666666667, ans=0.125 2023-10-03 20:14:25,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:14:28,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:14:29,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:14:31,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 20:14:36,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:14:36,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:14:36,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 20:14:36,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1392493.3333333333, ans=0.125 2023-10-03 20:14:37,330 INFO [train.py:1046] (3/4) Epoch 40, batch 1700, loss[loss=0.1654, simple_loss=0.2383, pruned_loss=0.04621, over 23800.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2373, pruned_loss=0.03917, over 4694482.62 frames. ], batch size: 212, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:14:37,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:14:37,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:14:37,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:14:40,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:14:40,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:14:41,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 20:14:44,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:14:44,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1392493.3333333333, ans=0.2 2023-10-03 20:14:50,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:14:53,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:15:00,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:15:00,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:15:02,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:15:02,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:15:03,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 20:15:05,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:15:06,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:07,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:15:09,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:15:10,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 20:15:12,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 20:15:12,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:14,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 20:15:15,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:15:17,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1392626.6666666667, ans=0.1 2023-10-03 20:15:22,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:15:23,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:15:23,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:15:26,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:15:26,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 20:15:26,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:15:30,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:30,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 20:15:30,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:15:30,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:15:32,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:15:32,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:15:34,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:15:34,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:15:34,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1392760.0, ans=0.125 2023-10-03 20:15:36,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:15:36,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:15:36,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:15:41,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:15:41,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 20:15:41,641 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:15:44,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:15:46,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:15:48,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 20:15:50,206 INFO [train.py:1046] (3/4) Epoch 40, batch 1750, loss[loss=0.1671, simple_loss=0.2585, pruned_loss=0.03789, over 24440.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2361, pruned_loss=0.03874, over 4699448.59 frames. ], batch size: 69, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:15:53,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:15:54,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1392826.6666666667, ans=0.125 2023-10-03 20:15:55,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:15:57,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:15:57,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 20:15:58,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:16:00,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:16:00,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:02,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1392826.6666666667, ans=0.125 2023-10-03 20:16:05,051 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:16:06,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 20:16:08,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:16:09,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 20:16:09,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:16:10,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:16:11,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1392893.3333333333, ans=0.07 2023-10-03 20:16:14,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 20:16:15,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 20:16:16,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:16:18,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 20:16:25,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:16:26,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:16:26,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:16:29,454 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 1.982e+02 2.207e+02 2.647e+02 3.651e+02, threshold=4.414e+02, percent-clipped=0.0 2023-10-03 20:16:29,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:29,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:16:29,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1392960.0, ans=0.125 2023-10-03 20:16:31,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:16:34,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:34,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1393026.6666666667, ans=0.125 2023-10-03 20:16:37,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:16:38,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:16:39,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 20:16:42,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:16:43,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 20:16:45,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:16:46,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:16:46,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:16:50,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.12 vs. limit=15.0 2023-10-03 20:16:50,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:16:50,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 20:16:50,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:16:52,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:16:54,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:16:59,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:16:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:17:02,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 20:17:02,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:17:03,618 INFO [train.py:1046] (3/4) Epoch 40, batch 1800, loss[loss=0.1749, simple_loss=0.2469, pruned_loss=0.05145, over 23744.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.235, pruned_loss=0.03841, over 4693886.23 frames. ], batch size: 212, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:17:03,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:17:03,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:03,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:17:03,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:17:03,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:17:07,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:17:08,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:17:11,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:17:13,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:17:16,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:17:17,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:17:20,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:17:21,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:23,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:23,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:17:25,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:17:25,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 20:17:27,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:17:29,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:17:31,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1393293.3333333333, ans=0.125 2023-10-03 20:17:35,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 20:17:36,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 20:17:37,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 20:17:37,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:17:39,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:17:39,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:17:39,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:17:46,144 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 20:17:47,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:17:50,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:17:51,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 20:17:52,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 20:17:52,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:17:53,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:17:54,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:17:54,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1393360.0, ans=0.0 2023-10-03 20:17:59,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 20:18:05,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:18:07,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 20:18:07,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:18:07,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:18:08,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:18:08,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 20:18:10,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:18:10,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:18:13,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 20:18:13,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:18:15,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:18:16,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:18:16,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:18:16,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:18:18,166 INFO [train.py:1046] (3/4) Epoch 40, batch 1850, loss[loss=0.1547, simple_loss=0.243, pruned_loss=0.03324, over 24453.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2358, pruned_loss=0.0386, over 4684797.24 frames. ], batch size: 66, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:18:18,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:18:19,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:18:19,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:18:22,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:18:22,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:18:22,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1393493.3333333333, ans=0.125 2023-10-03 20:18:29,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:18:30,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 20:18:33,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 20:18:36,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 20:18:39,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:18:41,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 20:18:41,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 20:18:52,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:18:54,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 20:18:57,064 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.932e+02 2.118e+02 2.522e+02 3.488e+02, threshold=4.237e+02, percent-clipped=0.0 2023-10-03 20:18:57,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:18:57,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:18:59,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 20:19:00,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1393693.3333333333, ans=0.0 2023-10-03 20:19:01,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:01,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:19:02,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:19:02,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:19:05,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:19:07,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1393693.3333333333, ans=0.0 2023-10-03 20:19:09,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:19:10,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:10,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 20:19:10,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:19:13,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:19:14,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:19:17,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 20:19:19,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:19:22,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:19:23,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:19:24,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 20:19:24,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 20:19:25,392 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 20:19:25,482 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 20:19:25,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1393760.0, ans=0.2 2023-10-03 20:19:26,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:19:28,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:19:28,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:19:28,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:28,257 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 20:19:29,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:19:29,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:29,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:19:29,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1393826.6666666667, ans=0.125 2023-10-03 20:19:30,878 INFO [train.py:1046] (3/4) Epoch 40, batch 1900, loss[loss=0.1521, simple_loss=0.2406, pruned_loss=0.03178, over 24528.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2377, pruned_loss=0.0393, over 4689862.24 frames. ], batch size: 71, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:19:30,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:19:31,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:19:31,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 20:19:33,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:19:33,795 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 20:19:33,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:19:33,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:19:34,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1393826.6666666667, ans=0.1 2023-10-03 20:19:39,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:19:39,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1393826.6666666667, ans=0.125 2023-10-03 20:19:42,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:19:42,171 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 20:19:43,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 20:19:44,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:19:44,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:19:44,928 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 20:19:45,213 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:19:46,917 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 20:19:50,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 20:19:50,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:19:54,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1393893.3333333333, ans=0.2 2023-10-03 20:19:55,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 20:19:55,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1393893.3333333333, ans=0.0 2023-10-03 20:19:58,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 20:19:58,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1393893.3333333333, ans=0.0 2023-10-03 20:20:10,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 20:20:11,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 20:20:11,696 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:20:12,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:20:12,896 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 20:20:12,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 20:20:12,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 20:20:14,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 20:20:14,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:20:17,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 20:20:22,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:20:25,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:20:25,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 20:20:26,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:20:27,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1394026.6666666667, ans=0.0 2023-10-03 20:20:29,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 20:20:31,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:20:31,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1394093.3333333333, ans=0.0 2023-10-03 20:20:34,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1394093.3333333333, ans=0.125 2023-10-03 20:20:36,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:20:36,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:20:36,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:20:38,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:20:40,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:20:40,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:20:40,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:20:43,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:20:43,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:20:46,360 INFO [train.py:1046] (3/4) Epoch 40, batch 1950, loss[loss=0.1901, simple_loss=0.2612, pruned_loss=0.0595, over 22732.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2381, pruned_loss=0.03949, over 4693091.69 frames. ], batch size: 322, lr: 2.55e-03, grad_scale: 8.0 2023-10-03 20:20:46,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:20:46,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:20:46,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:20:47,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:20:51,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:20:52,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:20:53,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:20:53,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:20:55,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 20:20:57,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 20:20:57,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:20:58,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:01,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:21:02,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:02,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:04,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1394226.6666666667, ans=0.0 2023-10-03 20:21:05,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:21:08,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:21:08,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:21:08,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:21:08,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:13,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:15,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:21:15,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:15,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:21:15,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 20:21:17,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:21:17,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:21:19,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:20,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:23,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:21:26,797 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.972e+02 2.252e+02 2.613e+02 4.035e+02, threshold=4.504e+02, percent-clipped=0.0 2023-10-03 20:21:26,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:21:29,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:21:31,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:21:31,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 20:21:31,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:21:35,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:21:35,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:21:36,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:21:44,350 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:21:45,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:48,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:48,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1394426.6666666667, ans=0.125 2023-10-03 20:21:49,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:21:52,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:54,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:21:54,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:21:55,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 20:21:55,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:21:57,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:21:59,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 20:22:00,504 INFO [train.py:1046] (3/4) Epoch 40, batch 2000, loss[loss=0.1452, simple_loss=0.226, pruned_loss=0.03219, over 24580.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2385, pruned_loss=0.03901, over 4722138.47 frames. ], batch size: 60, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:22:00,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:22:04,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:22:04,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:22:06,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:22:07,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:22:09,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:11,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 20:22:13,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:22:14,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:22:16,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 20:22:17,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:22:18,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:22:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:22:23,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 20:22:24,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:27,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:27,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:27,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 20:22:28,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:22:30,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 20:22:30,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:22:33,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:22:34,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:22:34,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:34,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:22:34,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1394626.6666666667, ans=0.125 2023-10-03 20:22:35,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:22:37,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 20:22:38,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 20:22:38,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:22:38,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:22:44,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:46,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:22:46,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:22:47,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:22:49,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:22:49,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:49,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:22:49,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:22:51,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1394693.3333333333, ans=0.1 2023-10-03 20:22:52,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:22:55,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:22:55,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1394693.3333333333, ans=0.1 2023-10-03 20:22:57,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 20:23:01,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:23:03,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:07,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:07,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:23:08,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:11,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:23:11,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:12,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:23:12,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:23:14,231 INFO [train.py:1046] (3/4) Epoch 40, batch 2050, loss[loss=0.1611, simple_loss=0.2414, pruned_loss=0.04041, over 23263.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2386, pruned_loss=0.03893, over 4723911.94 frames. ], batch size: 93, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:23:15,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:15,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:18,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:23:20,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:25,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:23:26,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:23:26,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:23:28,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:23:30,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 20:23:30,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:23:31,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:23:31,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:23:39,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:23:39,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:41,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 20:23:44,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1394960.0, ans=0.2 2023-10-03 20:23:45,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:23:46,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 20:23:46,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:23:49,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:23:53,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:23:54,420 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.914e+02 2.146e+02 2.312e+02 3.091e+02, threshold=4.293e+02, percent-clipped=0.0 2023-10-03 20:23:54,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:23:54,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:23:55,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:23:57,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:23:57,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:24:00,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:24:01,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:24:04,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:24:04,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:24:08,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:24:13,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1395093.3333333333, ans=0.125 2023-10-03 20:24:13,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1395093.3333333333, ans=0.125 2023-10-03 20:24:14,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:24:15,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 20:24:20,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:24:21,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:24:22,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:24:24,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 20:24:25,058 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.21 vs. limit=12.0 2023-10-03 20:24:26,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1395093.3333333333, ans=0.09899494936611666 2023-10-03 20:24:28,252 INFO [train.py:1046] (3/4) Epoch 40, batch 2100, loss[loss=0.1287, simple_loss=0.1836, pruned_loss=0.03693, over 19341.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2373, pruned_loss=0.03866, over 4723023.68 frames. ], batch size: 388, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:24:28,331 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 20:24:28,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:24:28,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:24:28,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:24:30,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:24:31,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 20:24:31,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 20:24:33,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:24:34,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:24:35,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:24:36,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1395160.0, ans=0.0 2023-10-03 20:24:36,434 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.55 vs. limit=6.0 2023-10-03 20:24:38,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:24:39,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:24:39,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 20:24:41,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:24:42,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 20:24:42,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 20:24:44,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:24:44,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:24:44,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 20:24:44,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 20:24:48,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1395226.6666666667, ans=0.125 2023-10-03 20:24:50,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 20:24:50,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:24:53,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:24:53,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:24:56,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:24:58,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 20:24:58,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:24:58,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 20:24:59,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 20:25:01,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:01,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 20:25:01,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 20:25:02,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 20:25:05,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:25:06,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:25:08,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:25:09,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:25:09,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:11,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:25:11,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 20:25:12,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:12,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:25:12,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:12,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 20:25:14,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 20:25:16,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 20:25:20,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:25:23,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:25:24,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 20:25:24,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1395360.0, ans=0.125 2023-10-03 20:25:27,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:29,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1395426.6666666667, ans=0.125 2023-10-03 20:25:31,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:25:31,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:25:32,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:25:32,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 20:25:32,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:25:33,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:25:33,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:25:35,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:25:35,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:36,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 20:25:38,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 20:25:38,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:25:40,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:25:40,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:25:40,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:25:42,013 INFO [train.py:1046] (3/4) Epoch 40, batch 2150, loss[loss=0.1495, simple_loss=0.2045, pruned_loss=0.04722, over 19069.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.236, pruned_loss=0.03866, over 4706944.92 frames. ], batch size: 388, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:25:42,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:25:43,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1395493.3333333333, ans=0.2 2023-10-03 20:25:45,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 20:25:47,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:25:47,744 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=12.0 2023-10-03 20:25:48,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:51,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:25:51,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:25:51,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:25:55,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:25:55,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:25:55,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:25:57,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1395560.0, ans=0.125 2023-10-03 20:25:59,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:25:59,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 20:26:03,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1395560.0, ans=0.5 2023-10-03 20:26:04,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:26:04,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1395560.0, ans=0.0 2023-10-03 20:26:05,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:26:05,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:05,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:26:07,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:07,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:26:07,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:26:07,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:26:08,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:26:08,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 20:26:09,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:26:10,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1395626.6666666667, ans=0.0 2023-10-03 20:26:12,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:26:13,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:26:15,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:26:16,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:26:19,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:26:19,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:26:21,711 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.888e+02 2.102e+02 2.294e+02 3.502e+02, threshold=4.204e+02, percent-clipped=0.0 2023-10-03 20:26:21,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:26:21,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 20:26:21,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:26:24,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:26:26,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:27,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:26:27,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:26:29,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:30,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:30,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 20:26:32,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 20:26:32,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:26:32,160 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 20:26:32,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:32,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:26:34,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 20:26:34,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:26:34,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 20:26:34,199 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 20:26:34,199 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 20:26:35,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 20:26:36,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:36,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:26:37,432 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.56 vs. limit=15.0 2023-10-03 20:26:38,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:26:39,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:39,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:26:41,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:26:41,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:26:50,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:26:51,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 20:26:54,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:26:56,455 INFO [train.py:1046] (3/4) Epoch 40, batch 2200, loss[loss=0.1485, simple_loss=0.2398, pruned_loss=0.02856, over 24670.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2362, pruned_loss=0.03857, over 4722296.49 frames. ], batch size: 73, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:26:58,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1395826.6666666667, ans=0.05 2023-10-03 20:27:01,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:27:01,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:27:02,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1395826.6666666667, ans=0.1 2023-10-03 20:27:03,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:03,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:27:06,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:27:06,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:27:06,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 20:27:10,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 20:27:14,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:27:21,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 20:27:22,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:27:24,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:27:24,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:27:27,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:27:27,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 20:27:30,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:27:31,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:27:31,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 20:27:34,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:27:35,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1395960.0, ans=0.2 2023-10-03 20:27:36,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:27:38,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:27:39,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:41,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 20:27:44,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:27:45,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 20:27:48,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:48,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:27:50,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:27:51,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:27:52,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:27:52,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:27:52,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:27:54,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:27:55,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:27:57,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:27:58,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 20:28:00,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:28:01,032 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.78 vs. limit=22.5 2023-10-03 20:28:01,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:28:03,081 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 20:28:04,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1396093.3333333333, ans=0.0 2023-10-03 20:28:05,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:28:05,853 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 20:28:07,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:28:07,318 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 20:28:08,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:28:09,907 INFO [train.py:1046] (3/4) Epoch 40, batch 2250, loss[loss=0.1566, simple_loss=0.2389, pruned_loss=0.03711, over 23283.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2362, pruned_loss=0.03833, over 4708366.19 frames. ], batch size: 93, lr: 2.55e-03, grad_scale: 16.0 2023-10-03 20:28:09,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:28:12,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:28:12,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1396160.0, ans=0.125 2023-10-03 20:28:13,495 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 20:28:14,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:28:17,228 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.20 vs. limit=15.0 2023-10-03 20:28:18,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:28:18,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1396160.0, ans=0.125 2023-10-03 20:28:21,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:28:22,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:28:25,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:28:27,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:28:28,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:28:28,991 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.42 vs. limit=12.0 2023-10-03 20:28:29,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 20:28:31,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:28:31,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:28:34,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 20:28:35,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:28:35,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:28:37,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:28:43,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:28:44,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:28:44,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:28:46,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 20:28:47,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:28:50,639 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.901e+02 2.050e+02 2.328e+02 3.368e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-03 20:28:50,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:28:52,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1396293.3333333333, ans=0.0 2023-10-03 20:28:55,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:28:56,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:28:58,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:28:58,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:28:59,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:29:01,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:29:05,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:29:07,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 20:29:10,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:29:10,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:29:11,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:29:16,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:29:19,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:29:19,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 20:29:19,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:29:19,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:29:23,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 20:29:25,044 INFO [train.py:1046] (3/4) Epoch 40, batch 2300, loss[loss=0.1778, simple_loss=0.252, pruned_loss=0.05175, over 23573.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2377, pruned_loss=0.03907, over 4710420.72 frames. ], batch size: 256, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:29:25,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:29:25,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:29:26,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1396493.3333333333, ans=0.1 2023-10-03 20:29:29,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:29:30,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:29:33,982 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 20:29:35,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:29:37,377 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.23 vs. limit=22.5 2023-10-03 20:29:38,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1396560.0, ans=0.0 2023-10-03 20:29:40,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:29:40,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:29:41,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1396560.0, ans=0.2 2023-10-03 20:29:42,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:29:42,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:29:42,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 20:29:42,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:29:45,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:29:45,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:29:49,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:29:52,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:29:55,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:00,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:30:00,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:30:04,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:30:06,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:30:06,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1396626.6666666667, ans=0.1 2023-10-03 20:30:09,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:30:10,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:30:12,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:30:12,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 20:30:18,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:30:18,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:30:18,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:18,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:30:18,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:30:18,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 20:30:19,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:30:19,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 20:30:19,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:30:19,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:30:21,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 20:30:28,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:30:32,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:30:33,571 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.60 vs. limit=22.5 2023-10-03 20:30:34,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:30:34,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1396760.0, ans=0.0 2023-10-03 20:30:35,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:30:35,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:30:36,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:30:38,906 INFO [train.py:1046] (3/4) Epoch 40, batch 2350, loss[loss=0.1722, simple_loss=0.2419, pruned_loss=0.05127, over 23768.00 frames. ], tot_loss[loss=0.1587, simple_loss=0.2385, pruned_loss=0.0394, over 4709216.65 frames. ], batch size: 179, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:30:38,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:30:39,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:30:40,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 20:30:45,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:30:45,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 20:30:47,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1396826.6666666667, ans=0.125 2023-10-03 20:30:49,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 20:30:52,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:30:53,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1396893.3333333333, ans=0.1 2023-10-03 20:30:53,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1396893.3333333333, ans=0.1 2023-10-03 20:30:56,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:56,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:30:56,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:30:56,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:30:58,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 20:30:59,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:31:07,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 20:31:08,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:31:10,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:31:10,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:31:12,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:31:14,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 20:31:14,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:31:17,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:31:17,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:31:17,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:31:19,437 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.056e+02 2.202e+02 2.513e+02 4.106e+02, threshold=4.405e+02, percent-clipped=1.0 2023-10-03 20:31:21,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:31:23,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 20:31:24,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:31:27,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:31:27,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:31:29,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 20:31:30,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:31:32,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 20:31:32,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:31:38,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 20:31:40,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 20:31:41,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:31:41,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 20:31:41,416 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 20:31:41,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 20:31:44,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 20:31:46,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:31:47,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1397093.3333333333, ans=0.125 2023-10-03 20:31:49,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1397093.3333333333, ans=0.1 2023-10-03 20:31:50,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:31:53,167 INFO [train.py:1046] (3/4) Epoch 40, batch 2400, loss[loss=0.1538, simple_loss=0.2261, pruned_loss=0.04077, over 23848.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.237, pruned_loss=0.03892, over 4713923.34 frames. ], batch size: 164, lr: 2.54e-03, grad_scale: 32.0 2023-10-03 20:31:54,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:31:56,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1397160.0, ans=0.0 2023-10-03 20:31:58,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:31:59,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 20:31:59,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 20:32:03,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1397160.0, ans=0.125 2023-10-03 20:32:05,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1397160.0, ans=0.125 2023-10-03 20:32:06,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 20:32:06,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:32:08,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 20:32:08,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:32:09,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:32:09,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 20:32:10,213 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.86 vs. limit=22.5 2023-10-03 20:32:16,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:32:16,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1397226.6666666667, ans=0.0 2023-10-03 20:32:17,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 20:32:18,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1397226.6666666667, ans=0.1 2023-10-03 20:32:22,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:32:26,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 20:32:29,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:32:30,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:32:33,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:32:33,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 20:32:34,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:32:34,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1397293.3333333333, ans=0.125 2023-10-03 20:32:40,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1397360.0, ans=0.1 2023-10-03 20:32:41,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:32:43,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1397360.0, ans=0.09899494936611666 2023-10-03 20:32:44,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:32:47,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:32:48,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:32:48,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 20:32:48,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:32:48,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:32:50,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:32:50,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:32:54,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:32:54,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1397426.6666666667, ans=0.125 2023-10-03 20:32:55,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:32:55,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 20:32:55,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1397426.6666666667, ans=0.0 2023-10-03 20:32:58,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 20:33:01,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:33:01,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:33:01,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 20:33:02,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 20:33:02,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 20:33:02,839 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 20:33:04,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 20:33:04,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:33:05,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:33:05,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:33:06,992 INFO [train.py:1046] (3/4) Epoch 40, batch 2450, loss[loss=0.1697, simple_loss=0.2504, pruned_loss=0.04446, over 24307.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2361, pruned_loss=0.03874, over 4710241.71 frames. ], batch size: 61, lr: 2.54e-03, grad_scale: 32.0 2023-10-03 20:33:07,070 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 20:33:07,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:33:08,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:33:11,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:33:11,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:33:15,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:15,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:33:17,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 20:33:22,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:33:22,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:26,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:33:27,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:33:27,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:33:27,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 20:33:32,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:34,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:33:34,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:33:36,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:33:36,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:33:38,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:33:38,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:33:40,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 20:33:41,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:33:41,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1397626.6666666667, ans=0.125 2023-10-03 20:33:44,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1397626.6666666667, ans=0.1 2023-10-03 20:33:45,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1397626.6666666667, ans=0.0 2023-10-03 20:33:46,905 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.910e+02 2.084e+02 2.402e+02 3.633e+02, threshold=4.168e+02, percent-clipped=0.0 2023-10-03 20:33:49,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:33:49,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:33:50,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:33:51,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:33:51,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:33:52,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:33:54,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 20:33:54,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1397693.3333333333, ans=0.125 2023-10-03 20:33:57,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:33:57,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:34:00,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:34:01,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:34:04,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:34:04,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 20:34:05,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:34:06,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:34:08,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 20:34:08,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:34:08,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:34:14,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:34:15,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:34:15,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:34:19,144 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.39 vs. limit=15.0 2023-10-03 20:34:20,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 20:34:21,414 INFO [train.py:1046] (3/4) Epoch 40, batch 2500, loss[loss=0.1408, simple_loss=0.1993, pruned_loss=0.04112, over 19293.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2344, pruned_loss=0.03843, over 4683779.96 frames. ], batch size: 388, lr: 2.54e-03, grad_scale: 32.0 2023-10-03 20:34:21,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:34:29,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:34:29,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1397826.6666666667, ans=0.2 2023-10-03 20:34:36,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:34:36,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:34:38,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:34:38,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 20:34:43,674 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.97 vs. limit=15.0 2023-10-03 20:34:45,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:34:45,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:34:45,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 20:34:45,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:34:47,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 20:34:48,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:34:48,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:34:49,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 20:34:49,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:34:50,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 20:34:50,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:34:55,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:34:57,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:34:59,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:35:00,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 20:35:01,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:35:02,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1397960.0, ans=0.125 2023-10-03 20:35:04,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:35:06,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:09,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:12,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:35:16,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:35:19,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 20:35:19,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:35:19,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:35:19,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1398093.3333333333, ans=0.125 2023-10-03 20:35:21,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:35:21,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:35:22,539 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 20:35:22,539 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 20:35:22,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 20:35:24,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:35:27,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 20:35:27,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 20:35:29,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:35:29,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 20:35:33,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 20:35:34,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:35:34,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:35:35,916 INFO [train.py:1046] (3/4) Epoch 40, batch 2550, loss[loss=0.1651, simple_loss=0.2407, pruned_loss=0.04477, over 23684.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2357, pruned_loss=0.03865, over 4696033.66 frames. ], batch size: 232, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:35:35,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:35:36,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1398160.0, ans=0.1 2023-10-03 20:35:38,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:35:38,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 20:35:40,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:35:42,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1398160.0, ans=0.0 2023-10-03 20:35:43,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 20:35:44,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:35:47,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:48,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:35:48,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 20:35:49,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:35:49,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1398226.6666666667, ans=0.2 2023-10-03 20:35:50,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:35:50,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:35:50,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1398226.6666666667, ans=0.0 2023-10-03 20:35:53,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:35:53,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 20:35:53,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 20:35:53,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:35:53,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 20:36:00,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1398226.6666666667, ans=0.125 2023-10-03 20:36:07,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:36:07,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1398293.3333333333, ans=0.125 2023-10-03 20:36:11,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:36:11,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:36:11,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:36:13,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:36:16,969 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.930e+02 2.153e+02 2.349e+02 3.303e+02, threshold=4.307e+02, percent-clipped=0.0 2023-10-03 20:36:21,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:36:22,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:36:22,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:36:22,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:36:23,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:36:23,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:36:27,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:36:27,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:36:30,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1398360.0, ans=0.0 2023-10-03 20:36:32,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:36:32,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 20:36:32,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:36:33,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:36:33,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:36:34,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:36:35,766 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.69 vs. limit=6.0 2023-10-03 20:36:36,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:36:43,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:36:46,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:36:48,922 INFO [train.py:1046] (3/4) Epoch 40, batch 2600, loss[loss=0.1664, simple_loss=0.2518, pruned_loss=0.04048, over 24569.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.237, pruned_loss=0.03887, over 4697674.24 frames. ], batch size: 71, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:36:48,961 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 20:36:50,467 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 20:36:50,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:36:50,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1398493.3333333333, ans=0.125 2023-10-03 20:36:51,832 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 20:36:51,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 20:36:51,923 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 20:36:54,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:36:54,708 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 20:36:56,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 20:36:59,217 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 20:37:01,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:37:02,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 20:37:04,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 20:37:04,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1398560.0, ans=0.125 2023-10-03 20:37:05,228 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.01 vs. limit=15.0 2023-10-03 20:37:05,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:37:05,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 20:37:08,651 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 20:37:08,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 20:37:11,716 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:37:16,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:37:17,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:37:17,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:37:17,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 20:37:17,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1398626.6666666667, ans=0.125 2023-10-03 20:37:18,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:37:20,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1398626.6666666667, ans=0.125 2023-10-03 20:37:23,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1398626.6666666667, ans=0.0 2023-10-03 20:37:24,289 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 20:37:29,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:37:31,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:37:32,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 20:37:32,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:37:32,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:37:33,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 20:37:37,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:37:37,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:37:38,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:37:41,589 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 20:37:41,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:37:42,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:37:47,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:37:47,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:37:47,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 20:37:49,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:37:50,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:37:51,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:37:54,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 20:37:56,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:37:58,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:38:02,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 20:38:02,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:38:03,475 INFO [train.py:1046] (3/4) Epoch 40, batch 2650, loss[loss=0.177, simple_loss=0.2464, pruned_loss=0.05381, over 23703.00 frames. ], tot_loss[loss=0.1586, simple_loss=0.2382, pruned_loss=0.03949, over 4694316.37 frames. ], batch size: 232, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:38:03,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:38:04,888 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 20:38:04,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:38:07,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:38:09,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 20:38:11,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:38:14,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:38:15,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 20:38:15,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:38:16,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:38:19,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 20:38:21,167 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 20:38:23,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:38:25,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 20:38:25,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:38:26,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 20:38:31,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:38:31,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:38:32,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:38:32,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:38:35,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 20:38:35,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 20:38:38,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:38:41,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 20:38:41,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:38:43,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:38:43,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:38:45,163 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.003e+02 2.143e+02 2.550e+02 3.121e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 20:38:45,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:38:45,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:38:46,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:38:48,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:38:48,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1399026.6666666667, ans=0.0 2023-10-03 20:38:50,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:38:52,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:38:53,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:38:56,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:38:56,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:38:57,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:38:59,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:39:00,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:39:04,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:04,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:39:04,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:39:05,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 20:39:06,208 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.26 vs. limit=15.0 2023-10-03 20:39:08,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:39:10,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:12,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:14,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:14,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:39:15,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:17,264 INFO [train.py:1046] (3/4) Epoch 40, batch 2700, loss[loss=0.1443, simple_loss=0.2282, pruned_loss=0.03022, over 24310.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2392, pruned_loss=0.03942, over 4717485.06 frames. ], batch size: 61, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:39:18,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:39:18,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 20:39:21,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:39:23,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 20:39:24,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:39:26,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:26,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:26,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:39:26,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:39:27,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:39:27,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 20:39:27,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 20:39:28,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:39:30,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:39:31,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:39:31,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:39:36,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:39:36,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 20:39:38,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:39:42,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:39:42,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:39:48,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:39:48,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:39:48,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:39:48,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:39:50,259 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=1.96 vs. limit=12.0 2023-10-03 20:39:52,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:39:55,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:39:55,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:39:55,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:39:57,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:39:57,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:40:06,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:40:06,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:40:10,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:40:10,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:13,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:40:15,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:40:17,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:40:18,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:20,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:40:20,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:40:21,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1399426.6666666667, ans=0.09899494936611666 2023-10-03 20:40:22,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:40:24,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:40:24,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:40:28,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 20:40:30,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:31,499 INFO [train.py:1046] (3/4) Epoch 40, batch 2750, loss[loss=0.1445, simple_loss=0.2252, pruned_loss=0.03194, over 24431.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2381, pruned_loss=0.03933, over 4717805.05 frames. ], batch size: 58, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:40:31,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:40:31,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 20:40:32,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 20:40:33,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:35,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1399493.3333333333, ans=0.125 2023-10-03 20:40:36,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:40:38,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:40:39,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:39,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:40:40,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:43,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:40:43,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:40:45,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:40:45,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:45,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 20:40:45,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:40:45,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:40:45,682 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.56 vs. limit=12.0 2023-10-03 20:40:50,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 20:40:51,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1399560.0, ans=0.125 2023-10-03 20:40:52,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:40:52,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:40:53,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:40:53,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 20:40:55,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:40:56,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:40:56,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:40:56,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:41:01,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 20:41:01,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 20:41:01,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:41:03,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:41:05,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:41:05,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1399626.6666666667, ans=0.125 2023-10-03 20:41:13,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:41:14,585 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.978e+02 2.171e+02 2.705e+02 3.940e+02, threshold=4.342e+02, percent-clipped=0.0 2023-10-03 20:41:14,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:41:14,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:17,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1399693.3333333333, ans=0.0 2023-10-03 20:41:18,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:41:18,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:41:19,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:41:25,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:41:25,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:41:25,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 20:41:25,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1399693.3333333333, ans=0.035 2023-10-03 20:41:29,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:31,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 20:41:34,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1399760.0, ans=0.125 2023-10-03 20:41:36,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 20:41:36,295 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:41:38,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:41:40,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 20:41:40,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:41:42,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:41:42,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 20:41:42,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:41:46,241 INFO [train.py:1046] (3/4) Epoch 40, batch 2800, loss[loss=0.1302, simple_loss=0.1923, pruned_loss=0.03406, over 23464.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2367, pruned_loss=0.03886, over 4706421.23 frames. ], batch size: 285, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:41:46,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 20:41:46,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:41:47,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:41:47,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 20:41:47,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:41:49,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:51,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:41:51,055 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 20:41:51,055 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 20:41:53,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:41:55,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:41:55,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:41:58,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:42:00,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 20:42:02,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1399893.3333333333, ans=0.0 2023-10-03 20:42:04,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 20:42:05,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 20:42:05,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:42:07,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:42:07,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:42:11,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:42:11,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:42:11,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:42:13,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:42:21,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:42:23,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:42:25,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:42:25,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:42:26,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:42:31,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:42:31,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 20:42:32,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:42:32,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:42:32,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:42:37,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:42:37,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:42:37,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1400026.6666666667, ans=0.0 2023-10-03 20:42:42,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:42:43,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:42:43,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:42:43,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:42:43,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 20:42:44,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:42:46,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:42:46,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 20:42:46,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:42:47,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:42:47,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:42:49,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 20:42:50,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:42:50,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:42:52,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:42:53,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 20:42:59,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:43:00,987 INFO [train.py:1046] (3/4) Epoch 40, batch 2850, loss[loss=0.1544, simple_loss=0.2301, pruned_loss=0.03929, over 23744.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2363, pruned_loss=0.03862, over 4711756.06 frames. ], batch size: 164, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:43:01,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:43:01,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:43:02,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:04,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.94 vs. limit=22.5 2023-10-03 20:43:05,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:43:06,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:43:06,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:43:09,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:43:11,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:43:13,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:43:13,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 20:43:19,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 20:43:19,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:20,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 20:43:21,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:23,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 20:43:24,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 20:43:25,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:37,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:43:39,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:43:39,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:43:39,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 20:43:40,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 20:43:40,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:43:42,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:43:44,039 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 1.957e+02 2.192e+02 2.497e+02 4.123e+02, threshold=4.384e+02, percent-clipped=0.0 2023-10-03 20:43:44,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 20:43:45,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:43:45,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:43:46,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:43:46,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:50,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:43:50,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:43:50,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:51,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:43:52,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:43:54,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:43:55,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:43:56,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:43:59,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:44:01,652 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:44:02,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 20:44:02,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 20:44:05,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 20:44:06,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:44:06,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 20:44:08,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:44:08,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:44:08,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:44:08,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:44:08,900 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 20:44:10,139 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 20:44:10,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:44:10,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:44:14,225 INFO [train.py:1046] (3/4) Epoch 40, batch 2900, loss[loss=0.1571, simple_loss=0.2324, pruned_loss=0.04089, over 23688.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2368, pruned_loss=0.03866, over 4723200.35 frames. ], batch size: 232, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:44:16,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:44:16,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:44:16,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:44:16,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 20:44:20,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:44:20,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 20:44:22,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 20:44:23,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:44:23,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:44:24,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:44:26,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:44:29,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:44:30,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:44:32,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:44:32,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 20:44:32,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:44:35,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:44:37,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 20:44:39,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 20:44:43,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:44:43,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 20:44:43,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:44:47,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:44:47,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 20:44:48,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:44:50,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:44:53,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:44:55,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:44:56,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 20:44:56,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 20:44:57,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:44:58,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1400693.3333333333, ans=0.0 2023-10-03 20:45:01,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:45:04,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 20:45:04,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1400693.3333333333, ans=0.2 2023-10-03 20:45:06,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:45:06,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1400693.3333333333, ans=0.2 2023-10-03 20:45:12,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:45:21,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:45:21,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 20:45:23,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 20:45:27,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:27,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 20:45:28,537 INFO [train.py:1046] (3/4) Epoch 40, batch 2950, loss[loss=0.1573, simple_loss=0.2312, pruned_loss=0.04168, over 23878.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2376, pruned_loss=0.03904, over 4707734.41 frames. ], batch size: 195, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:45:28,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:45:29,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:45:34,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:45:34,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1400826.6666666667, ans=0.125 2023-10-03 20:45:35,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 20:45:35,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:45:35,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:37,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:45:38,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:45:40,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 20:45:40,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 20:45:41,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1400893.3333333333, ans=0.125 2023-10-03 20:45:43,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:45:43,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:45:46,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1400893.3333333333, ans=0.1 2023-10-03 20:45:48,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:45:50,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:45:51,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:45:53,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:45:55,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:45:55,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:45:56,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:58,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:45:58,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:46:00,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 20:46:06,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 20:46:06,371 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 20:46:07,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:46:09,083 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 20:46:10,282 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.938e+02 2.092e+02 2.435e+02 3.514e+02, threshold=4.184e+02, percent-clipped=0.0 2023-10-03 20:46:11,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 20:46:11,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:46:11,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:46:11,881 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 20:46:11,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 20:46:15,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 20:46:15,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:46:16,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 20:46:19,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:46:19,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:46:19,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:46:19,473 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 20:46:20,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:46:20,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 20:46:24,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:46:25,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:46:26,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 20:46:26,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:46:28,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 20:46:29,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:46:30,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1401093.3333333333, ans=0.2 2023-10-03 20:46:31,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:46:32,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:46:35,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:46:35,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 20:46:37,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:46:37,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:46:37,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:46:39,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:46:39,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:46:40,685 INFO [train.py:1046] (3/4) Epoch 40, batch 3000, loss[loss=0.1598, simple_loss=0.2529, pruned_loss=0.03336, over 24318.00 frames. ], tot_loss[loss=0.1585, simple_loss=0.2383, pruned_loss=0.03935, over 4710419.23 frames. ], batch size: 74, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:46:40,686 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 20:46:52,705 INFO [train.py:1078] (3/4) Epoch 40, validation: loss=0.3553, simple_loss=0.2798, pruned_loss=0.2154, over 1125622.00 frames. 2023-10-03 20:46:52,706 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 20:46:52,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:46:54,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:46:54,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 20:46:55,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:46:59,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:46:59,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:47:02,436 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 20:47:02,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 20:47:05,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:47:06,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:47:06,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 20:47:06,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:47:13,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:47:21,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:47:29,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 20:47:29,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:47:29,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1401293.3333333333, ans=0.07 2023-10-03 20:47:31,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1401293.3333333333, ans=0.125 2023-10-03 20:47:32,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:47:32,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:47:32,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:47:33,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:47:33,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 20:47:36,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 20:47:37,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:47:38,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:47:41,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:47:41,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:47:42,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:47:42,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:47:45,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 20:47:46,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:47:46,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:47:48,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:47:50,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 20:47:51,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 20:47:51,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:47:51,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:47:56,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:47:56,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:47:57,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 20:47:59,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 20:47:59,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:47:59,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 20:48:00,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:48:01,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 20:48:05,800 INFO [train.py:1046] (3/4) Epoch 40, batch 3050, loss[loss=0.1606, simple_loss=0.2441, pruned_loss=0.03855, over 24478.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2386, pruned_loss=0.03954, over 4701186.18 frames. ], batch size: 66, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:48:05,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:48:05,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 20:48:07,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 20:48:07,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 20:48:07,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 20:48:08,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:48:10,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:48:10,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:48:10,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:11,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:48:13,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 20:48:14,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:48:17,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:48:18,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:48:20,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:24,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 20:48:29,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 20:48:29,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 20:48:31,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:48:34,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 20:48:37,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:37,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:48:38,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:48:38,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1401626.6666666667, ans=0.125 2023-10-03 20:48:40,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:48:40,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1401626.6666666667, ans=0.0 2023-10-03 20:48:41,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:48:41,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:48:41,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:48:41,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:48:44,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:45,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:48:48,606 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.898e+02 2.071e+02 2.314e+02 3.328e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-03 20:48:48,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:48:48,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 20:48:50,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:48:50,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 20:48:53,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:48:53,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 20:48:55,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:48:55,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:01,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:49:01,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:02,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1401693.3333333333, ans=0.125 2023-10-03 20:49:02,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1401693.3333333333, ans=0.125 2023-10-03 20:49:06,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:06,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:49:06,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:49:08,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:49:08,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 20:49:09,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:49:09,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 20:49:09,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1401760.0, ans=0.0 2023-10-03 20:49:11,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:49:12,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:12,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 20:49:15,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:20,219 INFO [train.py:1046] (3/4) Epoch 40, batch 3100, loss[loss=0.1491, simple_loss=0.2277, pruned_loss=0.03528, over 22346.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.238, pruned_loss=0.03941, over 4703438.91 frames. ], batch size: 49, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:49:20,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:49:21,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 20:49:22,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.29 vs. limit=22.5 2023-10-03 20:49:23,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 20:49:25,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 20:49:27,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 20:49:30,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 20:49:31,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:49:33,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:49:33,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:35,425 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.02 vs. limit=15.0 2023-10-03 20:49:36,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:49:39,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:44,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 20:49:50,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 20:49:50,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:49:50,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:49:50,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:49:51,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 20:49:53,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:49:53,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 20:49:53,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:49:55,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:49:55,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 20:49:57,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:50:00,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:50:01,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 20:50:03,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 20:50:04,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:04,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:50:07,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:07,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:07,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:50:08,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:50:08,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:50:10,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:50:10,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:50:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:12,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 20:50:16,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:50:17,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1402093.3333333333, ans=0.125 2023-10-03 20:50:19,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 20:50:21,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:50:21,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 20:50:22,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:22,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:23,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1402093.3333333333, ans=0.125 2023-10-03 20:50:24,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 20:50:32,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 20:50:34,337 INFO [train.py:1046] (3/4) Epoch 40, batch 3150, loss[loss=0.1528, simple_loss=0.2328, pruned_loss=0.03644, over 24330.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2372, pruned_loss=0.03905, over 4702492.21 frames. ], batch size: 61, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:50:34,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:34,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:50:37,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:50:37,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:50:37,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 20:50:38,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:38,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 20:50:40,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 20:50:40,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1402160.0, ans=0.2 2023-10-03 20:50:41,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:44,756 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 20:50:48,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 20:50:49,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:50:49,547 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 20:50:50,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 20:50:54,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 20:50:54,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 20:50:54,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 20:50:54,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:54,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:50:55,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:50:55,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 20:50:58,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:58,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:50:58,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1402226.6666666667, ans=0.0 2023-10-03 20:51:00,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:51:02,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 20:51:05,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 20:51:07,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:51:08,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 20:51:09,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:51:11,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 20:51:14,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 20:51:14,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:51:15,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 20:51:15,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 20:51:16,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:51:16,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:51:17,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 20:51:17,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 20:51:18,690 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.948e+02 2.114e+02 2.510e+02 3.900e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-03 20:51:18,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 20:51:18,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 20:51:18,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:20,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:51:22,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:51:22,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 20:51:22,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:51:22,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1402360.0, ans=0.125 2023-10-03 20:51:23,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 20:51:23,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1402360.0, ans=0.2 2023-10-03 20:51:25,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:25,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 20:51:25,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 20:51:28,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:51:28,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:51:30,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 20:51:30,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 20:51:30,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:51:34,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:51:34,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:35,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:51:41,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:51:41,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:44,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 20:51:48,683 INFO [train.py:1046] (3/4) Epoch 40, batch 3200, loss[loss=0.1646, simple_loss=0.2504, pruned_loss=0.03937, over 24305.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2361, pruned_loss=0.03875, over 4713071.62 frames. ], batch size: 74, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:51:48,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:51:48,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 20:51:49,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1402493.3333333333, ans=0.2 2023-10-03 20:51:53,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:51:54,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:51:54,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 20:52:00,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:52:02,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:52:05,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:52:05,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1402560.0, ans=0.2 2023-10-03 20:52:14,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:52:20,388 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:52:23,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 20:52:24,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:52:28,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 20:52:28,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1402626.6666666667, ans=0.07 2023-10-03 20:52:29,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:52:32,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:52:32,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:52:33,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1402693.3333333333, ans=0.125 2023-10-03 20:52:34,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:52:38,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 20:52:39,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 20:52:42,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 20:52:45,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 20:52:46,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:52:51,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:52:52,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 20:52:52,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:52:53,000 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 20:52:53,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 20:52:57,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:52:59,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 20:53:00,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 20:53:00,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 20:53:00,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1402760.0, ans=0.0 2023-10-03 20:53:02,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 20:53:04,231 INFO [train.py:1046] (3/4) Epoch 40, batch 3250, loss[loss=0.1578, simple_loss=0.2345, pruned_loss=0.04052, over 20717.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2359, pruned_loss=0.03867, over 4711483.69 frames. ], batch size: 45, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:53:04,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:53:06,552 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.67 vs. limit=15.0 2023-10-03 20:53:06,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:53:06,971 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 20:53:06,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:53:07,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:09,577 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 20:53:12,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 20:53:15,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:53:17,910 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.07 vs. limit=15.0 2023-10-03 20:53:21,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:53:22,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 20:53:23,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:53:23,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:53:23,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:53:27,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:53:27,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 20:53:30,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:30,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 20:53:30,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:53:30,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:30,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:30,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:53:33,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:53:34,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 20:53:38,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:53:38,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:53:39,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:53:40,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:53:40,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:53:45,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 20:53:45,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:53:45,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:53:47,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:53:48,328 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.926e+02 2.163e+02 2.567e+02 5.244e+02, threshold=4.326e+02, percent-clipped=4.0 2023-10-03 20:53:48,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 20:53:49,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1403026.6666666667, ans=0.0 2023-10-03 20:53:55,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:54:01,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:54:01,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:01,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 20:54:01,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:54:01,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 20:54:01,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:05,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 20:54:05,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 20:54:05,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:54:07,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:54:07,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1403093.3333333333, ans=0.0 2023-10-03 20:54:09,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:54:09,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 20:54:10,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:54:13,665 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.48 vs. limit=10.0 2023-10-03 20:54:14,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:54:14,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:54:15,551 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.75 vs. limit=5.0 2023-10-03 20:54:15,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 20:54:15,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:54:17,218 INFO [train.py:1046] (3/4) Epoch 40, batch 3300, loss[loss=0.1694, simple_loss=0.2484, pruned_loss=0.04527, over 24395.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2361, pruned_loss=0.03865, over 4716214.86 frames. ], batch size: 77, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:54:18,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 20:54:18,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 20:54:21,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:54:21,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 20:54:22,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 20:54:24,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 20:54:24,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:54:27,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:54:29,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:54:29,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:29,806 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.06 vs. limit=22.5 2023-10-03 20:54:31,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 20:54:31,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 20:54:33,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:54:34,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:54:39,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 20:54:40,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:54:40,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:54:41,305 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.75 vs. limit=10.0 2023-10-03 20:54:41,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:43,336 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 20:54:43,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:54:44,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:54:44,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 20:54:44,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:54:46,098 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 20:54:50,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:54:50,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 20:54:51,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1403293.3333333333, ans=0.1 2023-10-03 20:54:52,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:52,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 20:54:53,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 20:54:53,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:54:54,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 20:54:56,360 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 20:54:57,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 20:54:59,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:55:02,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 20:55:03,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:55:06,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 20:55:07,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:55:11,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:12,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:55:12,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:55:12,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:55:13,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 20:55:13,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:55:15,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:55:16,668 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 20:55:18,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 20:55:18,452 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:55:19,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 20:55:20,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:55:20,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:55:22,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:55:22,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:55:22,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:55:24,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:24,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 20:55:24,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1403426.6666666667, ans=0.125 2023-10-03 20:55:25,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:55:26,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 20:55:29,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 20:55:29,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:55:31,102 INFO [train.py:1046] (3/4) Epoch 40, batch 3350, loss[loss=0.147, simple_loss=0.2369, pruned_loss=0.02856, over 24463.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2369, pruned_loss=0.03897, over 4716281.39 frames. ], batch size: 63, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:55:31,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:32,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 20:55:33,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:55:36,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:36,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1403493.3333333333, ans=0.0 2023-10-03 20:55:36,908 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.57 vs. limit=22.5 2023-10-03 20:55:37,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 20:55:37,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:55:40,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:55:43,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:55:43,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 20:55:46,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:47,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:55:49,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:50,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:55:51,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 20:55:53,716 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 20:55:53,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:55:56,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 20:55:56,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 20:55:56,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 20:55:56,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:55:56,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1403560.0, ans=0.1 2023-10-03 20:55:58,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:55:58,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 20:55:59,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:55:59,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:56:00,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:56:02,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:02,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:56:04,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:56:09,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:10,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:10,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:14,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:56:16,073 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.915e+02 2.091e+02 2.361e+02 5.355e+02, threshold=4.181e+02, percent-clipped=1.0 2023-10-03 20:56:16,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:56:17,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:17,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:19,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:20,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 20:56:21,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 20:56:21,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 20:56:22,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:56:22,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 20:56:23,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:25,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:56:32,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:32,685 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-10-03 20:56:33,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 20:56:33,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:56:35,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 20:56:37,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:56:38,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1403760.0, ans=0.125 2023-10-03 20:56:40,408 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 20:56:41,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:56:44,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 20:56:46,171 INFO [train.py:1046] (3/4) Epoch 40, batch 3400, loss[loss=0.2142, simple_loss=0.2826, pruned_loss=0.07284, over 19639.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2387, pruned_loss=0.03958, over 4705841.99 frames. ], batch size: 388, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:56:46,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 20:56:46,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 20:56:47,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:56:47,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 20:56:49,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:56:49,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 20:56:50,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:56:51,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:56:52,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 20:56:53,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 20:56:53,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 20:56:57,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 20:56:58,000 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 20:56:58,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:57:02,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:57:02,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 20:57:02,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:57:03,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 20:57:05,546 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.95 vs. limit=15.0 2023-10-03 20:57:09,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:57:11,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 20:57:14,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 20:57:17,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:57:17,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:57:18,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 20:57:23,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:57:23,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1403960.0, ans=0.125 2023-10-03 20:57:26,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 20:57:29,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1404026.6666666667, ans=0.04949747468305833 2023-10-03 20:57:31,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:57:33,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:57:33,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 20:57:34,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:57:34,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:57:36,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:57:36,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 20:57:39,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:57:43,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 20:57:43,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:57:48,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:57:49,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 20:57:55,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:57:59,860 INFO [train.py:1046] (3/4) Epoch 40, batch 3450, loss[loss=0.1505, simple_loss=0.2264, pruned_loss=0.03731, over 23540.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2373, pruned_loss=0.03912, over 4706685.82 frames. ], batch size: 119, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 20:57:59,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 20:58:04,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 20:58:05,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:58:05,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1404160.0, ans=0.125 2023-10-03 20:58:07,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 20:58:07,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 20:58:07,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:58:12,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 20:58:16,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 20:58:18,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:58:18,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 20:58:18,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:58:20,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:58:27,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 20:58:29,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 20:58:31,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 20:58:31,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 20:58:32,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:58:38,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 20:58:38,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 20:58:38,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1404293.3333333333, ans=0.125 2023-10-03 20:58:40,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1404293.3333333333, ans=0.0 2023-10-03 20:58:42,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:58:42,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 20:58:43,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 20:58:43,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1404360.0, ans=0.0 2023-10-03 20:58:44,698 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.925e+02 2.071e+02 2.347e+02 3.387e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-03 20:58:46,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 20:58:47,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 20:58:47,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:58:47,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 20:58:48,495 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.60 vs. limit=22.5 2023-10-03 20:58:49,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1404360.0, ans=0.125 2023-10-03 20:58:50,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:58:54,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 20:58:58,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 20:59:03,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 20:59:04,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:08,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:59:12,318 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.24 vs. limit=15.0 2023-10-03 20:59:13,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:13,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 20:59:13,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 20:59:14,339 INFO [train.py:1046] (3/4) Epoch 40, batch 3500, loss[loss=0.1594, simple_loss=0.2333, pruned_loss=0.04274, over 23924.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2357, pruned_loss=0.03886, over 4713964.92 frames. ], batch size: 179, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 20:59:14,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 20:59:18,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:59:20,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 20:59:21,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 20:59:23,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 20:59:26,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 20:59:28,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 20:59:29,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 20:59:32,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 20:59:33,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 20:59:33,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 20:59:33,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 20:59:33,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 20:59:34,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:35,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:59:35,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 20:59:38,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:40,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 20:59:41,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:59:44,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:44,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 20:59:45,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 20:59:50,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 20:59:51,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 20:59:53,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 20:59:54,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 20:59:56,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 20:59:57,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 20:59:57,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 20:59:58,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 20:59:59,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:00:00,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:00:02,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:00:02,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:00:04,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 21:00:06,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:00:08,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:00:11,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 21:00:11,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 21:00:11,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:00:13,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:00:14,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1404760.0, ans=0.125 2023-10-03 21:00:15,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:00:16,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:00:19,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 21:00:19,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:00:21,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:00:23,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 21:00:24,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 21:00:27,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:00:29,206 INFO [train.py:1046] (3/4) Epoch 40, batch 3550, loss[loss=0.1473, simple_loss=0.2318, pruned_loss=0.03137, over 24308.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2346, pruned_loss=0.03839, over 4719313.11 frames. ], batch size: 61, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 21:00:29,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:00:29,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:00:30,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:00:31,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.83 vs. limit=22.5 2023-10-03 21:00:33,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:00:42,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:00:42,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 21:00:45,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:00:45,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1404893.3333333333, ans=0.2 2023-10-03 21:00:46,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:00:47,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:00:49,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:00:49,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:00:53,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:00:53,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:00:53,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:00:53,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 21:00:55,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:00:58,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1404960.0, ans=0.125 2023-10-03 21:00:59,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:01:00,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:01:01,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:01:01,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:01:01,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:01:01,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 21:01:01,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:01:03,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:01:04,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 21:01:09,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:10,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:01:10,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:13,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 21:01:13,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:01:14,494 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.948e+02 2.132e+02 2.372e+02 3.710e+02, threshold=4.263e+02, percent-clipped=0.0 2023-10-03 21:01:14,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 21:01:15,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:01:17,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:01:18,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:01:22,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 21:01:23,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:01:27,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:01:29,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 21:01:29,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:01:30,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:01:32,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 21:01:36,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1405093.3333333333, ans=0.125 2023-10-03 21:01:38,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 21:01:38,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:01:39,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:01:40,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1405093.3333333333, ans=0.2 2023-10-03 21:01:41,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:01:42,644 INFO [train.py:1046] (3/4) Epoch 40, batch 3600, loss[loss=0.1596, simple_loss=0.2299, pruned_loss=0.04469, over 23827.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2342, pruned_loss=0.03829, over 4707665.35 frames. ], batch size: 179, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:01:42,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:01:44,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:01:44,958 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.10 vs. limit=15.0 2023-10-03 21:01:46,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:01:48,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:48,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:01:49,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:01:50,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:50,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 21:01:53,467 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.60 vs. limit=15.0 2023-10-03 21:01:54,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:01:54,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:01:57,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:01:59,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:01:59,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:02:01,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:02:01,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 21:02:02,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:02:05,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:02:05,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:02:06,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1405226.6666666667, ans=0.0 2023-10-03 21:02:07,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:10,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:02:10,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:02:13,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 21:02:20,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:02:22,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:02:23,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 21:02:27,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:02:31,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:34,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:42,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:02:42,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:02:42,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 21:02:43,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 21:02:44,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 21:02:46,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:02:47,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:02:47,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 21:02:49,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:02:49,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:02:49,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:02:50,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 21:02:51,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 21:02:56,515 INFO [train.py:1046] (3/4) Epoch 40, batch 3650, loss[loss=0.1588, simple_loss=0.2507, pruned_loss=0.03349, over 24325.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2353, pruned_loss=0.03824, over 4715766.33 frames. ], batch size: 74, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:02:56,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:02:56,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 21:03:01,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 21:03:01,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:03:05,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 21:03:07,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 21:03:13,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:03:13,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:03:13,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:03:16,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 21:03:16,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:03:17,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 21:03:17,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:03:19,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:03:20,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 21:03:20,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 21:03:22,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:03:22,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:03:23,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:03:24,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 21:03:26,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 21:03:28,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:03:29,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 21:03:30,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:03:30,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:03:35,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:03:37,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:03:37,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:03:39,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:03:41,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:03:42,372 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 1.999e+02 2.151e+02 2.370e+02 3.014e+02, threshold=4.301e+02, percent-clipped=0.0 2023-10-03 21:03:44,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:03:45,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:03:47,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:03:47,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:03:48,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:03:49,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:03:49,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:03:54,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1405760.0, ans=0.0 2023-10-03 21:03:55,236 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 21:03:59,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:03:59,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:01,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:04:01,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:04:03,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:04:03,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:04:04,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 21:04:04,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:04:07,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:04:08,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1405826.6666666667, ans=0.125 2023-10-03 21:04:10,592 INFO [train.py:1046] (3/4) Epoch 40, batch 3700, loss[loss=0.1672, simple_loss=0.2515, pruned_loss=0.04142, over 23331.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2362, pruned_loss=0.03867, over 4698602.45 frames. ], batch size: 93, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:04:10,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:04:10,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:04:12,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:04:12,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 21:04:12,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:04:13,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:04:15,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:04:16,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:04:19,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:04:20,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:04:21,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:04:22,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:04:22,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 21:04:24,074 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:04:25,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:04:26,983 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 21:04:31,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1405893.3333333333, ans=0.035 2023-10-03 21:04:34,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:04:35,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:04:35,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:04:35,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1405893.3333333333, ans=0.125 2023-10-03 21:04:37,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 21:04:37,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:04:39,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:40,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 21:04:43,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:45,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:04:45,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1405960.0, ans=0.125 2023-10-03 21:04:47,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:04:49,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:04:50,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:04:54,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:04:54,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 21:04:56,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:04:56,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 21:05:02,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:05:02,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:05:05,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:05:06,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 21:05:08,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:05:08,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 21:05:08,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:05:08,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:05:08,681 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:05:10,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:05:12,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 21:05:13,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 21:05:14,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:05:14,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:05:16,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:05:16,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1406093.3333333333, ans=0.2 2023-10-03 21:05:17,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:05:20,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:05:23,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:05:23,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:05:24,776 INFO [train.py:1046] (3/4) Epoch 40, batch 3750, loss[loss=0.1549, simple_loss=0.245, pruned_loss=0.03239, over 24003.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2375, pruned_loss=0.03921, over 4699176.72 frames. ], batch size: 86, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:05:24,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 21:05:27,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 21:05:28,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:05:29,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 21:05:31,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:05:32,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:05:32,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:05:33,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:05:38,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:05:40,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1406226.6666666667, ans=0.0 2023-10-03 21:05:41,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:05:41,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:05:44,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:05:46,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:05:46,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 21:05:48,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:05:48,889 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.36 vs. limit=15.0 2023-10-03 21:05:49,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:05:50,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:05:53,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 21:05:56,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 21:05:59,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:05:59,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:06:01,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:06:01,864 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.29 vs. limit=15.0 2023-10-03 21:06:06,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:07,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 21:06:11,030 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.980e+02 2.164e+02 2.553e+02 4.062e+02, threshold=4.329e+02, percent-clipped=0.0 2023-10-03 21:06:12,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 21:06:15,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:18,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:06:18,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:06:21,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:06:25,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 21:06:27,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:06:29,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:06:31,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:06:34,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:06:38,475 INFO [train.py:1046] (3/4) Epoch 40, batch 3800, loss[loss=0.1782, simple_loss=0.2603, pruned_loss=0.04802, over 24074.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2373, pruned_loss=0.03905, over 4712672.21 frames. ], batch size: 80, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:06:40,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:06:40,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1406493.3333333333, ans=0.0 2023-10-03 21:06:44,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:06:46,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 21:06:46,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1406493.3333333333, ans=0.09899494936611666 2023-10-03 21:06:47,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 21:06:49,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:52,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:06:52,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 21:06:54,422 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.46 vs. limit=15.0 2023-10-03 21:06:55,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 21:06:55,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:06:56,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:06:58,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:06:59,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:06:59,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:06:59,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 21:07:04,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 21:07:05,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:07:05,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1406560.0, ans=0.2 2023-10-03 21:07:06,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:07:09,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:07:09,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:07:10,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 21:07:10,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:07:12,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:07:14,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:07:18,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 21:07:18,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 21:07:20,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:07:26,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1406693.3333333333, ans=0.125 2023-10-03 21:07:27,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:07:32,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:07:33,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 21:07:34,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 21:07:34,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:07:36,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:07:37,008 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.54 vs. limit=22.5 2023-10-03 21:07:37,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:07:40,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 21:07:43,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 21:07:43,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1406760.0, ans=0.0 2023-10-03 21:07:45,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 21:07:45,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:07:45,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:07:51,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:07:51,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:07:52,909 INFO [train.py:1046] (3/4) Epoch 40, batch 3850, loss[loss=0.1586, simple_loss=0.2301, pruned_loss=0.04355, over 23923.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2361, pruned_loss=0.03926, over 4697645.23 frames. ], batch size: 179, lr: 2.54e-03, grad_scale: 16.0 2023-10-03 21:07:57,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:07:58,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 21:07:58,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:08:00,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:08:03,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:08:05,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:08:07,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 21:08:07,635 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:08:07,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1406893.3333333333, ans=0.0 2023-10-03 21:08:08,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 21:08:16,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:17,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:08:19,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:08:19,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:08:23,185 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.69 vs. limit=15.0 2023-10-03 21:08:24,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:24,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:08:25,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:08:25,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:08:27,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:08:28,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:08:30,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:30,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:08:31,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 21:08:31,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 21:08:31,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:08:32,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:33,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1406960.0, ans=0.0 2023-10-03 21:08:34,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:36,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:36,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 21:08:39,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 21:08:40,422 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 1.946e+02 2.190e+02 2.397e+02 4.110e+02, threshold=4.380e+02, percent-clipped=0.0 2023-10-03 21:08:40,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:41,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 21:08:43,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 21:08:43,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1407026.6666666667, ans=0.1 2023-10-03 21:08:49,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:50,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:08:54,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:08:54,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 21:08:57,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 21:09:00,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:00,865 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.39 vs. limit=15.0 2023-10-03 21:09:01,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:03,400 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.55 vs. limit=6.0 2023-10-03 21:09:04,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:09:04,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:09:05,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:05,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:05,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:09:05,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 21:09:05,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:09:07,609 INFO [train.py:1046] (3/4) Epoch 40, batch 3900, loss[loss=0.1451, simple_loss=0.2338, pruned_loss=0.02817, over 24472.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2359, pruned_loss=0.03879, over 4713976.37 frames. ], batch size: 63, lr: 2.54e-03, grad_scale: 8.0 2023-10-03 21:09:09,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 21:09:09,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:09,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:11,235 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.96 vs. limit=15.0 2023-10-03 21:09:11,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:09:11,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:13,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:09:14,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:09:14,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:09:14,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:09:14,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 21:09:14,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:19,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:09:21,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:09:21,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:09:22,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:09:23,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:09:23,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:25,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:09:27,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 21:09:27,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:09:27,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1407226.6666666667, ans=0.0 2023-10-03 21:09:29,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 21:09:30,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:09:31,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 21:09:31,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 21:09:36,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:09:37,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:09:38,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:09:38,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:09:41,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:09:43,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:09:45,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:09:45,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:09:46,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1407293.3333333333, ans=0.1 2023-10-03 21:09:47,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:09:53,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:09:53,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:09:59,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:10:01,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:10:01,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.91 vs. limit=22.5 2023-10-03 21:10:04,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1407360.0, ans=0.0 2023-10-03 21:10:12,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:10:14,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:10:14,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 21:10:15,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 21:10:15,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:10:17,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 21:10:18,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:10:19,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 21:10:21,447 INFO [train.py:1046] (3/4) Epoch 40, batch 3950, loss[loss=0.1552, simple_loss=0.2395, pruned_loss=0.03549, over 24369.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2354, pruned_loss=0.0384, over 4703194.18 frames. ], batch size: 77, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:10:26,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:10:26,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 21:10:27,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:10:29,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:10:32,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:10:36,492 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 21:10:37,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:10:37,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 21:10:38,688 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 21:10:39,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:10:40,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.01 vs. limit=15.0 2023-10-03 21:10:41,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:10:41,874 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.37 vs. limit=15.0 2023-10-03 21:10:42,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:10:42,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:10:45,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 21:10:48,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:10:48,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:10:48,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:10:50,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:10:51,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:11:03,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:11:03,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:11:07,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.09 vs. limit=22.5 2023-10-03 21:11:09,378 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.960e+02 2.144e+02 2.428e+02 3.730e+02, threshold=4.288e+02, percent-clipped=0.0 2023-10-03 21:11:09,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 21:11:14,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 21:11:14,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 21:11:14,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:11:15,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:11:19,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1407760.0, ans=0.2 2023-10-03 21:11:23,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:11:23,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:11:24,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:11:24,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:11:24,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 21:11:27,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1407760.0, ans=0.5 2023-10-03 21:11:29,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:11:30,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:11:35,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 21:11:35,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1407826.6666666667, ans=0.125 2023-10-03 21:11:36,504 INFO [train.py:1046] (3/4) Epoch 40, batch 4000, loss[loss=0.1357, simple_loss=0.214, pruned_loss=0.0287, over 24358.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2359, pruned_loss=0.03878, over 4693504.70 frames. ], batch size: 56, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:11:42,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:11:48,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:11:51,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:11:51,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:11:53,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:11:53,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 21:11:53,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1407893.3333333333, ans=0.125 2023-10-03 21:11:54,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:11:54,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 21:11:54,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:11:54,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 21:11:56,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1407893.3333333333, ans=0.2 2023-10-03 21:11:57,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:12:02,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:12:02,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:12:02,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:12:02,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:12:02,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:12:03,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:12:06,382 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 21:12:07,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:12:07,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:12:09,262 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 21:12:11,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:12:11,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:12:17,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 21:12:18,389 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.21 vs. limit=22.5 2023-10-03 21:12:19,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:12:20,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:12:21,991 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 21:12:23,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:12:25,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 21:12:25,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:12:25,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:12:25,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1408026.6666666667, ans=0.0 2023-10-03 21:12:26,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:12:26,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1408026.6666666667, ans=0.125 2023-10-03 21:12:28,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:12:28,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:12:28,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:12:29,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 21:12:29,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:12:31,609 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 21:12:37,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:12:40,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 21:12:43,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1408093.3333333333, ans=0.0 2023-10-03 21:12:44,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:12:44,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:12:46,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:12:47,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:12:50,426 INFO [train.py:1046] (3/4) Epoch 40, batch 4050, loss[loss=0.205, simple_loss=0.2678, pruned_loss=0.07114, over 19115.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2366, pruned_loss=0.03882, over 4696721.52 frames. ], batch size: 388, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:12:50,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:12:53,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 21:12:54,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 21:12:57,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:12:57,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:12:59,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:12:59,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1408160.0, ans=0.1 2023-10-03 21:13:00,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:13:00,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:13:00,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1408160.0, ans=0.125 2023-10-03 21:13:03,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1408226.6666666667, ans=0.09899494936611666 2023-10-03 21:13:05,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:13:06,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:13:08,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 21:13:09,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:13:09,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:13:14,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:13:16,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:13:16,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1408226.6666666667, ans=0.1 2023-10-03 21:13:17,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 21:13:17,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1408226.6666666667, ans=0.125 2023-10-03 21:13:19,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1408293.3333333333, ans=0.5 2023-10-03 21:13:20,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 21:13:21,680 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 21:13:23,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:13:23,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1408293.3333333333, ans=0.0 2023-10-03 21:13:30,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 21:13:30,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:13:33,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:13:36,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:13:36,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:13:36,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:13:37,893 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.894e+02 2.100e+02 2.406e+02 4.274e+02, threshold=4.201e+02, percent-clipped=0.0 2023-10-03 21:13:41,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:13:43,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 21:13:43,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:13:45,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:13:46,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 21:13:48,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1408426.6666666667, ans=0.125 2023-10-03 21:13:51,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:13:58,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 21:13:58,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:13:58,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:14:01,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 21:14:01,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 21:14:01,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:14:04,141 INFO [train.py:1046] (3/4) Epoch 40, batch 4100, loss[loss=0.1441, simple_loss=0.2297, pruned_loss=0.02929, over 24647.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2372, pruned_loss=0.03914, over 4699921.88 frames. ], batch size: 65, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:14:04,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:14:05,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:05,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:14:13,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 21:14:14,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 21:14:15,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 21:14:16,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 21:14:16,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:14:16,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1408493.3333333333, ans=0.2 2023-10-03 21:14:18,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:18,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:18,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:14:19,666 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 21:14:22,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:14:23,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:14:23,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:14:25,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:14:27,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:14:29,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:14:29,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:14:31,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 21:14:32,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:32,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:14:32,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:14:32,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:14:32,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 21:14:35,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:14:37,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 21:14:37,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1408626.6666666667, ans=0.0 2023-10-03 21:14:40,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:14:41,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:14:41,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 21:14:44,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:14:44,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:14:44,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:14:47,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 21:14:49,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:14:49,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:14:52,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 21:14:52,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:14:53,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:14:56,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:14:59,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:15:02,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:15:02,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:15:10,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:15:10,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:15:11,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1408760.0, ans=0.2 2023-10-03 21:15:13,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:15:16,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:15:18,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1408826.6666666667, ans=0.125 2023-10-03 21:15:19,456 INFO [train.py:1046] (3/4) Epoch 40, batch 4150, loss[loss=0.1466, simple_loss=0.2326, pruned_loss=0.03028, over 24485.00 frames. ], tot_loss[loss=0.158, simple_loss=0.2377, pruned_loss=0.03915, over 4710258.82 frames. ], batch size: 66, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:15:19,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:15:19,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:15:20,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:15:20,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:15:22,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 21:15:22,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1408826.6666666667, ans=0.0 2023-10-03 21:15:23,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:15:23,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 21:15:25,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 21:15:25,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 21:15:26,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:15:31,439 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.07 vs. limit=10.0 2023-10-03 21:15:32,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:15:32,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:15:32,724 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.06 vs. limit=15.0 2023-10-03 21:15:35,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:15:35,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:15:36,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:15:38,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:15:38,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:15:38,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1408893.3333333333, ans=0.0 2023-10-03 21:15:40,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:15:45,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:15:50,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:15:51,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 21:15:53,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 21:15:53,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:15:54,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 21:15:54,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:15:54,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:15:57,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:15:58,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:16:01,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 21:16:04,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:16:06,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:16:06,620 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.58 vs. limit=15.0 2023-10-03 21:16:07,375 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 2.015e+02 2.186e+02 2.538e+02 3.737e+02, threshold=4.372e+02, percent-clipped=0.0 2023-10-03 21:16:07,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 21:16:07,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:16:09,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 21:16:10,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:16:12,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:16:14,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:16:16,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 21:16:16,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:16,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 21:16:17,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:16:20,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 21:16:20,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:16:20,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:16:20,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:16:21,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 21:16:21,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:16:21,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 21:16:23,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:16:24,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:16:24,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 21:16:24,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:16:30,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:16:31,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 21:16:33,344 INFO [train.py:1046] (3/4) Epoch 40, batch 4200, loss[loss=0.1419, simple_loss=0.2219, pruned_loss=0.03096, over 23230.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2358, pruned_loss=0.03915, over 4693757.18 frames. ], batch size: 93, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:16:33,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:16:36,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:16:37,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:16:38,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:16:38,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:16:40,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 21:16:44,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 21:16:44,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:46,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:16:48,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:16:51,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 21:16:54,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:16:54,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:55,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 21:16:55,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:16:56,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:16:56,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:16:56,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:16:58,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:16:58,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1409226.6666666667, ans=0.0 2023-10-03 21:17:01,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 21:17:02,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:17:05,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 21:17:07,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:17:07,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1409293.3333333333, ans=0.125 2023-10-03 21:17:09,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:17:11,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:17:13,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:17:13,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 21:17:13,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:17:15,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:17:20,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:17:22,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:17:24,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1409360.0, ans=0.2 2023-10-03 21:17:26,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:17:29,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 21:17:31,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:17:31,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1409426.6666666667, ans=0.05 2023-10-03 21:17:35,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:17:37,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:17:38,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 21:17:44,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:17:48,467 INFO [train.py:1046] (3/4) Epoch 40, batch 4250, loss[loss=0.1304, simple_loss=0.2049, pruned_loss=0.028, over 24504.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2355, pruned_loss=0.03891, over 4696114.50 frames. ], batch size: 58, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:17:50,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:17:50,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:17:51,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:17:56,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:17:56,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 21:17:56,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:17:59,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:00,301 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.74 vs. limit=12.0 2023-10-03 21:18:02,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:18:07,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:07,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:08,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:18:08,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:18:11,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:11,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:13,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:14,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1409560.0, ans=0.125 2023-10-03 21:18:15,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:18:15,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:18:16,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 21:18:21,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 21:18:21,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:22,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:18:22,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:18:24,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:18:24,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:24,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:18:29,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:18:29,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:18:34,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:18:35,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:18:36,953 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.902e+02 2.076e+02 2.432e+02 4.125e+02, threshold=4.152e+02, percent-clipped=0.0 2023-10-03 21:18:37,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 21:18:37,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:18:38,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 21:18:39,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:18:42,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:18:43,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:43,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:18:47,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 21:18:47,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1409760.0, ans=0.0 2023-10-03 21:18:48,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:18:48,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:18:53,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:18:55,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:18:56,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:18:56,787 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:18:57,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:18:57,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:18:59,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1409760.0, ans=0.125 2023-10-03 21:19:00,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:19:00,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:19:00,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 21:19:02,333 INFO [train.py:1046] (3/4) Epoch 40, batch 4300, loss[loss=0.1712, simple_loss=0.2649, pruned_loss=0.0388, over 24326.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2349, pruned_loss=0.03872, over 4678481.66 frames. ], batch size: 77, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:19:02,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:19:02,721 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:19:02,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1409826.6666666667, ans=0.125 2023-10-03 21:19:05,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1409826.6666666667, ans=0.125 2023-10-03 21:19:06,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:19:06,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:19:09,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:19:18,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:19:18,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 21:19:20,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:19:21,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:19:21,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:19:23,462 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 21:19:24,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:19:26,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:19:30,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 21:19:32,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:19:32,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 21:19:33,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 21:19:33,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1409960.0, ans=0.1 2023-10-03 21:19:35,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:19:37,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:19:37,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:19:40,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:19:40,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:19:41,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:19:41,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 21:19:43,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 21:19:46,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:19:48,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1410026.6666666667, ans=0.125 2023-10-03 21:19:49,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:19:49,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:19:49,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:19:49,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:19:49,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 21:19:49,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 21:19:49,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 21:19:51,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:19:51,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 21:19:52,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 21:19:53,165 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.95 vs. limit=22.5 2023-10-03 21:19:56,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:19:58,542 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 21:20:00,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:20:01,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:01,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:20:03,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 21:20:03,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:20:03,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:04,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:20:04,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:20:05,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:20:06,963 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.56 vs. limit=15.0 2023-10-03 21:20:07,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:20:08,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:10,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:10,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:20:15,528 INFO [train.py:1046] (3/4) Epoch 40, batch 4350, loss[loss=0.1393, simple_loss=0.2219, pruned_loss=0.02833, over 20278.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2353, pruned_loss=0.03854, over 4690987.52 frames. ], batch size: 44, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:20:17,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 21:20:17,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:20:22,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:20:23,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:27,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:20:27,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:20:33,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:20:36,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:20:38,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:20:39,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:20:41,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:20:42,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1410226.6666666667, ans=0.125 2023-10-03 21:20:43,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:20:44,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1410293.3333333333, ans=0.125 2023-10-03 21:20:45,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:20:50,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 21:20:51,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:20:52,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:56,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:20:58,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 21:20:58,724 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.55 vs. limit=15.0 2023-10-03 21:21:01,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1410360.0, ans=0.1 2023-10-03 21:21:02,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:21:02,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:21:04,098 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.055e+02 2.301e+02 2.803e+02 4.605e+02, threshold=4.602e+02, percent-clipped=1.0 2023-10-03 21:21:08,376 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 21:21:08,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:09,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:21:11,082 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 21:21:13,757 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 21:21:13,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:21:13,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:21:15,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:21:15,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:16,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:21:16,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:21:19,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 21:21:20,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:20,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:21:20,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:20,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 21:21:22,478 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 21:21:22,490 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 21:21:22,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 21:21:25,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:21:25,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:21:25,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:21:26,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:21:27,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 21:21:28,571 INFO [train.py:1046] (3/4) Epoch 40, batch 4400, loss[loss=0.2123, simple_loss=0.2758, pruned_loss=0.07444, over 19192.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2365, pruned_loss=0.03891, over 4690831.10 frames. ], batch size: 389, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:21:28,743 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 21:21:30,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:33,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:21:33,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:35,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:21:38,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 21:21:38,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 21:21:38,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 21:21:38,117 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 21:21:39,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:21:39,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:21:40,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 21:21:42,989 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.19 vs. limit=15.0 2023-10-03 21:21:43,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:21:45,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:21:45,039 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 21:21:47,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:21:47,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 21:21:47,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 21:21:49,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 21:21:51,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 21:21:51,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 21:21:51,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:21:53,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:53,257 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:21:54,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:21:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:21:55,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 21:21:57,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 21:21:57,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:21:59,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:21:59,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:22:01,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:03,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:22:03,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 21:22:03,681 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 21:22:06,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:12,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:22:15,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 21:22:19,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:22:23,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:22:24,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1410693.3333333333, ans=0.125 2023-10-03 21:22:25,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:22:25,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 21:22:25,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:22:25,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:22:25,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:22:27,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:22:30,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 21:22:32,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 21:22:34,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 21:22:34,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:22:34,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 21:22:36,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:22:38,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1410760.0, ans=0.1 2023-10-03 21:22:39,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:22:40,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 21:22:43,746 INFO [train.py:1046] (3/4) Epoch 40, batch 4450, loss[loss=0.1633, simple_loss=0.242, pruned_loss=0.04225, over 23420.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2377, pruned_loss=0.03944, over 4690156.11 frames. ], batch size: 93, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:22:43,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:22:46,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:46,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:22:46,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1410826.6666666667, ans=0.125 2023-10-03 21:22:52,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:22:52,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:22:57,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:22:58,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:23:00,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:23:00,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:23:02,247 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.24 vs. limit=12.0 2023-10-03 21:23:03,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 21:23:03,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:23:04,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:23:04,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:23:04,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:23:06,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1410893.3333333333, ans=0.125 2023-10-03 21:23:07,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 21:23:10,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:11,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:13,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:23:13,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:23:14,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:23:17,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 21:23:18,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 21:23:18,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 21:23:18,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:23:22,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:23:23,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 21:23:26,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:23:30,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:32,400 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.956e+02 2.159e+02 2.589e+02 3.505e+02, threshold=4.319e+02, percent-clipped=0.0 2023-10-03 21:23:32,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 21:23:32,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:23:32,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:23:32,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:23:32,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:23:33,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:23:38,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:23:38,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1411026.6666666667, ans=0.1 2023-10-03 21:23:39,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 21:23:39,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1411026.6666666667, ans=0.0 2023-10-03 21:23:41,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:23:42,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:23:42,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1411093.3333333333, ans=0.125 2023-10-03 21:23:43,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:23:46,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:23:47,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 21:23:49,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1411093.3333333333, ans=0.2 2023-10-03 21:23:50,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:23:50,777 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:23:52,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 21:23:53,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:23:56,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1411160.0, ans=0.0 2023-10-03 21:23:57,170 INFO [train.py:1046] (3/4) Epoch 40, batch 4500, loss[loss=0.168, simple_loss=0.2602, pruned_loss=0.03786, over 24447.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2383, pruned_loss=0.03963, over 4702277.61 frames. ], batch size: 69, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:23:59,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:24:01,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 21:24:01,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 21:24:02,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:24:06,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:24:07,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:24:07,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:24:09,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:24:09,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:24:10,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:24:20,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:24:21,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:24:24,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:24:25,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:24:27,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:24:32,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:24:35,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:24:37,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1411293.3333333333, ans=0.2 2023-10-03 21:24:40,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:24:43,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:24:43,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 21:24:43,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:24:43,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:24:46,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:24:47,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:24:50,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:24:50,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 21:24:50,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:24:50,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:24:54,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:24:55,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:24:58,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:24:58,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1411426.6666666667, ans=0.125 2023-10-03 21:25:02,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:25:02,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:25:03,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 21:25:05,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 21:25:05,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 21:25:08,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 21:25:11,777 INFO [train.py:1046] (3/4) Epoch 40, batch 4550, loss[loss=0.1403, simple_loss=0.2002, pruned_loss=0.04017, over 22623.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2366, pruned_loss=0.03922, over 4699552.46 frames. ], batch size: 322, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:25:11,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 21:25:13,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:25:17,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:25:17,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:25:21,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:25:24,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:25:25,059 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.32 vs. limit=15.0 2023-10-03 21:25:26,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:25:27,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:25:27,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:25:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:29,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:25:30,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:25:33,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:25:35,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1411560.0, ans=0.04949747468305833 2023-10-03 21:25:36,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 21:25:36,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 21:25:38,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:25:39,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 21:25:44,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 21:25:44,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:25:46,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 21:25:47,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1411626.6666666667, ans=0.0 2023-10-03 21:25:48,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:25:52,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:52,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:25:53,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:25:55,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 21:25:57,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:25:59,968 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.936e+02 2.088e+02 2.339e+02 3.946e+02, threshold=4.176e+02, percent-clipped=0.0 2023-10-03 21:26:00,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:26:00,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:26:01,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:26:03,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 21:26:03,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 21:26:03,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:26:04,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 21:26:04,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1411693.3333333333, ans=0.0 2023-10-03 21:26:06,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 21:26:06,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:26:08,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:08,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:26:09,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:26:09,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:26:12,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:26:12,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 21:26:14,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:26:14,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 21:26:15,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 21:26:15,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:26:15,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 21:26:18,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:26:18,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:26:21,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:26:22,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:26:22,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:26:23,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:26:25,682 INFO [train.py:1046] (3/4) Epoch 40, batch 4600, loss[loss=0.145, simple_loss=0.228, pruned_loss=0.03104, over 24292.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2351, pruned_loss=0.03882, over 4702382.62 frames. ], batch size: 61, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:26:25,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:26:27,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1411826.6666666667, ans=0.0 2023-10-03 21:26:29,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:30,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:26:31,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:26:33,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:26:33,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:26:34,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 21:26:36,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:26:40,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:26:40,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:26:41,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1411893.3333333333, ans=0.125 2023-10-03 21:26:42,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:49,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 21:26:49,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1411893.3333333333, ans=0.0 2023-10-03 21:26:50,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:54,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:26:58,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:26:58,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:27:03,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 21:27:03,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:27:03,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:27:08,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:08,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:27:10,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:27:13,851 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.41 vs. limit=15.0 2023-10-03 21:27:14,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 21:27:16,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:27:19,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:19,497 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:27:20,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:27:23,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:23,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 21:27:24,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:24,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 21:27:24,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:24,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:27:26,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:28,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:27:28,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:27:29,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 21:27:29,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 21:27:30,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.whiten.whitening_limit, batch_count=1412093.3333333333, ans=12.0 2023-10-03 21:27:30,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 21:27:30,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:27:32,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:27:33,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:27:33,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:27:39,900 INFO [train.py:1046] (3/4) Epoch 40, batch 4650, loss[loss=0.1586, simple_loss=0.2457, pruned_loss=0.03574, over 24588.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2351, pruned_loss=0.03856, over 4704319.03 frames. ], batch size: 71, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:27:41,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:27:44,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:27:45,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:47,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:27:47,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:27:47,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:27:49,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:27:49,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1412160.0, ans=0.2 2023-10-03 21:27:50,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 21:27:53,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:27:54,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1412226.6666666667, ans=0.0 2023-10-03 21:27:56,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 21:27:56,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:27:57,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 21:27:58,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:27:59,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 21:27:59,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 21:27:59,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:27:59,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1412226.6666666667, ans=0.125 2023-10-03 21:28:00,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:28:02,647 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.63 vs. limit=15.0 2023-10-03 21:28:03,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:28:06,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:06,608 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 21:28:09,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:11,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 21:28:11,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1412293.3333333333, ans=0.0 2023-10-03 21:28:13,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:13,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:28:15,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 21:28:15,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:28:16,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1412293.3333333333, ans=0.125 2023-10-03 21:28:19,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:28:24,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:28:28,248 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.889e+02 2.069e+02 2.261e+02 3.657e+02, threshold=4.138e+02, percent-clipped=0.0 2023-10-03 21:28:28,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:28,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1412360.0, ans=0.125 2023-10-03 21:28:28,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1412360.0, ans=0.125 2023-10-03 21:28:28,675 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 21:28:31,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:31,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:28:32,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:28:34,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 21:28:34,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 21:28:35,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 21:28:35,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 21:28:35,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1412360.0, ans=0.05 2023-10-03 21:28:37,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1412426.6666666667, ans=0.0 2023-10-03 21:28:38,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:28:41,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1412426.6666666667, ans=0.0 2023-10-03 21:28:46,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:28:46,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:28:46,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 21:28:47,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:28:50,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:28:50,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:28:50,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:28:51,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:28:52,974 INFO [train.py:1046] (3/4) Epoch 40, batch 4700, loss[loss=0.1324, simple_loss=0.2064, pruned_loss=0.02921, over 22683.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2355, pruned_loss=0.0383, over 4717810.60 frames. ], batch size: 50, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:28:53,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:28:53,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:28:56,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:28:56,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:28:56,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:28:57,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 21:28:59,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:29:00,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 21:29:07,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:09,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:29:09,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:29:10,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:29:10,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1412560.0, ans=0.1 2023-10-03 21:29:11,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:29:16,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 21:29:16,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 21:29:18,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:19,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:29:19,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:29:19,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1412560.0, ans=0.125 2023-10-03 21:29:21,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:27,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:29:27,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 21:29:30,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:29:36,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 21:29:37,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:29:39,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:43,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 21:29:43,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1412693.3333333333, ans=0.125 2023-10-03 21:29:45,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:29:45,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1412693.3333333333, ans=0.1 2023-10-03 21:29:47,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.24 vs. limit=22.5 2023-10-03 21:29:49,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:29:49,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 21:29:51,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:51,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:29:51,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1412760.0, ans=0.05 2023-10-03 21:29:55,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:29:55,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:29:55,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 21:29:57,005 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 21:29:58,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:29:58,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:58,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:29:58,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 21:29:59,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:30:05,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 21:30:07,645 INFO [train.py:1046] (3/4) Epoch 40, batch 4750, loss[loss=0.1693, simple_loss=0.2521, pruned_loss=0.04326, over 23695.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2364, pruned_loss=0.03832, over 4726472.07 frames. ], batch size: 85, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:30:07,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:30:09,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:30:09,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1412826.6666666667, ans=0.2 2023-10-03 21:30:12,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1412826.6666666667, ans=0.1 2023-10-03 21:30:13,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:30:13,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:30:14,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 21:30:16,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:30:18,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 21:30:20,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:30:20,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:30:20,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:30:25,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 21:30:27,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1412893.3333333333, ans=0.09899494936611666 2023-10-03 21:30:29,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:30:31,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 21:30:32,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:30:35,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:30:35,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:30:36,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:30:36,786 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 21:30:36,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 21:30:41,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1412960.0, ans=0.125 2023-10-03 21:30:44,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 21:30:47,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:30:48,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:30:49,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1412960.0, ans=0.0 2023-10-03 21:30:50,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:30:50,349 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 21:30:50,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:30:53,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:30:53,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1413026.6666666667, ans=0.2 2023-10-03 21:30:56,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:30:57,304 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.943e+02 2.101e+02 2.308e+02 3.375e+02, threshold=4.202e+02, percent-clipped=0.0 2023-10-03 21:30:59,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 21:30:59,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 21:31:00,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:31:00,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:31:00,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:31:02,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:31:03,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 21:31:06,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 21:31:09,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:12,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:31:12,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 21:31:12,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:31:13,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:15,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:31:15,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:31:17,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 21:31:18,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1413093.3333333333, ans=0.5 2023-10-03 21:31:20,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:31:21,565 INFO [train.py:1046] (3/4) Epoch 40, batch 4800, loss[loss=0.1768, simple_loss=0.2592, pruned_loss=0.04719, over 23739.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2378, pruned_loss=0.03832, over 4727976.27 frames. ], batch size: 85, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:31:21,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 21:31:21,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 21:31:22,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 21:31:24,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:31:25,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:31:27,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 21:31:31,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:31:33,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:37,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:31:38,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1413226.6666666667, ans=0.125 2023-10-03 21:31:39,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:31:39,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:31:39,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1413226.6666666667, ans=0.1 2023-10-03 21:31:40,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 21:31:40,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:31:42,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:31:43,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:31:45,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1413226.6666666667, ans=0.07 2023-10-03 21:31:48,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:31:48,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:48,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:31:48,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1413226.6666666667, ans=0.125 2023-10-03 21:31:49,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:49,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 21:31:49,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:51,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:31:53,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:31:55,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:57,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1413293.3333333333, ans=0.125 2023-10-03 21:31:58,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:31:58,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:32:02,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:32:02,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:03,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1413293.3333333333, ans=0.0 2023-10-03 21:32:04,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 21:32:04,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 21:32:06,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:06,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:32:06,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1413293.3333333333, ans=0.0 2023-10-03 21:32:07,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:32:07,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:32:07,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:32:10,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:32:11,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:32:11,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1413360.0, ans=0.0 2023-10-03 21:32:11,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1413360.0, ans=0.0 2023-10-03 21:32:13,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:32:14,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1413360.0, ans=0.0 2023-10-03 21:32:15,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:17,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1413360.0, ans=0.125 2023-10-03 21:32:19,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:22,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 21:32:22,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:32:22,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:22,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:32:23,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:26,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:32:28,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:32:28,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:28,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:32:29,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:32:29,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:32:29,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1413426.6666666667, ans=0.125 2023-10-03 21:32:34,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:34,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:34,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:32:35,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 21:32:37,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 21:32:37,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:32:37,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:32:38,331 INFO [train.py:1046] (3/4) Epoch 40, batch 4850, loss[loss=0.1492, simple_loss=0.2311, pruned_loss=0.03363, over 23444.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2374, pruned_loss=0.03807, over 4739071.41 frames. ], batch size: 119, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:32:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:32:38,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:41,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:32:44,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1413493.3333333333, ans=0.125 2023-10-03 21:32:47,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 21:32:48,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:32:52,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:32:53,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:32:53,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:32:57,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:33:01,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:33:01,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:33:01,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 21:33:04,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:33:07,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:33:07,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 21:33:08,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:33:08,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 21:33:12,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:33:12,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:16,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:16,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 21:33:16,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 21:33:18,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:33:25,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:33:27,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 21:33:27,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:33:27,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:33:28,689 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 1.943e+02 2.210e+02 2.558e+02 3.580e+02, threshold=4.421e+02, percent-clipped=0.0 2023-10-03 21:33:28,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:33:30,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 21:33:30,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:30,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 21:33:30,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1413693.3333333333, ans=0.1 2023-10-03 21:33:31,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:33:33,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:33:33,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 21:33:43,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:33:48,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:33:49,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:33:52,392 INFO [train.py:1046] (3/4) Epoch 40, batch 4900, loss[loss=0.148, simple_loss=0.2127, pruned_loss=0.04164, over 23629.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2365, pruned_loss=0.0379, over 4723023.39 frames. ], batch size: 256, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:33:55,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 21:33:55,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:33:59,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:34:01,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:34:01,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:34:05,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 21:34:09,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 21:34:13,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 21:34:13,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1413893.3333333333, ans=0.04949747468305833 2023-10-03 21:34:14,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 21:34:14,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:34:14,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:34:14,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:34:14,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:34:14,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:34:15,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 21:34:17,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 21:34:19,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:34:20,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:34:20,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:34:23,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:34:24,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:34:27,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:34:27,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 21:34:28,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:34:29,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:34:29,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 21:34:29,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 21:34:32,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 21:34:34,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:34:34,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:34:36,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:34:37,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:34:37,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 21:34:37,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:34:39,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 21:34:41,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:34:43,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:34:46,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:34:48,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 21:34:49,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:34:50,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 21:34:50,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 21:34:51,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1414093.3333333333, ans=0.0 2023-10-03 21:34:53,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1414093.3333333333, ans=0.0 2023-10-03 21:34:56,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1414093.3333333333, ans=0.125 2023-10-03 21:34:58,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:34:59,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:35:01,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 21:35:01,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 21:35:01,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:35:02,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:35:04,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1414093.3333333333, ans=0.1 2023-10-03 21:35:06,478 INFO [train.py:1046] (3/4) Epoch 40, batch 4950, loss[loss=0.1551, simple_loss=0.2415, pruned_loss=0.03437, over 24471.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2344, pruned_loss=0.03774, over 4705320.09 frames. ], batch size: 69, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:35:06,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:35:06,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:35:06,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1414160.0, ans=0.05 2023-10-03 21:35:07,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:35:07,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 21:35:08,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:35:08,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1414160.0, ans=0.0 2023-10-03 21:35:13,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:35:13,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 21:35:14,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 21:35:15,066 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.36 vs. limit=15.0 2023-10-03 21:35:15,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 21:35:15,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:35:15,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 21:35:17,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:17,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:35:17,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:35:17,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:18,452 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.71 vs. limit=15.0 2023-10-03 21:35:20,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:35:20,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:35:20,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:35:21,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:35:24,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:24,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:35:29,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:35:32,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1414226.6666666667, ans=0.1 2023-10-03 21:35:33,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:36,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:35:37,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:37,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:39,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:35:42,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 21:35:43,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 21:35:45,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:46,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:35:46,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:35:48,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:35:48,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:35:50,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:35:50,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1414360.0, ans=0.125 2023-10-03 21:35:52,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:35:54,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:35:56,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:35:58,278 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.018e+02 2.178e+02 2.432e+02 3.766e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-03 21:35:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:35:58,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:35:58,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 21:35:58,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:36:00,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:36:02,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1414360.0, ans=0.1 2023-10-03 21:36:03,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:36:04,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:36:04,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:36:04,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:36:05,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:36:07,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:36:07,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1414426.6666666667, ans=0.125 2023-10-03 21:36:08,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:36:09,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:36:10,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:36:11,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 21:36:13,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1414426.6666666667, ans=0.2 2023-10-03 21:36:16,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:36:20,746 INFO [train.py:1046] (3/4) Epoch 40, batch 5000, loss[loss=0.1697, simple_loss=0.2613, pruned_loss=0.03906, over 24438.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2338, pruned_loss=0.03719, over 4712546.29 frames. ], batch size: 69, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:36:20,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 21:36:20,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 21:36:21,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1414493.3333333333, ans=0.125 2023-10-03 21:36:28,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:36:28,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:36:29,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 21:36:31,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 21:36:34,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:36:35,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 21:36:35,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:36:35,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:36:35,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 21:36:36,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:36:37,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:36:38,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 21:36:38,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:36:38,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:36:39,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 21:36:40,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1414560.0, ans=0.09899494936611666 2023-10-03 21:36:40,370 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.12 vs. limit=6.0 2023-10-03 21:36:41,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 21:36:41,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:36:41,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 21:36:41,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:36:43,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:36:43,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:36:43,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 21:36:44,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 21:36:47,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 21:36:47,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:36:47,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:36:48,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 21:36:48,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:36:49,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1414626.6666666667, ans=0.1 2023-10-03 21:36:52,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:36:52,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:36:53,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 21:36:53,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1414626.6666666667, ans=0.125 2023-10-03 21:36:54,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 21:36:54,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:36:56,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1414626.6666666667, ans=0.07 2023-10-03 21:36:57,089 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.81 vs. limit=10.0 2023-10-03 21:36:57,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:37:02,576 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 21:37:04,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:37:05,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:37:05,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:06,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 21:37:08,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:37:08,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:37:08,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:37:10,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 21:37:12,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:37:14,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:37:16,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:37:22,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 21:37:24,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:33,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:37:34,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:34,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:37:35,420 INFO [train.py:1046] (3/4) Epoch 40, batch 5050, loss[loss=0.1584, simple_loss=0.2521, pruned_loss=0.0323, over 24363.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2355, pruned_loss=0.03768, over 4715796.36 frames. ], batch size: 74, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:37:35,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:37:35,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:37:36,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:37:36,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:40,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:37:41,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 21:37:41,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1414826.6666666667, ans=0.125 2023-10-03 21:37:42,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:37:43,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:37:45,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:37:45,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 21:37:47,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:37:48,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:37:49,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:37:51,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:37:51,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:37:58,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 21:37:59,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:38:01,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:38:01,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 21:38:02,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:38:03,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:38:03,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:38:04,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:38:04,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 21:38:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 21:38:06,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:38:07,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:38:10,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:38:11,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 21:38:14,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:38:18,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 21:38:18,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:38:18,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:38:19,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:38:19,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:38:21,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:38:24,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:38:24,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:24,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:38:24,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:38:24,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 21:38:26,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1415026.6666666667, ans=0.1 2023-10-03 21:38:27,413 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.936e+02 2.098e+02 2.306e+02 3.294e+02, threshold=4.196e+02, percent-clipped=0.0 2023-10-03 21:38:27,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:38:27,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:38:32,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:38:32,314 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 21:38:32,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:38:35,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:38:36,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:36,381 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 21:38:39,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:38:39,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 21:38:39,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:41,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:38:42,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:38:42,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 21:38:43,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 21:38:46,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:38:46,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:38:48,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:38:49,733 INFO [train.py:1046] (3/4) Epoch 40, batch 5100, loss[loss=0.1461, simple_loss=0.2314, pruned_loss=0.03042, over 24517.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.236, pruned_loss=0.03791, over 4711770.07 frames. ], batch size: 66, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:38:51,124 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 21:38:53,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:38:53,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1415160.0, ans=0.1 2023-10-03 21:38:55,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 21:38:57,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 21:38:57,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:38:58,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:39:01,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:39:03,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 21:39:03,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 21:39:03,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1415226.6666666667, ans=0.0 2023-10-03 21:39:06,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1415226.6666666667, ans=0.125 2023-10-03 21:39:07,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:39:08,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:39:11,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:39:13,538 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.43 vs. limit=15.0 2023-10-03 21:39:14,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 21:39:15,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:39:17,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:39:17,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 21:39:20,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:22,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:22,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 21:39:23,950 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 21:39:25,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:26,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 21:39:26,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 21:39:30,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:39:39,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:39:41,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 21:39:42,467 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 21:39:42,483 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 21:39:42,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1415360.0, ans=0.0 2023-10-03 21:39:45,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 21:39:45,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:39:46,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 21:39:50,181 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.14 vs. limit=12.0 2023-10-03 21:39:50,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 21:39:52,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 21:39:54,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:39:55,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 21:39:57,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1415426.6666666667, ans=0.1 2023-10-03 21:39:59,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 21:39:59,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 21:40:02,316 INFO [train.py:1046] (3/4) Epoch 40, batch 5150, loss[loss=0.1575, simple_loss=0.2412, pruned_loss=0.03692, over 24367.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2379, pruned_loss=0.03865, over 4718080.61 frames. ], batch size: 77, lr: 2.53e-03, grad_scale: 8.0 2023-10-03 21:40:05,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:40:05,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:40:05,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:40:06,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:40:06,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:40:06,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:40:08,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 21:40:08,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 21:40:08,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 21:40:09,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:40:09,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 21:40:11,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:40:12,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 21:40:14,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:40:15,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:40:17,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1415560.0, ans=0.1 2023-10-03 21:40:18,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:40:19,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 21:40:21,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:40:21,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:40:22,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 21:40:22,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:40:22,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:40:24,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:40:24,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:40:25,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 21:40:27,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:40:27,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:40:30,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:40:30,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 21:40:32,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:40:32,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1415626.6666666667, ans=0.125 2023-10-03 21:40:38,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:40:39,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 21:40:42,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:40:45,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1415693.3333333333, ans=0.125 2023-10-03 21:40:49,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:40:49,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:40:52,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:40:53,867 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.916e+02 2.105e+02 2.538e+02 3.935e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-03 21:40:53,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:40:55,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 21:41:00,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:41:00,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1415760.0, ans=0.1 2023-10-03 21:41:01,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:41:01,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:41:05,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:41:07,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:41:07,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 21:41:11,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:41:11,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:41:13,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:41:13,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:41:14,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:41:14,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:41:14,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:41:14,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:41:16,089 INFO [train.py:1046] (3/4) Epoch 40, batch 5200, loss[loss=0.148, simple_loss=0.2371, pruned_loss=0.02945, over 24592.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2387, pruned_loss=0.03908, over 4714976.81 frames. ], batch size: 71, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:41:18,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:41:21,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:41:23,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:41:25,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1415826.6666666667, ans=0.125 2023-10-03 21:41:26,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 21:41:26,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:41:26,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1415826.6666666667, ans=0.0 2023-10-03 21:41:28,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:41:28,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1415826.6666666667, ans=0.125 2023-10-03 21:41:31,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:41:31,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:41:31,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:41:32,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 21:41:36,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:41:36,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:41:37,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1415893.3333333333, ans=0.125 2023-10-03 21:41:39,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 21:41:41,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:41:43,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:41:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 21:41:44,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 21:41:47,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 21:41:47,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:41:47,407 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 21:41:48,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:41:50,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:41:50,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:41:50,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 21:41:51,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:41:54,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:41:56,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 21:41:56,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 21:41:56,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1415960.0, ans=0.1 2023-10-03 21:41:57,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 21:42:02,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 21:42:03,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:42:09,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:42:09,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:42:10,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 21:42:11,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:42:11,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 21:42:11,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:42:12,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:42:14,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:42:15,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:42:18,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:42:18,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:42:18,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:42:25,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:42:25,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 21:42:26,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:42:26,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:42:27,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:42:29,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:42:30,677 INFO [train.py:1046] (3/4) Epoch 40, batch 5250, loss[loss=0.1502, simple_loss=0.2434, pruned_loss=0.02852, over 24343.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2387, pruned_loss=0.03894, over 4714048.69 frames. ], batch size: 74, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:42:30,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:42:33,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:42:34,067 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.76 vs. limit=15.0 2023-10-03 21:42:35,531 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.07 vs. limit=22.5 2023-10-03 21:42:37,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:42:37,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:42:39,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:42:43,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:42:44,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:42:47,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:42:47,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:42:50,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 21:42:51,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:42:52,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:43:12,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1416360.0, ans=10.0 2023-10-03 21:43:18,734 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.994e+02 2.255e+02 2.624e+02 3.979e+02, threshold=4.511e+02, percent-clipped=0.0 2023-10-03 21:43:38,757 INFO [train.py:1046] (3/4) Epoch 40, batch 5300, loss[loss=0.1571, simple_loss=0.25, pruned_loss=0.03209, over 24671.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.237, pruned_loss=0.03839, over 4717077.35 frames. ], batch size: 68, lr: 2.53e-03, grad_scale: 16.0 2023-10-03 21:43:52,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:43:52,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 21:43:52,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 21:43:52,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:43:52,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:52,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:53,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:53,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:43:53,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:43:53,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:43:53,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 21:43:53,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:43:53,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 21:43:53,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 21:43:53,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 21:43:53,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 21:43:54,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 21:43:54,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 21:43:54,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:54,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:43:54,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:43:54,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:43:54,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:43:54,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:43:54,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:43:55,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:55,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:43:55,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:43:55,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:43:55,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:55,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:43:56,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 21:43:56,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:43:56,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:43:56,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 21:43:56,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 21:43:56,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:43:56,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:43:56,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 21:43:56,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 21:43:56,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:43:57,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:43:57,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:43:57,472 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 21:43:57,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 21:43:57,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:43:57,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:43:58,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 21:43:58,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 21:43:58,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 21:43:58,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:44:05,022 INFO [train.py:1046] (3/4) Epoch 41, batch 0, loss[loss=0.174, simple_loss=0.2619, pruned_loss=0.04305, over 24368.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2619, pruned_loss=0.04305, over 24368.00 frames. ], batch size: 77, lr: 2.50e-03, grad_scale: 32.0 2023-10-03 21:44:05,022 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 21:44:17,637 INFO [train.py:1078] (3/4) Epoch 41, validation: loss=0.3341, simple_loss=0.2655, pruned_loss=0.2013, over 1125622.00 frames. 2023-10-03 21:44:17,637 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 21:44:18,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1416573.3333333333, ans=0.0 2023-10-03 21:44:19,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 21:44:22,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:44:23,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:44:28,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:44:28,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:44:29,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:29,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 21:44:29,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1416573.3333333333, ans=0.125 2023-10-03 21:44:32,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 21:44:35,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:35,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:35,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1416640.0, ans=0.1 2023-10-03 21:44:38,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:44:38,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:44:39,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:44:39,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:44:41,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 21:44:44,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:44:50,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:44:50,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:44:52,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 21:44:56,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:44:56,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:44:58,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:02,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:45:04,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1416773.3333333333, ans=0.2 2023-10-03 21:45:05,055 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.72 vs. limit=15.0 2023-10-03 21:45:05,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:12,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 21:45:15,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 21:45:15,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:45:15,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:45:17,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:45:18,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:45:18,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 21:45:20,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:45:21,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:45:24,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:45:27,647 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 21:45:29,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:45:31,718 INFO [train.py:1046] (3/4) Epoch 41, batch 50, loss[loss=0.1425, simple_loss=0.2258, pruned_loss=0.02965, over 24610.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2353, pruned_loss=0.03755, over 1059097.10 frames. ], batch size: 60, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 21:45:33,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:45:36,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:45:36,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 21:45:36,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 21:45:37,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:45:37,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:45:40,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:45:42,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:45:42,777 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.89 vs. limit=22.5 2023-10-03 21:45:45,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 21:45:45,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:50,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:45:50,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 21:45:53,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 21:45:54,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1416973.3333333333, ans=0.125 2023-10-03 21:45:55,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:45:55,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:45:55,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:45:57,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:45:58,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:45:59,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 21:45:59,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:46:05,816 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.939e+02 2.175e+02 2.475e+02 4.066e+02, threshold=4.350e+02, percent-clipped=0.0 2023-10-03 21:46:07,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:46:08,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:46:08,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:46:08,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 21:46:10,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:46:12,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:46:12,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1417040.0, ans=0.1 2023-10-03 21:46:13,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 21:46:13,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:46:14,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.08 vs. limit=15.0 2023-10-03 21:46:16,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 21:46:23,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:46:23,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:46:24,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:46:25,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:46:25,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:46:27,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 21:46:27,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 21:46:30,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:46:30,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:46:33,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:46:33,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:46:34,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 21:46:35,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 21:46:36,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 21:46:38,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:46:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:46:38,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1417173.3333333333, ans=0.0 2023-10-03 21:46:39,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 21:46:39,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 21:46:41,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:46:41,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:46:44,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:46:44,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:46:45,744 INFO [train.py:1046] (3/4) Epoch 41, batch 100, loss[loss=0.1536, simple_loss=0.2394, pruned_loss=0.03383, over 24678.00 frames. ], tot_loss[loss=0.1591, simple_loss=0.24, pruned_loss=0.0391, over 1872690.89 frames. ], batch size: 65, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:46:45,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:46:48,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:46:51,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:46:51,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 21:46:51,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:46:55,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:46:55,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:46:55,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:46:55,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:46:55,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:46:56,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 21:47:00,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:47:00,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:47:00,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:00,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:47:04,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 21:47:06,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:47:06,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:07,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:47:09,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:47:12,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1417306.6666666667, ans=0.0 2023-10-03 21:47:14,086 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 21:47:15,321 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 21:47:15,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:47:15,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:47:19,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:47:19,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1417373.3333333333, ans=0.0 2023-10-03 21:47:21,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:47:22,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:26,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1417373.3333333333, ans=0.125 2023-10-03 21:47:27,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:29,179 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 21:47:30,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 21:47:34,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:47:35,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:47:37,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:40,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:47:41,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:47:44,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:47:45,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:47,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:48,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:47:48,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:47:49,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:47:49,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 21:47:49,754 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 21:47:51,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:47:51,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:47:52,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:47:52,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:47:52,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 21:47:53,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:47:53,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:47:53,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:47:54,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1417506.6666666667, ans=0.125 2023-10-03 21:47:55,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:47:56,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:47:56,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:47:57,821 INFO [train.py:1046] (3/4) Epoch 41, batch 150, loss[loss=0.1468, simple_loss=0.2342, pruned_loss=0.02972, over 24586.00 frames. ], tot_loss[loss=0.1588, simple_loss=0.2403, pruned_loss=0.03872, over 2515197.84 frames. ], batch size: 71, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:47:57,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:48:01,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:02,772 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.61 vs. limit=15.0 2023-10-03 21:48:05,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:48:05,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:48:05,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:09,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:48:09,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:12,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:48:12,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:16,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 21:48:16,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 21:48:16,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 21:48:20,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:48:20,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:48:21,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:48:22,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:48:22,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:48:22,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:22,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:48:24,221 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 21:48:24,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1417640.0, ans=0.125 2023-10-03 21:48:25,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:48:29,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:48:31,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1417706.6666666667, ans=0.07 2023-10-03 21:48:31,965 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.93 vs. limit=15.0 2023-10-03 21:48:32,371 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.887e+02 2.079e+02 2.290e+02 3.335e+02, threshold=4.157e+02, percent-clipped=0.0 2023-10-03 21:48:34,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 21:48:35,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-03 21:48:37,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:48:37,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:48:39,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:48:41,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:48:41,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1417773.3333333333, ans=0.1 2023-10-03 21:48:42,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:48:44,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:48:44,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1417773.3333333333, ans=0.0 2023-10-03 21:48:44,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1417773.3333333333, ans=0.2 2023-10-03 21:48:45,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:47,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-03 21:48:51,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:51,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:48:52,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:48:52,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:48:55,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:48:56,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 21:48:59,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-03 21:49:00,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:49:01,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:49:03,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:49:03,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-03 21:49:03,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:49:03,663 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-03 21:49:07,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:49:11,706 INFO [train.py:1046] (3/4) Epoch 41, batch 200, loss[loss=0.155, simple_loss=0.2398, pruned_loss=0.0351, over 24293.00 frames. ], tot_loss[loss=0.1582, simple_loss=0.2395, pruned_loss=0.03848, over 3015172.26 frames. ], batch size: 74, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:49:11,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:49:13,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:49:15,776 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.08 vs. limit=15.0 2023-10-03 21:49:16,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-03 21:49:17,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:49:17,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:49:19,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-03 21:49:21,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-03 21:49:22,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:49:23,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:49:28,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:49:28,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:49:28,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:49:31,429 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.54 vs. limit=6.0 2023-10-03 21:49:46,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1418040.0, ans=0.125 2023-10-03 21:49:47,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:49:47,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:49:48,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 21:49:48,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:49:48,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1418040.0, ans=0.125 2023-10-03 21:49:50,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 21:49:50,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:49:52,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:49:52,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:49:54,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:49:54,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:49:57,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-03 21:49:57,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 21:49:57,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:50:02,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:50:07,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:50:14,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:16,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:50:18,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1418173.3333333333, ans=0.0 2023-10-03 21:50:22,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:23,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-03 21:50:25,603 INFO [train.py:1046] (3/4) Epoch 41, batch 250, loss[loss=0.1553, simple_loss=0.248, pruned_loss=0.03128, over 24293.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.238, pruned_loss=0.03862, over 3387949.99 frames. ], batch size: 74, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:50:25,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:50:25,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:50:25,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:50:25,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 21:50:28,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-03 21:50:28,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:50:29,705 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-03 21:50:31,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:31,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:50:32,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:33,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:50:35,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:50:35,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:50:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:50:38,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:50:47,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:50:49,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:50:50,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:50:58,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-03 21:50:58,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-03 21:50:59,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:50:59,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:51:00,896 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.902e+02 2.095e+02 2.391e+02 3.475e+02, threshold=4.190e+02, percent-clipped=0.0 2023-10-03 21:51:01,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:51:01,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:51:02,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:51:05,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:51:08,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-03 21:51:08,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:51:09,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:51:11,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-03 21:51:11,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:51:12,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:51:12,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:51:12,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:51:14,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1418440.0, ans=0.0 2023-10-03 21:51:15,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:51:15,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 21:51:16,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:51:21,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-03 21:51:24,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:51:26,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:51:29,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:51:30,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:51:35,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-03 21:51:37,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:51:37,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 21:51:38,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-03 21:51:39,832 INFO [train.py:1046] (3/4) Epoch 41, batch 300, loss[loss=0.1519, simple_loss=0.2326, pruned_loss=0.03559, over 24525.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2357, pruned_loss=0.03873, over 3661432.05 frames. ], batch size: 63, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:51:39,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-03 21:51:41,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:51:41,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-03 21:51:45,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:51:47,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:51:51,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:51:53,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-03 21:51:53,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:51:56,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 21:51:56,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-03 21:51:56,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:52:00,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-03 21:52:05,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:52:05,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-03 21:52:08,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-03 21:52:08,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:10,896 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.50 vs. limit=22.5 2023-10-03 21:52:11,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:52:13,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:13,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-03 21:52:13,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 21:52:15,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:52:19,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:52:19,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:52:22,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-03 21:52:22,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-03 21:52:23,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:52:25,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:27,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-03 21:52:28,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:52:31,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:52:31,765 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.01 vs. limit=15.0 2023-10-03 21:52:36,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:52:36,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-03 21:52:39,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1418840.0, ans=0.125 2023-10-03 21:52:40,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:40,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 21:52:42,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:44,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:52:44,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-03 21:52:45,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 21:52:46,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:52:46,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-03 21:52:49,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:52:49,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:52:50,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:52:50,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:52:51,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:52:51,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1418840.0, ans=0.0 2023-10-03 21:52:52,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1418840.0, ans=0.2 2023-10-03 21:52:55,532 INFO [train.py:1046] (3/4) Epoch 41, batch 350, loss[loss=0.1591, simple_loss=0.2205, pruned_loss=0.04879, over 22868.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2346, pruned_loss=0.03862, over 3891610.45 frames. ], batch size: 322, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:52:55,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:52:55,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 21:52:58,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:02,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:53:06,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:06,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:08,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-03 21:53:10,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:53:11,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-03 21:53:12,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:12,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-03 21:53:14,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:53:17,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-03 21:53:17,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:53:20,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:53:20,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:53:21,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:53:21,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:53:21,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:53:21,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:23,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:53:23,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1419040.0, ans=0.2 2023-10-03 21:53:24,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:53:24,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:31,194 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.922e+02 2.071e+02 2.368e+02 3.597e+02, threshold=4.142e+02, percent-clipped=0.0 2023-10-03 21:53:32,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:53:32,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-03 21:53:32,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1419040.0, ans=0.125 2023-10-03 21:53:34,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:53:34,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:38,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-03 21:53:38,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:53:38,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1419106.6666666667, ans=0.125 2023-10-03 21:53:39,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1419106.6666666667, ans=0.125 2023-10-03 21:53:44,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:53:44,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:53:44,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:53:45,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-03 21:53:47,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:53:48,517 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-03 21:53:50,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-03 21:53:50,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:53:50,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1419106.6666666667, ans=0.2 2023-10-03 21:53:53,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 21:53:53,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-03 21:53:56,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:53:59,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 21:54:00,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:54:02,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:54:02,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:54:04,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:54:07,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:54:09,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-03 21:54:10,463 INFO [train.py:1046] (3/4) Epoch 41, batch 400, loss[loss=0.168, simple_loss=0.239, pruned_loss=0.0485, over 23785.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2337, pruned_loss=0.03858, over 4051397.63 frames. ], batch size: 164, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 21:54:10,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-03 21:54:10,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:54:10,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:54:12,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:54:12,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1419240.0, ans=0.0 2023-10-03 21:54:13,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:14,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:54:16,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:17,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-03 21:54:20,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-03 21:54:20,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:54:21,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-03 21:54:22,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:27,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:54:27,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:54:27,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-03 21:54:28,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:54:28,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:54:28,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:54:28,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:54:31,404 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-03 21:54:31,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-03 21:54:31,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1419306.6666666667, ans=0.05 2023-10-03 21:54:37,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:54:37,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten.whitening_limit, batch_count=1419306.6666666667, ans=15.0 2023-10-03 21:54:38,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:54:38,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-03 21:54:40,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-03 21:54:43,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:54:44,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:54:50,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-03 21:54:54,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-03 21:54:55,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-03 21:54:59,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:55:00,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:55:00,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-03 21:55:03,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:55:03,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1419440.0, ans=0.125 2023-10-03 21:55:06,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 21:55:06,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:55:10,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:55:11,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-03 21:55:13,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-03 21:55:15,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-03 21:55:16,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 21:55:16,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:55:18,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-03 21:55:21,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:55:22,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:55:22,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-03 21:55:22,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-03 21:55:24,092 INFO [train.py:1046] (3/4) Epoch 41, batch 450, loss[loss=0.1487, simple_loss=0.2263, pruned_loss=0.03555, over 23863.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2352, pruned_loss=0.03844, over 4210562.64 frames. ], batch size: 195, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:55:24,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:55:25,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:55:26,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:55:26,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-03 21:55:27,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-03 21:55:28,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 21:55:31,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 21:55:42,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:55:42,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1419640.0, ans=0.1 2023-10-03 21:55:43,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:55:44,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-03 21:55:46,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-03 21:55:48,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:55:50,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1419640.0, ans=0.0 2023-10-03 21:55:51,225 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.83 vs. limit=15.0 2023-10-03 21:55:52,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:55:53,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:55:56,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:55:58,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:55:59,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-03 21:56:00,964 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.852e+02 2.077e+02 2.381e+02 3.655e+02, threshold=4.153e+02, percent-clipped=0.0 2023-10-03 21:56:01,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-03 21:56:04,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-03 21:56:04,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:56:05,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:56:05,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 21:56:07,753 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-03 21:56:07,761 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-03 21:56:07,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:56:09,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:56:10,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-03 21:56:13,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-03 21:56:13,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-03 21:56:14,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-03 21:56:14,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-03 21:56:16,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:56:18,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-03 21:56:18,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 21:56:20,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-03 21:56:23,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-03 21:56:25,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-03 21:56:25,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-03 21:56:27,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 21:56:32,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:56:32,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:56:35,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 21:56:35,580 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-03 21:56:38,159 INFO [train.py:1046] (3/4) Epoch 41, batch 500, loss[loss=0.1743, simple_loss=0.2634, pruned_loss=0.0426, over 24708.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2369, pruned_loss=0.03918, over 4320441.22 frames. ], batch size: 73, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:56:39,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:56:39,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 21:56:39,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1419906.6666666667, ans=0.1 2023-10-03 21:56:41,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:56:41,090 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-03 21:56:42,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-03 21:56:42,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:56:45,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 21:56:45,873 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=7.17 vs. limit=12.0 2023-10-03 21:56:49,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 21:56:51,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-03 21:56:53,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:56:53,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:56:55,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:03,299 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.61 vs. limit=6.0 2023-10-03 21:57:05,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:05,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-03 21:57:05,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-03 21:57:07,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:07,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-03 21:57:07,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 21:57:10,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:57:11,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-03 21:57:11,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-03 21:57:11,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:11,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-03 21:57:13,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1420040.0, ans=0.125 2023-10-03 21:57:15,507 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-03 21:57:16,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:57:17,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1420040.0, ans=0.2 2023-10-03 21:57:19,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:19,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:19,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:20,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-03 21:57:22,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-03 21:57:24,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 21:57:24,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1420106.6666666667, ans=0.0 2023-10-03 21:57:26,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:57:30,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:57:32,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1420106.6666666667, ans=0.125 2023-10-03 21:57:33,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:57:39,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:57:42,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-03 21:57:42,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:57:42,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:57:43,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-03 21:57:45,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-03 21:57:45,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1420173.3333333333, ans=0.125 2023-10-03 21:57:45,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1420173.3333333333, ans=0.0 2023-10-03 21:57:47,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:57:50,627 INFO [train.py:1046] (3/4) Epoch 41, batch 550, loss[loss=0.1468, simple_loss=0.2261, pruned_loss=0.03373, over 24354.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2379, pruned_loss=0.03915, over 4413682.03 frames. ], batch size: 61, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:57:52,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-03 21:57:55,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-03 21:57:55,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:57:55,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-03 21:57:55,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 21:57:55,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:57:57,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:57,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:57:57,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-03 21:57:58,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 21:58:01,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:58:03,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-03 21:58:04,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-03 21:58:04,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1420306.6666666667, ans=0.0 2023-10-03 21:58:07,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:07,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:58:12,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:58:12,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:58:16,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-03 21:58:16,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-03 21:58:19,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-03 21:58:23,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:58:25,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:58:26,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-03 21:58:28,303 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.959e+02 2.274e+02 2.623e+02 4.132e+02, threshold=4.548e+02, percent-clipped=0.0 2023-10-03 21:58:31,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:31,094 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-03 21:58:31,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:58:32,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 21:58:35,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 21:58:36,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 21:58:36,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-03 21:58:37,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:39,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-03 21:58:41,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-03 21:58:41,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:58:42,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 21:58:42,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:58:42,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:58:45,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-03 21:58:46,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-03 21:58:48,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:58:48,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:49,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1420506.6666666667, ans=0.07 2023-10-03 21:58:50,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 21:58:50,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 21:58:51,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:58:51,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1420506.6666666667, ans=0.0 2023-10-03 21:58:53,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-03 21:58:54,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:58:56,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-03 21:58:56,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-03 21:59:02,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-03 21:59:05,222 INFO [train.py:1046] (3/4) Epoch 41, batch 600, loss[loss=0.1766, simple_loss=0.2594, pruned_loss=0.04687, over 24427.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.238, pruned_loss=0.03885, over 4482909.76 frames. ], batch size: 77, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 21:59:06,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-03 21:59:08,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-03 21:59:08,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 21:59:08,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:59:16,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-03 21:59:16,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 21:59:16,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1420573.3333333333, ans=0.0 2023-10-03 21:59:17,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-03 21:59:20,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-03 21:59:21,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:59:24,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:59:27,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-03 21:59:27,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 21:59:33,137 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.86 vs. limit=15.0 2023-10-03 21:59:33,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-03 21:59:36,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-03 21:59:36,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 21:59:36,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 21:59:42,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-03 21:59:42,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 21:59:44,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:59:50,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 21:59:51,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1420773.3333333333, ans=0.1 2023-10-03 21:59:51,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1420773.3333333333, ans=0.125 2023-10-03 21:59:54,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 21:59:54,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-03 21:59:54,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:00:03,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-03 22:00:06,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:00:06,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:00:09,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1420840.0, ans=0.125 2023-10-03 22:00:10,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-03 22:00:12,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:00:13,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1420840.0, ans=0.125 2023-10-03 22:00:15,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-03 22:00:17,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:00:17,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:00:20,153 INFO [train.py:1046] (3/4) Epoch 41, batch 650, loss[loss=0.1645, simple_loss=0.2579, pruned_loss=0.03554, over 24569.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2368, pruned_loss=0.03853, over 4525976.70 frames. ], batch size: 71, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:00:20,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1420906.6666666667, ans=0.125 2023-10-03 22:00:21,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 22:00:21,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:00:24,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:00:25,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:00:27,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:00:30,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-03 22:00:32,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:00:36,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:00:36,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:00:40,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:00:43,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-03 22:00:44,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:00:46,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:00:49,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:00:49,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 22:00:52,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:00:52,974 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.88 vs. limit=22.5 2023-10-03 22:00:53,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:00:53,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:00:54,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:00:56,662 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.942e+02 2.236e+02 2.477e+02 3.530e+02, threshold=4.472e+02, percent-clipped=0.0 2023-10-03 22:00:56,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:00:58,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 22:00:58,311 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-03 22:00:58,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:00:58,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:00:58,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1421040.0, ans=0.125 2023-10-03 22:01:02,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:02,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:01:02,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:01:03,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:01:05,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-03 22:01:05,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:01:05,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:01:06,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:01:06,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:01:09,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:01:11,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-03 22:01:11,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-03 22:01:13,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:13,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:01:13,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:01:13,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:01:16,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:01:21,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:21,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:01:22,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:01:25,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:01:25,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:01:27,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:01:33,468 INFO [train.py:1046] (3/4) Epoch 41, batch 700, loss[loss=0.1328, simple_loss=0.1838, pruned_loss=0.04094, over 19098.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2352, pruned_loss=0.03835, over 4562167.48 frames. ], batch size: 388, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:01:33,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:01:33,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:01:33,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:01:34,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:01:37,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-03 22:01:37,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-03 22:01:41,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-03 22:01:41,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:44,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:01:46,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-03 22:01:51,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:01:52,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:01:54,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:01:54,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:01:54,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:01:59,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:02:01,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 22:02:01,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:02:03,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1421373.3333333333, ans=0.0 2023-10-03 22:02:04,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-03 22:02:06,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-03 22:02:08,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:02:08,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:02:10,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:02:13,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1421373.3333333333, ans=0.09899494936611666 2023-10-03 22:02:14,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:02:15,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-03 22:02:20,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:02:20,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:02:20,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-03 22:02:25,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:02:26,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:02:29,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:02:31,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1421506.6666666667, ans=0.0 2023-10-03 22:02:34,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:02:34,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-03 22:02:36,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-03 22:02:36,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-03 22:02:39,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:02:41,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:02:42,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:02:45,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:02:45,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-03 22:02:47,444 INFO [train.py:1046] (3/4) Epoch 41, batch 750, loss[loss=0.1571, simple_loss=0.2389, pruned_loss=0.03764, over 18595.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2347, pruned_loss=0.03807, over 4596330.03 frames. ], batch size: 40, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:02:48,159 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.05 vs. limit=22.5 2023-10-03 22:02:48,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-03 22:02:48,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-03 22:02:50,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-03 22:02:50,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-03 22:02:50,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-03 22:02:50,859 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.31 vs. limit=15.0 2023-10-03 22:02:52,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:02:53,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-03 22:02:54,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:02:56,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:02:56,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:02:57,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:02:57,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:02:58,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:03:00,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:03:02,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:03:06,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:03:07,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:03:07,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:03:08,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-03 22:03:10,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:03:10,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:03:11,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:03:13,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:03:15,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-03 22:03:15,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:03:16,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-03 22:03:16,498 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-03 22:03:17,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-03 22:03:17,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:03:17,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:03:21,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:03:24,238 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.879e+02 2.063e+02 2.265e+02 2.874e+02, threshold=4.127e+02, percent-clipped=0.0 2023-10-03 22:03:28,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:03:28,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:03:28,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:03:31,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:03:31,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:03:31,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-03 22:03:33,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:03:35,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-03 22:03:37,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:03:38,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:03:39,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-03 22:03:40,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:03:44,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:03:45,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:03:45,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:03:48,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:03:50,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-03 22:03:52,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:03:52,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:03:54,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:03:54,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1421840.0, ans=0.125 2023-10-03 22:03:55,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:03:58,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:03:58,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:04:00,894 INFO [train.py:1046] (3/4) Epoch 41, batch 800, loss[loss=0.1587, simple_loss=0.2389, pruned_loss=0.03923, over 23377.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2359, pruned_loss=0.0382, over 4626984.73 frames. ], batch size: 134, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:04:06,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:04:06,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:09,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:04:09,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:04:09,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1421906.6666666667, ans=0.125 2023-10-03 22:04:09,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1421906.6666666667, ans=0.0 2023-10-03 22:04:11,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:12,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:12,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:17,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:04:17,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1421973.3333333333, ans=0.125 2023-10-03 22:04:19,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:04:22,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-03 22:04:22,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:22,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:04:24,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:04:24,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:04:24,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-03 22:04:24,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:04:25,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-03 22:04:28,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:30,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:04:32,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:04:32,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:04:35,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:35,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:38,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:04:38,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:04:38,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-03 22:04:40,346 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-03 22:04:40,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-03 22:04:40,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:04:40,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:04:41,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:04:41,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:04:42,238 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.36 vs. limit=22.5 2023-10-03 22:04:46,312 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-03 22:04:47,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-03 22:04:48,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:04:50,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:04:55,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:04:58,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:04:58,665 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:04:59,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-03 22:04:59,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:05:01,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1422173.3333333333, ans=0.0 2023-10-03 22:05:02,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-03 22:05:05,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:05:08,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:05:09,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-03 22:05:10,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:05:11,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:05:12,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-03 22:05:12,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:05:14,105 INFO [train.py:1046] (3/4) Epoch 41, batch 850, loss[loss=0.1543, simple_loss=0.2296, pruned_loss=0.03949, over 23724.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2367, pruned_loss=0.03826, over 4648640.82 frames. ], batch size: 164, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:05:14,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:05:16,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:17,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:05:20,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:05:22,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-03 22:05:22,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-03 22:05:22,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-03 22:05:24,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:05:25,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:05:26,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:27,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:05:27,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:05:31,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:05:31,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:05:32,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-03 22:05:37,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-03 22:05:40,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:05:40,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-03 22:05:43,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-03 22:05:44,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-03 22:05:47,772 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-03 22:05:47,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:05:47,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:05:48,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 22:05:51,485 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.958e+02 2.183e+02 2.406e+02 3.404e+02, threshold=4.367e+02, percent-clipped=0.0 2023-10-03 22:05:51,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:52,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:05:52,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-03 22:05:56,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:05:56,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:05:58,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:05:58,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:05:59,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1422440.0, ans=0.125 2023-10-03 22:06:00,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:06:02,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:06:02,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-03 22:06:02,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1422440.0, ans=0.1 2023-10-03 22:06:06,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:06:06,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:06:08,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:06:08,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:06:09,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:06:12,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:06:15,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:06:16,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:06:16,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:06:17,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:06:24,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:06:25,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:06:25,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-03 22:06:27,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:06:27,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:06:29,455 INFO [train.py:1046] (3/4) Epoch 41, batch 900, loss[loss=0.1445, simple_loss=0.2245, pruned_loss=0.03228, over 24409.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2372, pruned_loss=0.03854, over 4664212.00 frames. ], batch size: 58, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:06:30,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-03 22:06:36,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:06:36,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1422573.3333333333, ans=0.0 2023-10-03 22:06:38,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1422573.3333333333, ans=0.125 2023-10-03 22:06:40,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:06:40,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-03 22:06:40,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1422573.3333333333, ans=0.125 2023-10-03 22:06:45,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:06:45,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-03 22:06:46,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-03 22:06:48,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:06:48,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:06:49,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:06:49,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:06:51,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=1422640.0, ans=0.95 2023-10-03 22:06:51,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1422640.0, ans=0.1 2023-10-03 22:06:55,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1422640.0, ans=0.0 2023-10-03 22:06:58,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:06:58,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:06:58,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:07:01,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:07:06,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-03 22:07:06,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:07:11,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:07:12,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:07:12,362 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-03 22:07:14,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-03 22:07:21,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:07:21,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:07:21,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:07:27,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:07:27,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:07:29,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-03 22:07:29,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:07:30,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-03 22:07:32,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:07:33,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:07:34,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:07:34,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:07:39,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-03 22:07:39,681 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-03 22:07:41,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-03 22:07:41,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-03 22:07:43,769 INFO [train.py:1046] (3/4) Epoch 41, batch 950, loss[loss=0.1598, simple_loss=0.2495, pruned_loss=0.03505, over 24366.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2374, pruned_loss=0.03839, over 4680674.27 frames. ], batch size: 77, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:07:45,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:07:47,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1422906.6666666667, ans=0.125 2023-10-03 22:07:48,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-03 22:07:53,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:07:54,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1422906.6666666667, ans=0.125 2023-10-03 22:07:56,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:07:57,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:07:57,524 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:07:58,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 22:08:00,093 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-03 22:08:03,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:08:03,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:08:05,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:08:05,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:08:05,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-03 22:08:06,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-03 22:08:08,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:09,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-03 22:08:09,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:08:12,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1423040.0, ans=0.0 2023-10-03 22:08:13,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:13,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:08:13,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:08:15,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-03 22:08:16,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 22:08:21,136 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.951e+02 2.151e+02 2.455e+02 3.832e+02, threshold=4.302e+02, percent-clipped=0.0 2023-10-03 22:08:21,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:08:21,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:08:26,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:08:26,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:08:30,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-03 22:08:30,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 22:08:30,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:08:32,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:08:32,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:32,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:08:35,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1423106.6666666667, ans=0.125 2023-10-03 22:08:36,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-03 22:08:38,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:08:39,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:08:40,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:40,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-03 22:08:40,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:08:40,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:08:41,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-03 22:08:45,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:08:47,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:08:52,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:08:54,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-03 22:08:54,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-03 22:08:57,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:08:58,940 INFO [train.py:1046] (3/4) Epoch 41, batch 1000, loss[loss=0.1675, simple_loss=0.2491, pruned_loss=0.04297, over 24467.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2366, pruned_loss=0.03818, over 4688708.10 frames. ], batch size: 69, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:09:02,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-03 22:09:02,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:08,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:09:09,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-03 22:09:09,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-03 22:09:13,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:09:13,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:09:15,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:09:17,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-03 22:09:21,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-03 22:09:22,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-03 22:09:23,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:09:27,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-03 22:09:27,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-03 22:09:27,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-03 22:09:29,033 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.87 vs. limit=15.0 2023-10-03 22:09:29,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:09:29,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:39,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:09:39,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:09:39,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:40,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:09:40,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-03 22:09:40,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:09:41,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:09:42,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:09:43,392 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-03 22:09:46,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-03 22:09:46,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-03 22:09:48,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-03 22:09:50,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:09:56,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:56,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:09:58,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:09:59,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:09:59,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1423506.6666666667, ans=0.1 2023-10-03 22:10:01,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-03 22:10:02,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:10:04,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-03 22:10:04,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-03 22:10:06,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:10:06,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:10:08,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:10:10,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:10:10,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:10:13,257 INFO [train.py:1046] (3/4) Epoch 41, batch 1050, loss[loss=0.1552, simple_loss=0.2362, pruned_loss=0.03705, over 23250.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2351, pruned_loss=0.03769, over 4700893.00 frames. ], batch size: 105, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:10:14,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:10:14,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:10:17,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 22:10:17,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:10:20,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:10:23,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:10:25,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:10:26,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:10:26,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:10:28,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:10:29,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:10:29,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-03 22:10:30,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:10:30,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-03 22:10:31,599 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.08 vs. limit=22.5 2023-10-03 22:10:34,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:10:34,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-03 22:10:34,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:10:41,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:10:42,547 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.27 vs. limit=15.0 2023-10-03 22:10:43,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:10:43,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:10:44,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1423706.6666666667, ans=0.125 2023-10-03 22:10:45,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-03 22:10:45,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-03 22:10:45,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:10:47,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1423706.6666666667, ans=0.0 2023-10-03 22:10:48,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-03 22:10:49,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1423706.6666666667, ans=0.125 2023-10-03 22:10:49,628 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.02 vs. limit=12.0 2023-10-03 22:10:49,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1423706.6666666667, ans=15.0 2023-10-03 22:10:50,170 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.894e+02 2.056e+02 2.296e+02 3.481e+02, threshold=4.112e+02, percent-clipped=0.0 2023-10-03 22:10:50,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-03 22:10:51,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:10:55,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 22:10:57,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:10:58,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:10:58,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:11:02,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:11:05,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-03 22:11:06,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-03 22:11:08,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-03 22:11:08,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:11:08,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:11:10,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-03 22:11:12,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:11:15,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:11:16,122 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.03 vs. limit=6.0 2023-10-03 22:11:16,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:11:16,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:11:16,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:11:17,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1423840.0, ans=0.0 2023-10-03 22:11:19,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:11:19,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-03 22:11:22,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:11:22,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-03 22:11:22,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-03 22:11:23,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:11:27,649 INFO [train.py:1046] (3/4) Epoch 41, batch 1100, loss[loss=0.1662, simple_loss=0.2387, pruned_loss=0.04686, over 23464.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2345, pruned_loss=0.03782, over 4702088.51 frames. ], batch size: 285, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:11:29,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:11:31,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:11:35,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:11:35,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:11:35,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:11:37,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-03 22:11:38,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:11:41,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-03 22:11:42,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:11:44,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1423973.3333333333, ans=0.125 2023-10-03 22:11:46,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:11:46,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-03 22:11:46,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:11:48,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:11:48,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:11:51,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:11:52,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:11:56,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:11:59,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-03 22:12:00,442 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-03 22:12:00,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1424040.0, ans=0.125 2023-10-03 22:12:01,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:02,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1424040.0, ans=0.04949747468305833 2023-10-03 22:12:03,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:05,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:12:05,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:12:07,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-03 22:12:08,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:12:08,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:12:08,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:12:08,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:08,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-03 22:12:15,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:12:15,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-03 22:12:16,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:12:22,032 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-10-03 22:12:22,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:12:24,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-03 22:12:24,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-03 22:12:27,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:12:30,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:12:30,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:12:30,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-03 22:12:31,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:12:31,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:12:33,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-03 22:12:35,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:12:35,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-03 22:12:37,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:12:37,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:12:38,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:12:42,379 INFO [train.py:1046] (3/4) Epoch 41, batch 1150, loss[loss=0.1426, simple_loss=0.2229, pruned_loss=0.03113, over 24568.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2353, pruned_loss=0.03813, over 4694198.59 frames. ], batch size: 60, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:12:43,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:12:46,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:12:48,498 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.01 vs. limit=15.0 2023-10-03 22:12:49,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:12:49,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:12:49,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-03 22:12:50,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:12:54,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-03 22:12:54,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:12:55,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:13:01,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-03 22:13:02,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:13:07,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:13:07,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:13:08,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-03 22:13:08,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:13:08,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:13:11,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-03 22:13:13,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:13:14,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:13:18,761 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.950e+02 2.098e+02 2.374e+02 3.607e+02, threshold=4.196e+02, percent-clipped=0.0 2023-10-03 22:13:21,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:13:27,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:13:27,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-03 22:13:27,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:13:27,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:13:35,383 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-03 22:13:36,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:13:36,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1424440.0, ans=0.0 2023-10-03 22:13:44,095 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-03 22:13:46,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:13:48,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:13:48,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:13:49,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:13:51,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:13:56,001 INFO [train.py:1046] (3/4) Epoch 41, batch 1200, loss[loss=0.1573, simple_loss=0.2317, pruned_loss=0.04142, over 23428.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.236, pruned_loss=0.03832, over 4682797.21 frames. ], batch size: 285, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:13:57,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:13:57,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:13:57,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1424573.3333333333, ans=0.0 2023-10-03 22:13:58,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:13:58,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:14:00,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:14:00,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1424573.3333333333, ans=0.125 2023-10-03 22:14:01,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:14:04,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:14:06,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:14:07,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:14:09,683 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-03 22:14:11,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-03 22:14:17,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:14:18,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:14:20,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:14:22,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:14:22,911 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-03 22:14:24,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:14:31,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:14:31,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:14:31,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-03 22:14:33,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:14:36,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-03 22:14:40,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-03 22:14:40,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:14:40,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:14:44,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:14:44,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:14:47,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:14:47,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:14:48,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:14:48,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-03 22:14:48,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:14:48,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:14:48,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:14:52,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:14:52,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:14:56,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-03 22:14:58,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1424840.0, ans=0.125 2023-10-03 22:14:59,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:15:01,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-03 22:15:07,064 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-03 22:15:07,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1424840.0, ans=0.1 2023-10-03 22:15:08,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:15:09,874 INFO [train.py:1046] (3/4) Epoch 41, batch 1250, loss[loss=0.1579, simple_loss=0.2509, pruned_loss=0.03243, over 24656.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2374, pruned_loss=0.03888, over 4688669.93 frames. ], batch size: 73, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:15:09,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:15:11,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:15:13,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:15:14,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-03 22:15:19,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1424906.6666666667, ans=0.2 2023-10-03 22:15:20,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:15:20,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:15:22,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-03 22:15:22,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:15:23,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:15:28,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 22:15:29,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:15:29,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1424973.3333333333, ans=0.2 2023-10-03 22:15:30,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:15:30,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:15:33,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:15:35,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 22:15:36,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:15:36,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:15:36,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:15:36,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:15:39,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:15:40,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:15:47,170 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.967e+02 2.225e+02 2.437e+02 4.651e+02, threshold=4.451e+02, percent-clipped=1.0 2023-10-03 22:15:47,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-03 22:15:47,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:15:48,205 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.58 vs. limit=12.0 2023-10-03 22:15:50,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:15:50,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-03 22:15:51,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:15:51,632 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-03 22:15:51,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:15:51,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:15:56,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:15:59,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1425106.6666666667, ans=0.125 2023-10-03 22:16:00,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:16:00,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:16:01,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-03 22:16:01,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-03 22:16:01,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-03 22:16:04,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:16:05,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-03 22:16:05,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:16:08,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-03 22:16:08,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:16:11,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-03 22:16:11,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-03 22:16:12,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:16:12,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:16:14,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:16:16,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-03 22:16:18,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:16:19,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:16:21,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:16:23,734 INFO [train.py:1046] (3/4) Epoch 41, batch 1300, loss[loss=0.1514, simple_loss=0.2376, pruned_loss=0.03256, over 24336.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2379, pruned_loss=0.03891, over 4691015.02 frames. ], batch size: 61, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:16:23,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:16:25,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:16:26,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-03 22:16:29,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:16:32,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:16:32,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:16:34,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:16:35,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:16:35,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-03 22:16:41,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:16:41,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:16:42,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-03 22:16:47,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:16:49,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:16:52,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:16:52,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.46 vs. limit=22.5 2023-10-03 22:16:53,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:16:55,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:16:55,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:16:56,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-03 22:16:58,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-03 22:17:02,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:17:03,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:17:05,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-03 22:17:05,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 22:17:08,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:17:10,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:17:10,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-03 22:17:10,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:17:10,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-03 22:17:12,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:17:17,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1425440.0, ans=0.95 2023-10-03 22:17:18,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:17:18,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:17:20,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-03 22:17:21,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-03 22:17:23,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-03 22:17:26,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:17:28,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-03 22:17:31,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:17:32,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1425506.6666666667, ans=0.95 2023-10-03 22:17:37,917 INFO [train.py:1046] (3/4) Epoch 41, batch 1350, loss[loss=0.1594, simple_loss=0.2308, pruned_loss=0.04398, over 23916.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2364, pruned_loss=0.03838, over 4703792.09 frames. ], batch size: 179, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:17:38,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-03 22:17:40,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:17:42,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:17:42,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1425573.3333333333, ans=0.125 2023-10-03 22:17:46,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:17:46,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:17:48,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:17:48,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:17:50,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=1425573.3333333333, ans=15.0 2023-10-03 22:17:51,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:17:52,127 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.47 vs. limit=15.0 2023-10-03 22:17:54,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-03 22:17:55,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:17:55,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:17:58,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-03 22:17:58,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:17:58,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1425640.0, ans=0.0 2023-10-03 22:18:01,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:18:01,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-03 22:18:01,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-03 22:18:03,897 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.06 vs. limit=15.0 2023-10-03 22:18:04,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-03 22:18:05,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:18:05,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-03 22:18:06,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1425706.6666666667, ans=0.2 2023-10-03 22:18:06,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1425706.6666666667, ans=0.2 2023-10-03 22:18:10,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1425706.6666666667, ans=0.0 2023-10-03 22:18:16,172 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 1.965e+02 2.297e+02 2.654e+02 3.860e+02, threshold=4.593e+02, percent-clipped=0.0 2023-10-03 22:18:16,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:18:25,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1425773.3333333333, ans=0.0 2023-10-03 22:18:26,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:18:26,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:18:28,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-03 22:18:32,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:18:32,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-03 22:18:32,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:18:34,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:18:35,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:18:36,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-03 22:18:38,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:18:41,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-03 22:18:42,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-03 22:18:47,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-03 22:18:49,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:18:52,249 INFO [train.py:1046] (3/4) Epoch 41, batch 1400, loss[loss=0.1491, simple_loss=0.2312, pruned_loss=0.03351, over 24605.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.235, pruned_loss=0.03811, over 4700668.39 frames. ], batch size: 60, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:18:53,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:18:53,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:18:57,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-03 22:18:59,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-03 22:19:00,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1425906.6666666667, ans=0.0 2023-10-03 22:19:10,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:19:13,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:19:15,907 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.27 vs. limit=22.5 2023-10-03 22:19:16,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:19:16,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:19:21,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:19:22,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-03 22:19:23,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1426040.0, ans=0.125 2023-10-03 22:19:31,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:19:31,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:19:36,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-03 22:19:36,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:19:37,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:19:37,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:19:39,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:19:39,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:19:40,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:19:40,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:19:40,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1426106.6666666667, ans=0.1 2023-10-03 22:19:41,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-03 22:19:41,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:19:46,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:19:49,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:19:58,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-03 22:19:58,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 22:19:58,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:20:01,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 22:20:01,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:20:04,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:20:06,481 INFO [train.py:1046] (3/4) Epoch 41, batch 1450, loss[loss=0.1508, simple_loss=0.2387, pruned_loss=0.03147, over 24640.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2351, pruned_loss=0.03797, over 4706759.63 frames. ], batch size: 65, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:20:08,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:20:09,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:20:09,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:09,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-03 22:20:13,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:20:15,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:20:15,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:20:15,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-03 22:20:16,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:20:16,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-03 22:20:17,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:18,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:18,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-03 22:20:20,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:20:22,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:20:22,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 22:20:22,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:23,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:20:24,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:26,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1426306.6666666667, ans=10.0 2023-10-03 22:20:27,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:31,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:20:31,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:20:34,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:20:34,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:37,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:20:37,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:20:37,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:20:37,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:20:40,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1426373.3333333333, ans=0.0 2023-10-03 22:20:41,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-03 22:20:43,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:20:43,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1426373.3333333333, ans=0.0 2023-10-03 22:20:45,101 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.909e+02 2.047e+02 2.238e+02 3.342e+02, threshold=4.094e+02, percent-clipped=0.0 2023-10-03 22:20:47,184 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.11 vs. limit=15.0 2023-10-03 22:20:47,907 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-03 22:20:49,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:20:50,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:20:52,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:20:54,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-03 22:21:00,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:00,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-03 22:21:01,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-03 22:21:03,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:21:07,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:21:07,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:21:08,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-03 22:21:11,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-03 22:21:11,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-03 22:21:11,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:13,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:21:19,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1426573.3333333333, ans=0.2 2023-10-03 22:21:20,109 INFO [train.py:1046] (3/4) Epoch 41, batch 1500, loss[loss=0.1552, simple_loss=0.2282, pruned_loss=0.04106, over 23716.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2352, pruned_loss=0.03818, over 4708526.91 frames. ], batch size: 164, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:21:24,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-03 22:21:24,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:21:24,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:21:25,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:21:27,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:21:27,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:21:29,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-03 22:21:30,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:21:30,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:21:30,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:21:30,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:21:32,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:21:35,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:21:35,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1426640.0, ans=0.125 2023-10-03 22:21:38,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1426640.0, ans=0.125 2023-10-03 22:21:41,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:21:41,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-03 22:21:41,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:21:41,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:21:42,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:45,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-03 22:21:48,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-03 22:21:49,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:21:51,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-03 22:21:53,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:21:54,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1426706.6666666667, ans=0.2 2023-10-03 22:21:55,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:21:56,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:21:56,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:21:58,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-03 22:21:58,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:21:58,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:21:58,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-03 22:21:58,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:22:00,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1426706.6666666667, ans=0.125 2023-10-03 22:22:06,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:22:06,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-03 22:22:10,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:22:12,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:22:15,020 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-03 22:22:15,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:15,063 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-03 22:22:17,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:22:18,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:22:19,766 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-03 22:22:21,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:22:21,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1426840.0, ans=0.95 2023-10-03 22:22:24,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-03 22:22:25,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:28,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:22:28,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:28,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:22:30,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:22:31,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:22:32,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-03 22:22:32,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-03 22:22:34,152 INFO [train.py:1046] (3/4) Epoch 41, batch 1550, loss[loss=0.1592, simple_loss=0.2341, pruned_loss=0.04212, over 23591.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2365, pruned_loss=0.03869, over 4714968.38 frames. ], batch size: 256, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:22:34,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:22:34,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-03 22:22:34,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-03 22:22:37,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:22:38,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:22:39,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1426906.6666666667, ans=0.5 2023-10-03 22:22:40,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:22:40,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:22:40,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:22:41,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:22:42,484 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.24 vs. limit=15.0 2023-10-03 22:22:44,538 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-03 22:22:44,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:22:45,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:22:45,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:22:47,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:22:47,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-03 22:22:50,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:22:50,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-03 22:22:51,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-03 22:22:51,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-03 22:22:51,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:22:53,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:22:54,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1426973.3333333333, ans=0.1 2023-10-03 22:22:56,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1426973.3333333333, ans=0.125 2023-10-03 22:22:57,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:22:59,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1426973.3333333333, ans=0.125 2023-10-03 22:23:00,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-03 22:23:00,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-03 22:23:08,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:23:10,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:23:11,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:23:11,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:23:11,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-03 22:23:13,021 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.987e+02 2.151e+02 2.393e+02 4.271e+02, threshold=4.303e+02, percent-clipped=1.0 2023-10-03 22:23:15,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:23:17,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:23:20,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:23:20,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1427106.6666666667, ans=0.125 2023-10-03 22:23:23,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:23:23,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:23:23,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-03 22:23:24,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:23:27,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:23:27,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:23:28,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-03 22:23:28,937 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-03 22:23:32,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:23:35,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1427173.3333333333, ans=0.2 2023-10-03 22:23:35,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1427173.3333333333, ans=0.1 2023-10-03 22:23:38,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-03 22:23:44,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:23:44,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:23:44,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1427173.3333333333, ans=0.1 2023-10-03 22:23:45,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-03 22:23:47,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:23:47,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:23:47,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:23:48,767 INFO [train.py:1046] (3/4) Epoch 41, batch 1600, loss[loss=0.1526, simple_loss=0.2453, pruned_loss=0.03, over 24561.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2373, pruned_loss=0.03872, over 4715693.95 frames. ], batch size: 71, lr: 2.49e-03, grad_scale: 32.0 2023-10-03 22:23:48,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:23:50,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:23:52,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1427240.0, ans=0.0 2023-10-03 22:23:53,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:23:54,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-03 22:23:54,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-03 22:23:57,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-03 22:23:57,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1427240.0, ans=0.0 2023-10-03 22:23:59,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:24:01,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1427240.0, ans=15.0 2023-10-03 22:24:01,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-03 22:24:03,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:24:05,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:24:09,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:24:09,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1427306.6666666667, ans=0.125 2023-10-03 22:24:14,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-03 22:24:17,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:24:18,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-03 22:24:18,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:18,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-03 22:24:23,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-03 22:24:30,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:24:31,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-03 22:24:31,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:24:33,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:24:33,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:24:35,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-03 22:24:39,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 22:24:42,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:24:42,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:42,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:44,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:24:45,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:24:45,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:24:48,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:24:54,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:24:54,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1427506.6666666667, ans=0.125 2023-10-03 22:24:55,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:24:57,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1427506.6666666667, ans=0.125 2023-10-03 22:24:58,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-03 22:24:58,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:24:58,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-03 22:25:02,713 INFO [train.py:1046] (3/4) Epoch 41, batch 1650, loss[loss=0.163, simple_loss=0.2582, pruned_loss=0.03395, over 24551.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2378, pruned_loss=0.03884, over 4721948.03 frames. ], batch size: 71, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:25:04,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:25:05,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:25:06,463 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.14 vs. limit=15.0 2023-10-03 22:25:06,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:25:06,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-03 22:25:06,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-03 22:25:06,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-03 22:25:08,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-03 22:25:11,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:25:12,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:25:13,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:25:14,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:25:17,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:25:19,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-03 22:25:22,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:25:22,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:25:22,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:25:22,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:25:22,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-03 22:25:23,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-03 22:25:28,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:25:29,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:25:36,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-03 22:25:36,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:25:39,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-03 22:25:43,250 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.986e+02 2.197e+02 2.523e+02 3.489e+02, threshold=4.394e+02, percent-clipped=0.0 2023-10-03 22:25:44,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:25:47,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:25:47,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:25:47,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:25:49,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:25:49,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:25:52,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:25:53,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:25:53,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:25:54,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:25:55,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1427773.3333333333, ans=0.1 2023-10-03 22:25:55,271 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.73 vs. limit=15.0 2023-10-03 22:25:56,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:25:58,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:26:00,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:26:00,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-03 22:26:03,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:26:03,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-03 22:26:04,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-03 22:26:04,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-03 22:26:04,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:26:05,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:26:05,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:26:05,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1427840.0, ans=0.125 2023-10-03 22:26:06,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:26:06,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-03 22:26:10,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:26:10,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:26:12,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:26:15,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-03 22:26:17,082 INFO [train.py:1046] (3/4) Epoch 41, batch 1700, loss[loss=0.1558, simple_loss=0.2415, pruned_loss=0.03499, over 24026.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2368, pruned_loss=0.03844, over 4731619.57 frames. ], batch size: 80, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:26:18,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:26:18,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:26:18,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-03 22:26:20,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:26:20,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:26:20,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:26:22,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:26:22,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:26:22,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-03 22:26:26,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:26:28,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1427906.6666666667, ans=0.1 2023-10-03 22:26:32,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:26:36,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:26:36,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1427973.3333333333, ans=0.125 2023-10-03 22:26:42,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:26:42,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:26:42,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:26:42,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:26:46,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-03 22:26:49,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:26:49,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:26:51,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:26:52,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:26:53,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-03 22:26:53,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-03 22:26:55,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:26:57,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-03 22:26:57,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:27:04,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1428106.6666666667, ans=0.125 2023-10-03 22:27:06,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:27:06,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:08,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:27:08,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1428106.6666666667, ans=0.1 2023-10-03 22:27:09,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:27:09,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-03 22:27:09,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:27:11,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:27:11,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-03 22:27:12,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:27:12,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:27:12,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:27:12,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:27:15,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:27:15,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:27:15,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1428173.3333333333, ans=0.125 2023-10-03 22:27:17,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:17,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:27:17,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:27:23,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:27:25,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-03 22:27:26,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:27:28,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:27:28,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-03 22:27:32,477 INFO [train.py:1046] (3/4) Epoch 41, batch 1750, loss[loss=0.1397, simple_loss=0.2228, pruned_loss=0.02832, over 24499.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2359, pruned_loss=0.03823, over 4725239.15 frames. ], batch size: 63, lr: 2.49e-03, grad_scale: 16.0 2023-10-03 22:27:32,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:34,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:27:34,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:27:35,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-03 22:27:35,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:27:37,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1428240.0, ans=0.125 2023-10-03 22:27:38,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:27:38,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:27:38,723 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=22.5 2023-10-03 22:27:44,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-03 22:27:45,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:27:50,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-03 22:27:50,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:27:50,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:27:53,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 22:27:54,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-03 22:27:57,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:27:57,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-03 22:28:04,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:28:07,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:28:07,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:28:11,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:11,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:28:13,609 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.948e+02 2.173e+02 2.566e+02 4.881e+02, threshold=4.347e+02, percent-clipped=1.0 2023-10-03 22:28:13,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:28:15,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:16,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:28:17,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:28:18,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-03 22:28:20,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:28:22,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1428440.0, ans=0.0 2023-10-03 22:28:23,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-03 22:28:23,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:28:25,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:28:26,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:28:26,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1428440.0, ans=10.0 2023-10-03 22:28:26,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1428440.0, ans=0.5 2023-10-03 22:28:29,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:28:30,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-03 22:28:30,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:32,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:28:32,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1428506.6666666667, ans=0.04949747468305833 2023-10-03 22:28:36,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:28:37,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:28:39,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1428506.6666666667, ans=0.0 2023-10-03 22:28:40,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:28:40,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-03 22:28:40,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:28:42,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:28:42,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:28:42,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:28:42,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:28:43,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:28:46,874 INFO [train.py:1046] (3/4) Epoch 41, batch 1800, loss[loss=0.1538, simple_loss=0.2491, pruned_loss=0.02927, over 24558.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2354, pruned_loss=0.03788, over 4724571.28 frames. ], batch size: 71, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:28:46,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:28:48,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:28:51,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:28:54,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:28:57,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:28:57,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:29:00,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:29:02,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:29:03,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:29:04,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:29:05,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1428640.0, ans=10.0 2023-10-03 22:29:07,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:29:07,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-03 22:29:08,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:29:12,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:29:16,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-03 22:29:19,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-03 22:29:19,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-03 22:29:19,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:29:19,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:29:19,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:29:21,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:29:25,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1428706.6666666667, ans=0.125 2023-10-03 22:29:30,347 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-03 22:29:30,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:29:33,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:29:34,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-03 22:29:34,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-03 22:29:36,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:29:37,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:29:38,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:29:44,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-03 22:29:49,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:29:49,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-03 22:29:50,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:29:50,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:29:50,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:29:51,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-03 22:29:54,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:29:54,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:29:57,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-03 22:29:57,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:30:00,974 INFO [train.py:1046] (3/4) Epoch 41, batch 1850, loss[loss=0.1524, simple_loss=0.2462, pruned_loss=0.02933, over 24471.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2362, pruned_loss=0.03787, over 4725663.93 frames. ], batch size: 69, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:30:01,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:30:01,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:30:01,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:30:02,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:30:02,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.88 vs. limit=22.5 2023-10-03 22:30:02,974 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.05 vs. limit=6.0 2023-10-03 22:30:03,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:30:05,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:30:05,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:30:07,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:30:08,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:30:11,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1428906.6666666667, ans=0.125 2023-10-03 22:30:15,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:30:15,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-03 22:30:18,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-03 22:30:20,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-03 22:30:25,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:30:26,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-03 22:30:26,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 22:30:27,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1428973.3333333333, ans=0.2 2023-10-03 22:30:27,580 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.86 vs. limit=15.0 2023-10-03 22:30:36,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:30:39,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-03 22:30:40,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:30:40,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:30:41,269 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.56 vs. limit=15.0 2023-10-03 22:30:41,806 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.909e+02 2.082e+02 2.293e+02 3.628e+02, threshold=4.164e+02, percent-clipped=0.0 2023-10-03 22:30:43,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-03 22:30:44,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:30:44,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:30:46,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:30:48,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:30:50,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:30:53,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:30:53,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:30:53,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 22:30:53,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:30:53,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1429106.6666666667, ans=0.125 2023-10-03 22:30:56,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:30:58,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:31:00,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1429173.3333333333, ans=0.125 2023-10-03 22:31:01,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-03 22:31:02,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:31:06,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:31:08,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:31:08,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-03 22:31:08,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-03 22:31:09,773 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-03 22:31:09,872 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-03 22:31:11,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:31:11,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:31:12,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:31:12,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:13,846 INFO [train.py:1046] (3/4) Epoch 41, batch 1900, loss[loss=0.1843, simple_loss=0.2549, pruned_loss=0.05689, over 19560.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2366, pruned_loss=0.03807, over 4725268.26 frames. ], batch size: 388, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:31:13,938 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-03 22:31:13,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:31:13,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:15,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:31:16,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:31:18,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:31:18,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-03 22:31:19,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:19,880 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-03 22:31:19,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:31:21,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:31:21,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1429240.0, ans=0.125 2023-10-03 22:31:24,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1429240.0, ans=0.125 2023-10-03 22:31:25,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:31:28,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:31:28,475 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-03 22:31:30,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-03 22:31:32,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:31:32,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:31:32,326 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-03 22:31:33,675 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-03 22:31:35,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1429306.6666666667, ans=0.125 2023-10-03 22:31:36,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-03 22:31:39,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:31:42,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-03 22:31:43,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-03 22:31:49,231 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.41 vs. limit=15.0 2023-10-03 22:31:51,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-03 22:31:51,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1429373.3333333333, ans=0.125 2023-10-03 22:31:55,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-03 22:31:55,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:31:56,631 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-03 22:31:56,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-03 22:31:56,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-03 22:31:57,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-03 22:31:57,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:01,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-03 22:32:03,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1429440.0, ans=0.1 2023-10-03 22:32:05,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:32:07,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:32:08,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-03 22:32:09,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1429440.0, ans=0.2 2023-10-03 22:32:10,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:32:13,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-03 22:32:13,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:32:19,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:32:19,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:32:20,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:32:20,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:32:21,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:32:21,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:32:23,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:32:27,044 INFO [train.py:1046] (3/4) Epoch 41, batch 1950, loss[loss=0.1649, simple_loss=0.2568, pruned_loss=0.03647, over 24296.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2376, pruned_loss=0.03845, over 4723880.50 frames. ], batch size: 74, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:32:27,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:32:27,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:32:28,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:32:28,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:32:28,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:32:30,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:32:35,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:32:36,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:32:38,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:32:38,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:32:39,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-03 22:32:40,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 22:32:40,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:32:41,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1429640.0, ans=0.0 2023-10-03 22:32:42,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:32:45,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:32:45,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:32:45,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:46,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:32:51,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:32:51,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:32:51,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:32:51,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:55,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:32:59,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:32:59,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:32:59,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:32:59,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-03 22:32:59,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:33:01,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:33:01,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:33:04,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:33:07,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:33:09,229 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 2.015e+02 2.236e+02 2.546e+02 4.290e+02, threshold=4.473e+02, percent-clipped=1.0 2023-10-03 22:33:12,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:33:13,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:33:13,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:33:13,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-03 22:33:14,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:33:17,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:33:17,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:33:19,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:33:26,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:27,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:31,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:32,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:33:32,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1429840.0, ans=0.1 2023-10-03 22:33:35,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:33:35,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:33:37,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-03 22:33:37,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:33:39,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:33:39,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-03 22:33:41,903 INFO [train.py:1046] (3/4) Epoch 41, batch 2000, loss[loss=0.1473, simple_loss=0.2176, pruned_loss=0.03852, over 23452.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2381, pruned_loss=0.03849, over 4724135.90 frames. ], batch size: 285, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 22:33:42,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:33:43,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1429906.6666666667, ans=0.125 2023-10-03 22:33:44,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:33:46,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:33:46,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:33:47,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:33:49,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:33:53,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-03 22:33:53,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:33:56,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:33:58,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-03 22:33:59,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:34:00,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:34:02,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:34:04,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-03 22:34:05,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:07,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:07,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:08,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-03 22:34:08,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:34:10,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-03 22:34:10,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:34:13,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:34:14,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-03 22:34:14,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:14,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:34:15,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:34:17,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-03 22:34:18,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-03 22:34:18,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:34:19,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:21,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-10-03 22:34:24,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:34:25,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:34:25,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:34:27,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:34:29,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:34:30,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:34:30,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:34:30,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:34:32,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:34,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:34:35,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-03 22:34:37,386 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.88 vs. limit=10.0 2023-10-03 22:34:39,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:34:41,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:45,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:45,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:34:48,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:51,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:34:51,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:51,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1430173.3333333333, ans=0.1 2023-10-03 22:34:52,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:34:52,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:34:54,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:34:55,661 INFO [train.py:1046] (3/4) Epoch 41, batch 2050, loss[loss=0.1613, simple_loss=0.2481, pruned_loss=0.03722, over 24377.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2373, pruned_loss=0.03846, over 4724496.44 frames. ], batch size: 77, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:34:55,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:34:57,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:34:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:35:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:35:06,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:35:07,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:35:08,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:35:08,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-03 22:35:08,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:35:10,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:35:11,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:35:20,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1430306.6666666667, ans=0.1 2023-10-03 22:35:22,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:35:22,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:35:23,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-03 22:35:26,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:35:27,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1430373.3333333333, ans=0.0 2023-10-03 22:35:28,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-03 22:35:28,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:35:30,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1430373.3333333333, ans=0.125 2023-10-03 22:35:31,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:35:32,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:35:34,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:35:34,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:35:36,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:35:36,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:35:37,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:35:38,773 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 2.010e+02 2.255e+02 2.586e+02 3.703e+02, threshold=4.510e+02, percent-clipped=0.0 2023-10-03 22:35:39,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1430440.0, ans=0.0 2023-10-03 22:35:40,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:35:43,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:35:44,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:35:46,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:35:49,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:35:55,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:35:55,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-03 22:36:01,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:36:01,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:36:03,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:36:07,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-03 22:36:09,761 INFO [train.py:1046] (3/4) Epoch 41, batch 2100, loss[loss=0.1674, simple_loss=0.2516, pruned_loss=0.0416, over 24456.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2363, pruned_loss=0.03803, over 4727577.91 frames. ], batch size: 63, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:36:09,872 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-03 22:36:09,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:11,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:36:11,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:36:14,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:36:14,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-03 22:36:14,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-03 22:36:15,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:36:18,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:36:19,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:36:23,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:23,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:36:23,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-03 22:36:24,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:36:25,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-03 22:36:25,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-03 22:36:27,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:36:27,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:36:27,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-03 22:36:27,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 22:36:33,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-03 22:36:33,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:36:36,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:36:36,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:36:40,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:36:40,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-03 22:36:41,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:36:41,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-03 22:36:41,880 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.06 vs. limit=15.0 2023-10-03 22:36:43,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-03 22:36:43,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:43,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-03 22:36:43,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-03 22:36:44,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1430706.6666666667, ans=0.125 2023-10-03 22:36:45,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-03 22:36:45,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1430706.6666666667, ans=0.05 2023-10-03 22:36:47,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:36:49,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:36:51,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:36:53,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:36:54,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:36:55,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:36:55,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-03 22:36:55,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:36:55,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:36:57,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:36:57,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-03 22:36:58,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-03 22:37:00,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-03 22:37:02,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1430773.3333333333, ans=0.125 2023-10-03 22:37:03,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:37:06,563 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.47 vs. limit=22.5 2023-10-03 22:37:07,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:37:07,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-03 22:37:12,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:37:13,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:37:15,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:37:15,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:37:15,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-03 22:37:16,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:37:17,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:37:17,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:37:17,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:37:18,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:37:18,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1430840.0, ans=10.0 2023-10-03 22:37:21,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-03 22:37:22,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-03 22:37:22,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:37:23,831 INFO [train.py:1046] (3/4) Epoch 41, batch 2150, loss[loss=0.1367, simple_loss=0.2164, pruned_loss=0.02854, over 24616.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2354, pruned_loss=0.03784, over 4723129.84 frames. ], batch size: 60, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:37:26,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:37:26,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:37:26,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:37:26,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:37:31,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-03 22:37:34,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:37:34,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:37:36,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:37:36,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:38,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:37:42,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:37:42,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:37:42,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:37:47,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:47,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-03 22:37:50,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1430973.3333333333, ans=0.125 2023-10-03 22:37:51,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:37:53,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:37:53,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:54,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:37:54,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:37:55,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:37:56,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:37:56,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:37:57,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:37:57,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-03 22:37:57,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1431040.0, ans=0.125 2023-10-03 22:37:58,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:38:00,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:38:00,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:38:01,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:38:02,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:38:05,961 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.931e+02 2.155e+02 2.450e+02 3.696e+02, threshold=4.310e+02, percent-clipped=0.0 2023-10-03 22:38:06,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:38:06,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:38:07,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:38:07,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-03 22:38:07,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-03 22:38:12,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:38:12,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:13,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:38:15,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:38:15,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:18,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:18,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-03 22:38:19,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-03 22:38:19,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:38:19,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1431106.6666666667, ans=0.0 2023-10-03 22:38:21,028 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-03 22:38:22,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:22,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:38:23,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-03 22:38:23,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:38:23,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-03 22:38:23,847 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-03 22:38:23,847 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-03 22:38:23,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-03 22:38:25,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:25,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:38:26,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:38:27,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:27,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:38:29,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:29,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:37,239 INFO [train.py:1046] (3/4) Epoch 41, batch 2200, loss[loss=0.1533, simple_loss=0.2433, pruned_loss=0.03161, over 24625.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2356, pruned_loss=0.03764, over 4729538.17 frames. ], batch size: 68, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:38:37,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:38:38,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-03 22:38:40,707 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.50 vs. limit=15.0 2023-10-03 22:38:41,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:38:45,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:38:45,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:38:46,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:38:47,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:38:49,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:38:50,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:38:50,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-03 22:38:56,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-03 22:38:58,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:38:58,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1431306.6666666667, ans=0.1 2023-10-03 22:39:04,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-03 22:39:07,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:07,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:39:07,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:39:09,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1431373.3333333333, ans=0.125 2023-10-03 22:39:09,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1431373.3333333333, ans=0.125 2023-10-03 22:39:11,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:39:11,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-03 22:39:15,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:39:17,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:17,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-03 22:39:20,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:39:21,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:39:23,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:39:23,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:39:26,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-03 22:39:26,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:39:28,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-03 22:39:30,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:39:30,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:39:30,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:39:33,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:39:35,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:39:35,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:39:35,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:39:36,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:39:36,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:39:38,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:39:38,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1431506.6666666667, ans=0.0 2023-10-03 22:39:41,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 22:39:41,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:39:44,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:39:44,388 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-03 22:39:45,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:39:47,297 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-03 22:39:49,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:39:49,234 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-03 22:39:52,445 INFO [train.py:1046] (3/4) Epoch 41, batch 2250, loss[loss=0.1639, simple_loss=0.2374, pruned_loss=0.04525, over 23832.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2359, pruned_loss=0.03759, over 4717671.89 frames. ], batch size: 195, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:39:52,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:52,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:39:53,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:39:56,528 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-03 22:39:56,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:39:59,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:40:03,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:40:05,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1431640.0, ans=0.125 2023-10-03 22:40:06,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:40:09,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:40:09,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:40:10,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:40:11,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=1431640.0, ans=0.025 2023-10-03 22:40:14,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-03 22:40:14,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:40:14,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:40:15,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-03 22:40:15,823 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:40:16,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:40:16,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:40:17,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1431640.0, ans=0.0 2023-10-03 22:40:18,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:40:23,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:40:24,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 22:40:24,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:40:26,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-03 22:40:27,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:40:28,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:40:30,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=1431706.6666666667, ans=0.02 2023-10-03 22:40:33,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:40:34,423 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.897e+02 2.082e+02 2.416e+02 4.175e+02, threshold=4.164e+02, percent-clipped=0.0 2023-10-03 22:40:34,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:40:36,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:40:36,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1431773.3333333333, ans=0.2 2023-10-03 22:40:37,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:40:39,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:40:40,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:40:46,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:40:48,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-03 22:40:54,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:40:54,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:40:54,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:40:58,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 22:41:01,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:41:01,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-03 22:41:02,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:02,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:41:04,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-03 22:41:04,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1431906.6666666667, ans=0.125 2023-10-03 22:41:05,698 INFO [train.py:1046] (3/4) Epoch 41, batch 2300, loss[loss=0.1574, simple_loss=0.2504, pruned_loss=0.03223, over 24330.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2364, pruned_loss=0.03764, over 4729032.80 frames. ], batch size: 74, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:41:07,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:41:07,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:14,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:14,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:41:17,719 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-03 22:41:19,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:41:26,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:41:26,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:41:26,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:41:26,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:41:26,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-03 22:41:28,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:41:31,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:41:31,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:41:34,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:41:37,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:41:41,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:41:42,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1432040.0, ans=0.125 2023-10-03 22:41:43,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1432040.0, ans=0.025 2023-10-03 22:41:47,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:41:47,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:41:49,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:41:50,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:41:54,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:41:55,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:41:55,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:41:55,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-03 22:41:59,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 22:42:01,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:42:01,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:42:01,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:42:01,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:42:02,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 22:42:02,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-03 22:42:04,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-03 22:42:04,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:42:04,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:42:04,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-03 22:42:11,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:42:14,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:42:17,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:42:17,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:42:17,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:42:20,573 INFO [train.py:1046] (3/4) Epoch 41, batch 2350, loss[loss=0.154, simple_loss=0.2376, pruned_loss=0.03519, over 23298.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2365, pruned_loss=0.03785, over 4717226.61 frames. ], batch size: 105, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:42:20,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:42:20,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:42:21,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:42:22,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-03 22:42:29,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:42:29,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-03 22:42:34,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-03 22:42:36,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:42:38,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1432306.6666666667, ans=0.1 2023-10-03 22:42:39,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:42:39,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:42:40,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:42:40,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:42:42,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-03 22:42:45,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:42:51,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-03 22:42:52,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:42:55,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:42:55,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:42:58,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:42:58,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-03 22:43:00,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:43:03,365 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.889e+02 2.078e+02 2.250e+02 3.278e+02, threshold=4.156e+02, percent-clipped=0.0 2023-10-03 22:43:03,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:43:03,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:43:03,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:43:04,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:43:07,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-03 22:43:07,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:43:09,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1432440.0, ans=0.2 2023-10-03 22:43:10,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:43:10,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:43:13,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-03 22:43:13,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:43:16,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-03 22:43:16,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:43:22,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-03 22:43:23,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-03 22:43:25,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:43:25,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-03 22:43:25,178 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-03 22:43:26,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-03 22:43:29,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-03 22:43:31,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:43:33,786 INFO [train.py:1046] (3/4) Epoch 41, batch 2400, loss[loss=0.1542, simple_loss=0.2413, pruned_loss=0.03359, over 23354.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2367, pruned_loss=0.03766, over 4722678.70 frames. ], batch size: 93, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 22:43:35,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:43:37,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1432573.3333333333, ans=0.125 2023-10-03 22:43:38,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:43:38,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:43:39,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-03 22:43:39,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-03 22:43:46,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 22:43:46,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:43:47,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-03 22:43:48,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:43:49,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:43:49,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-03 22:43:56,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:43:58,505 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.18 vs. limit=6.0 2023-10-03 22:43:59,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-03 22:43:59,923 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.85 vs. limit=15.0 2023-10-03 22:44:03,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:44:08,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-03 22:44:10,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:44:11,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:44:15,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:44:15,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-03 22:44:15,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 22:44:24,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:44:27,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:44:28,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1432773.3333333333, ans=0.125 2023-10-03 22:44:30,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:44:30,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:44:30,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-03 22:44:30,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:44:30,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:44:31,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:44:31,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 22:44:37,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:44:37,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:44:37,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-03 22:44:38,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-03 22:44:40,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:44:40,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:44:41,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-03 22:44:41,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-03 22:44:41,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-03 22:44:41,887 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-03 22:44:43,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-03 22:44:45,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:44:47,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:44:47,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:44:48,328 INFO [train.py:1046] (3/4) Epoch 41, batch 2450, loss[loss=0.1637, simple_loss=0.2435, pruned_loss=0.04193, over 23182.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2355, pruned_loss=0.03768, over 4720637.11 frames. ], batch size: 105, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 22:44:48,459 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-03 22:44:49,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:44:51,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-03 22:44:53,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1432906.6666666667, ans=0.0 2023-10-03 22:44:54,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:44:54,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:44:54,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1432906.6666666667, ans=0.125 2023-10-03 22:44:55,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1432906.6666666667, ans=0.125 2023-10-03 22:44:58,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:44:58,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:44:58,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-03 22:45:04,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:45:04,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:45:08,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:45:08,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:45:08,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:45:10,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-03 22:45:10,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.87 vs. limit=10.0 2023-10-03 22:45:12,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:45:14,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:45:14,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:45:17,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:45:19,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:45:19,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:45:20,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:45:22,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-03 22:45:24,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:45:29,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1433040.0, ans=0.2 2023-10-03 22:45:32,399 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.017e+02 2.173e+02 2.483e+02 3.583e+02, threshold=4.346e+02, percent-clipped=0.0 2023-10-03 22:45:32,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:45:33,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:45:33,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:45:33,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:45:33,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:45:35,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:45:35,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-03 22:45:35,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1433106.6666666667, ans=0.125 2023-10-03 22:45:35,798 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-03 22:45:39,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:45:40,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:45:40,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1433106.6666666667, ans=0.125 2023-10-03 22:45:41,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1433106.6666666667, ans=10.0 2023-10-03 22:45:42,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:45:42,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:45:47,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:45:47,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-03 22:45:48,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:45:50,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:45:50,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-03 22:45:50,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:45:52,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:45:55,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:45:56,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:45:58,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:45:58,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1433173.3333333333, ans=0.125 2023-10-03 22:46:01,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-03 22:46:02,347 INFO [train.py:1046] (3/4) Epoch 41, batch 2500, loss[loss=0.1639, simple_loss=0.2565, pruned_loss=0.03569, over 24285.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2343, pruned_loss=0.03746, over 4715747.47 frames. ], batch size: 74, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:46:03,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-03 22:46:04,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1433240.0, ans=0.1 2023-10-03 22:46:07,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:46:17,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:46:17,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:46:17,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1433306.6666666667, ans=0.125 2023-10-03 22:46:19,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:46:19,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-03 22:46:26,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:46:26,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:46:29,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:46:29,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 22:46:29,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-03 22:46:30,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:46:30,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:46:32,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-03 22:46:32,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:46:33,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-03 22:46:33,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:46:36,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:46:37,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:46:39,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:46:40,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-03 22:46:40,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:46:42,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:46:45,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:46:50,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:46:51,405 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.17 vs. limit=15.0 2023-10-03 22:46:53,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:46:57,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:47:00,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-03 22:47:00,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:47:00,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:47:02,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:47:02,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:47:04,704 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-03 22:47:04,705 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-03 22:47:04,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-03 22:47:06,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:47:07,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-03 22:47:07,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-03 22:47:08,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:47:10,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-03 22:47:12,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1433506.6666666667, ans=0.2 2023-10-03 22:47:13,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-03 22:47:16,702 INFO [train.py:1046] (3/4) Epoch 41, batch 2550, loss[loss=0.1552, simple_loss=0.2364, pruned_loss=0.03705, over 23195.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2351, pruned_loss=0.03796, over 4714346.10 frames. ], batch size: 119, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:47:18,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:47:20,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:47:20,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:47:21,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.60 vs. limit=15.0 2023-10-03 22:47:21,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:47:21,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-03 22:47:22,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:47:26,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-03 22:47:28,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:47:30,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:47:33,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:47:33,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 22:47:33,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:47:33,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:47:34,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:47:37,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:47:37,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-03 22:47:37,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-03 22:47:37,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:47:37,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-03 22:47:49,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:47:53,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:47:53,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:47:53,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:47:55,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 22:47:56,030 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=10.12 vs. limit=15.0 2023-10-03 22:47:59,065 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.73 vs. limit=15.0 2023-10-03 22:48:00,989 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.004e+02 2.232e+02 2.565e+02 3.401e+02, threshold=4.465e+02, percent-clipped=0.0 2023-10-03 22:48:01,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:48:04,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 22:48:05,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:48:05,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:48:06,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-03 22:48:06,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:48:09,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:48:09,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:48:15,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:48:15,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-03 22:48:15,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:48:15,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:48:16,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-03 22:48:17,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:48:19,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:48:25,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:48:27,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:48:28,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1433840.0, ans=0.0 2023-10-03 22:48:29,500 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-03 22:48:30,740 INFO [train.py:1046] (3/4) Epoch 41, batch 2600, loss[loss=0.1463, simple_loss=0.2394, pruned_loss=0.0266, over 24442.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2359, pruned_loss=0.03842, over 4718193.84 frames. ], batch size: 66, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:48:33,548 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-03 22:48:33,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:48:33,607 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-03 22:48:35,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-03 22:48:35,422 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-03 22:48:39,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:48:39,523 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-03 22:48:42,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-03 22:48:43,553 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-03 22:48:44,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:48:46,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-03 22:48:48,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-03 22:48:48,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1433973.3333333333, ans=0.2 2023-10-03 22:48:49,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-03 22:48:49,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-03 22:48:51,114 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-03 22:48:52,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-03 22:48:53,728 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.51 vs. limit=15.0 2023-10-03 22:48:58,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:48:59,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:48:59,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:48:59,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-03 22:49:02,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-03 22:49:02,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1434040.0, ans=0.125 2023-10-03 22:49:06,665 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-03 22:49:11,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:49:11,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:12,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-03 22:49:12,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:49:12,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:49:14,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-03 22:49:17,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:49:17,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:49:18,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:49:22,067 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-03 22:49:22,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:49:23,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:49:27,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:49:27,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1434173.3333333333, ans=0.125 2023-10-03 22:49:28,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-03 22:49:28,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-03 22:49:28,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1434173.3333333333, ans=0.125 2023-10-03 22:49:30,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:49:32,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:49:34,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:49:37,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1434173.3333333333, ans=0.1 2023-10-03 22:49:38,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-03 22:49:38,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:39,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1434173.3333333333, ans=0.125 2023-10-03 22:49:40,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 22:49:43,106 INFO [train.py:1046] (3/4) Epoch 41, batch 2650, loss[loss=0.1756, simple_loss=0.2481, pruned_loss=0.05157, over 23432.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.236, pruned_loss=0.03849, over 4713161.53 frames. ], batch size: 285, lr: 2.48e-03, grad_scale: 4.0 2023-10-03 22:49:45,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-03 22:49:45,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:46,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 22:49:47,021 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-03 22:49:47,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:49:48,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:49:49,565 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.77 vs. limit=15.0 2023-10-03 22:49:51,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 22:49:53,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:49:54,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:49:55,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-03 22:49:55,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:49:57,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:49:59,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-03 22:50:01,336 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-03 22:50:04,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:50:05,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-03 22:50:06,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:08,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-03 22:50:10,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1434306.6666666667, ans=0.125 2023-10-03 22:50:10,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1434306.6666666667, ans=0.125 2023-10-03 22:50:11,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:50:11,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-03 22:50:13,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:50:13,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:17,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-03 22:50:17,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-03 22:50:22,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:50:26,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-03 22:50:26,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:50:26,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:27,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:50:27,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:50:29,235 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 1.964e+02 2.149e+02 2.506e+02 3.247e+02, threshold=4.298e+02, percent-clipped=0.0 2023-10-03 22:50:29,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:50:32,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:50:33,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:50:33,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:50:33,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:50:34,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:50:34,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:36,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:50:37,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:39,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:50:39,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-03 22:50:43,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:45,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:50:45,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:50:45,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-03 22:50:45,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1434506.6666666667, ans=0.0 2023-10-03 22:50:47,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:50:50,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:51,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:50:53,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:50:53,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-03 22:50:54,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:50:56,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:50:56,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-03 22:50:57,393 INFO [train.py:1046] (3/4) Epoch 41, batch 2700, loss[loss=0.1445, simple_loss=0.2182, pruned_loss=0.03536, over 23429.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2368, pruned_loss=0.03824, over 4719638.30 frames. ], batch size: 285, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:50:58,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:50:58,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 22:50:59,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1434573.3333333333, ans=0.1 2023-10-03 22:51:01,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:51:01,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:01,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:03,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:51:03,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:51:03,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:51:03,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-03 22:51:04,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-03 22:51:04,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:51:07,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:51:08,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:51:08,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:51:13,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:51:13,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-03 22:51:13,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:51:18,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1434640.0, ans=0.125 2023-10-03 22:51:19,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:51:19,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:51:26,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:51:26,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:51:26,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:51:26,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:51:29,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:51:32,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:51:32,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:51:32,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:51:35,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.07 vs. limit=15.0 2023-10-03 22:51:36,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:36,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-03 22:51:44,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1434773.3333333333, ans=0.0 2023-10-03 22:51:44,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1434773.3333333333, ans=0.125 2023-10-03 22:51:45,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:51:45,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:51:46,118 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=15.0 2023-10-03 22:51:50,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:51:50,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:51:53,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:55,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:51:55,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:51:56,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:51:58,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:51:58,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:52:01,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:52:01,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.12 vs. limit=10.0 2023-10-03 22:52:02,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:52:02,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:52:02,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1434840.0, ans=0.2 2023-10-03 22:52:06,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-03 22:52:07,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:52:10,561 INFO [train.py:1046] (3/4) Epoch 41, batch 2750, loss[loss=0.1632, simple_loss=0.2568, pruned_loss=0.03482, over 24652.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2366, pruned_loss=0.03819, over 4715896.67 frames. ], batch size: 73, lr: 2.48e-03, grad_scale: 4.0 2023-10-03 22:52:10,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:52:10,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-03 22:52:13,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-03 22:52:13,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:52:15,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:16,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:52:18,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:18,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-03 22:52:19,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:24,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:52:24,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 22:52:24,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:52:24,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:24,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-03 22:52:25,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:52:25,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:52:30,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-03 22:52:33,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:52:33,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:33,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:52:33,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-03 22:52:34,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:52:36,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:52:37,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:37,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:41,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 22:52:41,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 22:52:41,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 22:52:44,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:46,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:52:46,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1435040.0, ans=0.125 2023-10-03 22:52:51,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:52:53,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 22:52:53,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:52:57,727 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 1.945e+02 2.112e+02 2.405e+02 3.716e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-03 22:52:59,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:52:59,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:52:59,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 22:53:04,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-03 22:53:04,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:53:04,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-03 22:53:08,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:53:10,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-03 22:53:14,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-03 22:53:16,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1435173.3333333333, ans=0.0 2023-10-03 22:53:17,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:53:17,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-03 22:53:17,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:53:20,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:53:20,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-03 22:53:20,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1435173.3333333333, ans=0.0 2023-10-03 22:53:21,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:53:23,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-03 22:53:24,748 INFO [train.py:1046] (3/4) Epoch 41, batch 2800, loss[loss=0.1661, simple_loss=0.2423, pruned_loss=0.04498, over 18830.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2356, pruned_loss=0.03798, over 4704498.87 frames. ], batch size: 41, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:53:24,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:53:24,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:53:25,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1435240.0, ans=0.0 2023-10-03 22:53:26,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-03 22:53:26,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:53:26,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:53:28,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:53:28,210 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-03 22:53:28,211 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-03 22:53:32,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:53:35,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:53:35,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:53:38,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:53:41,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-03 22:53:42,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-03 22:53:44,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-03 22:53:45,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:53:45,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:53:45,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:53:49,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:53:49,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:53:49,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:53:50,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:53:56,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.33 vs. limit=15.0 2023-10-03 22:53:58,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:54:00,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:54:02,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:03,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:54:03,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:54:09,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:54:09,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-03 22:54:10,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:54:10,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:54:10,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:54:14,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:54:16,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:16,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1435440.0, ans=0.0 2023-10-03 22:54:19,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:54:20,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:54:20,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:20,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 22:54:21,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 22:54:21,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 22:54:22,331 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.49 vs. limit=6.0 2023-10-03 22:54:23,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:54:23,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-03 22:54:23,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:54:25,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:54:25,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:54:28,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-03 22:54:28,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:54:28,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:54:29,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:54:31,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-03 22:54:36,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:54:36,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 22:54:36,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:54:38,149 INFO [train.py:1046] (3/4) Epoch 41, batch 2850, loss[loss=0.1541, simple_loss=0.2308, pruned_loss=0.03875, over 23708.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2343, pruned_loss=0.03775, over 4703823.25 frames. ], batch size: 149, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:54:40,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:54:44,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:54:44,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:54:44,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:54:48,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:54:49,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:54:51,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:54:51,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-03 22:54:57,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-03 22:54:57,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:54:57,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1435640.0, ans=0.04949747468305833 2023-10-03 22:54:59,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-03 22:55:00,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:03,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-03 22:55:03,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-03 22:55:04,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:06,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=1435706.6666666667, ans=10.0 2023-10-03 22:55:08,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1435706.6666666667, ans=0.0 2023-10-03 22:55:10,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1435706.6666666667, ans=0.5 2023-10-03 22:55:13,578 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.59 vs. limit=15.0 2023-10-03 22:55:17,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:55:17,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1435706.6666666667, ans=0.0 2023-10-03 22:55:18,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:55:18,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-03 22:55:19,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 22:55:19,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 22:55:19,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-03 22:55:22,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:55:22,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-03 22:55:23,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-03 22:55:23,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:55:25,197 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.888e+02 2.036e+02 2.281e+02 3.028e+02, threshold=4.073e+02, percent-clipped=0.0 2023-10-03 22:55:25,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:55:26,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:29,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1435773.3333333333, ans=0.125 2023-10-03 22:55:30,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:55:30,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:55:30,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1435773.3333333333, ans=0.125 2023-10-03 22:55:31,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:55:33,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:55:34,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:55:36,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:55:36,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:55:39,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:55:41,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1435840.0, ans=0.2 2023-10-03 22:55:43,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:55:45,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-03 22:55:45,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-03 22:55:47,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 22:55:47,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:55:48,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-03 22:55:48,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:55:49,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:55:49,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:55:49,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-03 22:55:49,908 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-03 22:55:51,214 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-03 22:55:51,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:55:51,848 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.07 vs. limit=22.5 2023-10-03 22:55:52,484 INFO [train.py:1046] (3/4) Epoch 41, batch 2900, loss[loss=0.163, simple_loss=0.2272, pruned_loss=0.0494, over 18894.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.235, pruned_loss=0.03758, over 4713013.32 frames. ], batch size: 388, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:55:52,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:55:56,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:55:56,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:55:58,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:55:58,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-03 22:56:03,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:56:03,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-03 22:56:04,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-03 22:56:06,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-03 22:56:06,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:56:07,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:56:07,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:56:10,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 22:56:11,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:56:12,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1435973.3333333333, ans=0.025 2023-10-03 22:56:15,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-03 22:56:15,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-03 22:56:17,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-03 22:56:18,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:56:18,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1435973.3333333333, ans=0.0 2023-10-03 22:56:20,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-03 22:56:20,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-03 22:56:24,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:56:24,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-03 22:56:24,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:56:26,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:56:26,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-03 22:56:28,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:56:30,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:56:33,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:56:35,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:56:37,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-03 22:56:37,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-03 22:56:37,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:56:41,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:56:43,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-03 22:56:45,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 22:56:50,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:56:59,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-03 22:56:59,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-03 22:57:01,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-03 22:57:03,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:03,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-03 22:57:03,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:57:03,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:57:06,729 INFO [train.py:1046] (3/4) Epoch 41, batch 2950, loss[loss=0.1578, simple_loss=0.2324, pruned_loss=0.04163, over 23734.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2365, pruned_loss=0.03796, over 4714732.40 frames. ], batch size: 232, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:57:08,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:57:09,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-03 22:57:09,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:57:10,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:11,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:57:12,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:57:12,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-03 22:57:14,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-03 22:57:16,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 22:57:16,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:57:22,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:57:23,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1436306.6666666667, ans=0.0 2023-10-03 22:57:25,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:57:27,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:57:28,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:57:30,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:57:30,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 22:57:32,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:33,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:57:33,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 22:57:36,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-03 22:57:41,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-03 22:57:41,633 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-03 22:57:43,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:57:44,398 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-03 22:57:44,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1436373.3333333333, ans=0.125 2023-10-03 22:57:46,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-03 22:57:46,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:57:47,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-03 22:57:47,739 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-03 22:57:47,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-03 22:57:49,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-03 22:57:50,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 22:57:50,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-03 22:57:53,089 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.859e+02 2.046e+02 2.288e+02 3.221e+02, threshold=4.092e+02, percent-clipped=0.0 2023-10-03 22:57:53,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:57:54,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 22:57:54,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:57:56,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-03 22:57:56,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:57:57,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-03 22:58:02,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:58:03,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:58:05,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-03 22:58:05,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:58:06,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-03 22:58:09,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:58:10,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:58:10,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:58:12,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:58:12,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 22:58:13,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 22:58:15,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:58:15,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-03 22:58:15,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-03 22:58:15,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:58:17,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 22:58:19,925 INFO [train.py:1046] (3/4) Epoch 41, batch 3000, loss[loss=0.1519, simple_loss=0.2364, pruned_loss=0.03365, over 24652.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2366, pruned_loss=0.03773, over 4726047.75 frames. ], batch size: 65, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:58:19,925 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 22:58:32,318 INFO [train.py:1078] (3/4) Epoch 41, validation: loss=0.3725, simple_loss=0.2818, pruned_loss=0.2316, over 1125622.00 frames. 2023-10-03 22:58:32,319 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 22:58:32,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:58:32,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-03 22:58:33,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:58:36,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:58:37,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-03 22:58:40,810 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-03 22:58:40,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-03 22:58:42,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-03 22:58:42,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 22:58:43,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-03 22:58:45,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:58:49,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 22:58:58,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-03 22:59:04,162 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.23 vs. limit=22.5 2023-10-03 22:59:04,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-03 22:59:06,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-03 22:59:07,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1436706.6666666667, ans=0.5 2023-10-03 22:59:08,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 22:59:08,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-03 22:59:10,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:59:11,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:59:11,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-03 22:59:14,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-03 22:59:17,177 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.68 vs. limit=15.0 2023-10-03 22:59:17,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 22:59:17,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 22:59:20,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 22:59:20,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:59:20,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:20,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 22:59:23,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 22:59:23,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-03 22:59:23,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-03 22:59:25,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 22:59:27,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-03 22:59:27,933 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 22:59:29,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-03 22:59:29,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:59:29,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 22:59:33,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:33,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:35,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-03 22:59:35,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-03 22:59:36,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 22:59:37,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-03 22:59:37,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 22:59:38,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-03 22:59:38,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1436840.0, ans=0.2 2023-10-03 22:59:41,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-03 22:59:42,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 22:59:42,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-03 22:59:44,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-03 22:59:44,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 22:59:44,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 22:59:45,749 INFO [train.py:1046] (3/4) Epoch 41, batch 3050, loss[loss=0.1409, simple_loss=0.2318, pruned_loss=0.02496, over 24456.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2371, pruned_loss=0.03806, over 4732016.65 frames. ], batch size: 63, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 22:59:47,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 22:59:47,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-03 22:59:47,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:59:47,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 22:59:49,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-03 22:59:52,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 22:59:54,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 22:59:54,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 22:59:57,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-03 22:59:59,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-03 23:00:04,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-03 23:00:06,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-03 23:00:06,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:10,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:00:13,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:00:13,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:00:15,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:00:15,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1437040.0, ans=0.125 2023-10-03 23:00:19,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:00:19,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:00:20,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:00:20,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:00:20,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:00:21,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:00:23,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:25,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:00:25,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-03 23:00:25,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:00:25,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:00:28,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1437106.6666666667, ans=0.0 2023-10-03 23:00:30,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:00:31,143 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.945e+02 2.143e+02 2.357e+02 3.381e+02, threshold=4.287e+02, percent-clipped=0.0 2023-10-03 23:00:31,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:00:31,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:00:32,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:37,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:00:37,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:43,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:44,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:00:44,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:00:46,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:00:46,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:00:48,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:00:49,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-03 23:00:49,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:00:49,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:00:52,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-03 23:00:53,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:57,649 INFO [train.py:1046] (3/4) Epoch 41, batch 3100, loss[loss=0.1468, simple_loss=0.2326, pruned_loss=0.03053, over 24459.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2374, pruned_loss=0.03836, over 4727684.24 frames. ], batch size: 66, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:00:57,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:00:57,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:00:59,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:01:02,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-03 23:01:05,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-03 23:01:06,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-03 23:01:08,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:01:11,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:01:11,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:14,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-03 23:01:16,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1437306.6666666667, ans=0.125 2023-10-03 23:01:17,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:23,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-03 23:01:26,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1437373.3333333333, ans=0.2 2023-10-03 23:01:27,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:01:27,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:27,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:01:28,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:01:28,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-03 23:01:31,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:01:31,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-03 23:01:31,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:01:32,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:33,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-03 23:01:34,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:01:39,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:01:41,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-03 23:01:41,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1437440.0, ans=0.0 2023-10-03 23:01:42,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-03 23:01:42,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:42,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:01:44,926 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.16 vs. limit=10.0 2023-10-03 23:01:45,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:01:45,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:47,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:01:47,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:01:47,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:01:47,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1437440.0, ans=0.0 2023-10-03 23:01:50,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:01:51,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:01:51,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:01:51,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:01:55,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:01:57,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-03 23:02:00,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:02:00,657 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.24 vs. limit=10.0 2023-10-03 23:02:01,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-03 23:02:01,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:01,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:02:01,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-03 23:02:02,370 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.86 vs. limit=15.0 2023-10-03 23:02:06,955 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.79 vs. limit=15.0 2023-10-03 23:02:12,103 INFO [train.py:1046] (3/4) Epoch 41, batch 3150, loss[loss=0.1762, simple_loss=0.2647, pruned_loss=0.04382, over 24438.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2369, pruned_loss=0.03841, over 4717721.23 frames. ], batch size: 69, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:02:12,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-03 23:02:14,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:15,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:02:15,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:02:15,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:02:17,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-03 23:02:18,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:18,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:02:20,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-03 23:02:20,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1437573.3333333333, ans=0.125 2023-10-03 23:02:21,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:23,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1437573.3333333333, ans=0.125 2023-10-03 23:02:24,499 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-03 23:02:26,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1437640.0, ans=0.0 2023-10-03 23:02:27,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-03 23:02:27,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:02:28,528 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-03 23:02:28,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-03 23:02:29,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-03 23:02:29,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-03 23:02:29,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-03 23:02:30,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:30,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:02:31,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:02:34,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-03 23:02:35,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1437640.0, ans=0.0 2023-10-03 23:02:36,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:37,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:02:37,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:02:37,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1437640.0, ans=0.125 2023-10-03 23:02:40,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:02:43,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-03 23:02:43,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:02:43,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1437706.6666666667, ans=0.1 2023-10-03 23:02:46,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:02:48,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:02:49,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-03 23:02:52,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-03 23:02:52,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:02:52,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:02:52,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:02:52,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:02:52,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:02:55,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:02:55,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:02:56,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-03 23:02:56,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:02:56,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:02:57,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1437773.3333333333, ans=0.125 2023-10-03 23:02:57,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1437773.3333333333, ans=0.125 2023-10-03 23:02:58,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:02:58,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1437773.3333333333, ans=0.125 2023-10-03 23:02:59,589 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.918e+02 2.167e+02 2.429e+02 3.852e+02, threshold=4.335e+02, percent-clipped=0.0 2023-10-03 23:02:59,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:03:00,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-03 23:03:01,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:03:02,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-03 23:03:02,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:03,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-03 23:03:05,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-03 23:03:06,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:03:06,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:03:08,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-03 23:03:09,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 23:03:09,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:03:11,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:03:12,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:12,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:03:19,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:03:20,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:22,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-03 23:03:26,368 INFO [train.py:1046] (3/4) Epoch 41, batch 3200, loss[loss=0.1495, simple_loss=0.2262, pruned_loss=0.03643, over 23623.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2357, pruned_loss=0.03814, over 4710200.43 frames. ], batch size: 256, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 23:03:27,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:03:27,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-03 23:03:30,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:30,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:03:30,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-03 23:03:33,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:03:38,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:03:42,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:03:51,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:03:58,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-03 23:04:00,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:04:03,351 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.10 vs. limit=12.0 2023-10-03 23:04:04,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-03 23:04:04,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:04:08,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:04:08,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:04:08,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:04:13,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-03 23:04:13,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-03 23:04:18,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-03 23:04:20,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-03 23:04:22,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:04:25,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:04:25,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:04:26,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:04:26,662 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-03 23:04:26,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:04:29,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:04:30,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-03 23:04:30,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-03 23:04:32,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-03 23:04:33,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-03 23:04:35,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:04:37,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-03 23:04:37,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-03 23:04:37,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:04:37,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:04:39,631 INFO [train.py:1046] (3/4) Epoch 41, batch 3250, loss[loss=0.1409, simple_loss=0.2159, pruned_loss=0.03295, over 24276.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2361, pruned_loss=0.03798, over 4720175.32 frames. ], batch size: 56, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 23:04:39,723 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-03 23:04:46,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:04:48,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:04:55,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:04:55,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-03 23:04:57,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:04:57,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:04:57,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:04:58,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:05:00,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:05:00,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:01,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:05:01,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:05:02,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:02,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:04,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:05:07,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:05:09,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:05:11,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:05:11,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:05:13,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:05:14,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:05:14,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:05:20,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-03 23:05:20,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:05:21,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:05:21,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:05:22,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:05:26,868 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.962e+02 2.147e+02 2.402e+02 3.461e+02, threshold=4.294e+02, percent-clipped=0.0 2023-10-03 23:05:28,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:05:34,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:05:34,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:05:34,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-03 23:05:35,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:05:35,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 23:05:35,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:05:36,051 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.58 vs. limit=15.0 2023-10-03 23:05:38,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-03 23:05:40,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-03 23:05:40,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:05:41,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:05:41,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:05:41,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-03 23:05:41,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:05:41,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1438506.6666666667, ans=0.0 2023-10-03 23:05:45,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:05:45,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:05:47,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-03 23:05:49,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:05:51,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:05:51,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-03 23:05:54,109 INFO [train.py:1046] (3/4) Epoch 41, batch 3300, loss[loss=0.1594, simple_loss=0.2408, pruned_loss=0.039, over 24663.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2364, pruned_loss=0.03814, over 4715445.78 frames. ], batch size: 65, lr: 2.48e-03, grad_scale: 16.0 2023-10-03 23:05:54,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:05:54,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-03 23:05:56,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-03 23:05:56,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-03 23:05:56,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:06:01,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:06:03,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:06:03,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:06,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 23:06:06,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:06:07,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:10,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:06:14,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-03 23:06:15,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:06:15,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:16,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:18,484 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-03 23:06:18,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:06:19,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:06:19,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:06:20,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:06:20,022 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-03 23:06:22,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1438706.6666666667, ans=0.125 2023-10-03 23:06:23,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:06:25,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:06:26,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:26,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-03 23:06:27,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-03 23:06:27,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:29,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:06:30,782 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-03 23:06:33,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-03 23:06:33,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:06:36,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-03 23:06:38,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:06:41,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:06:42,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:06:43,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:06:43,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1438773.3333333333, ans=0.0 2023-10-03 23:06:44,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:44,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:06:46,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:06:47,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:06:47,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:06:47,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1438773.3333333333, ans=0.0 2023-10-03 23:06:48,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:06:50,778 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-03 23:06:52,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-03 23:06:55,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-03 23:06:55,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:06:55,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:06:58,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:06:58,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:07:00,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:07:00,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:00,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-03 23:07:00,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:07:02,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:07:04,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-03 23:07:04,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:05,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:07,083 INFO [train.py:1046] (3/4) Epoch 41, batch 3350, loss[loss=0.1807, simple_loss=0.252, pruned_loss=0.05472, over 23788.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2374, pruned_loss=0.03858, over 4721190.79 frames. ], batch size: 179, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:07:07,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:07:08,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:07:09,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:07:11,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:07:11,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:14,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:07:14,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1438906.6666666667, ans=0.125 2023-10-03 23:07:15,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:15,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:07:18,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:18,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1438906.6666666667, ans=0.125 2023-10-03 23:07:20,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:07:21,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:07:22,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:07:24,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1438973.3333333333, ans=0.125 2023-10-03 23:07:25,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-03 23:07:25,318 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-03 23:07:25,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:07:25,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1438973.3333333333, ans=0.125 2023-10-03 23:07:28,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-03 23:07:29,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-03 23:07:31,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:07:31,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:07:32,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:07:32,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-03 23:07:32,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:32,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:07:35,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:36,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:38,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:38,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:07:41,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:07:42,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:42,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:07:45,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1439040.0, ans=0.125 2023-10-03 23:07:47,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:07:48,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:07:51,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:07:51,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:07:54,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:07:55,980 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.914e+02 2.116e+02 2.439e+02 3.109e+02, threshold=4.232e+02, percent-clipped=0.0 2023-10-03 23:07:56,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-03 23:07:56,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 23:07:57,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-03 23:07:57,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:07:58,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-03 23:08:00,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:08:02,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:08:07,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:08:07,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-03 23:08:09,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:08:09,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1439173.3333333333, ans=0.0 2023-10-03 23:08:10,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:08:12,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:08:15,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1439173.3333333333, ans=0.125 2023-10-03 23:08:16,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:08:16,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1439173.3333333333, ans=0.125 2023-10-03 23:08:17,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-03 23:08:19,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:08:19,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:08:20,804 INFO [train.py:1046] (3/4) Epoch 41, batch 3400, loss[loss=0.1674, simple_loss=0.2518, pruned_loss=0.04146, over 23512.00 frames. ], tot_loss[loss=0.1589, simple_loss=0.2389, pruned_loss=0.03943, over 4705796.76 frames. ], batch size: 93, lr: 2.48e-03, grad_scale: 8.0 2023-10-03 23:08:20,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:08:20,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-03 23:08:22,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:08:22,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-03 23:08:23,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:08:23,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:08:23,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:08:25,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:08:26,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-03 23:08:30,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-03 23:08:30,950 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-03 23:08:30,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:08:35,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:08:35,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:08:37,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:08:37,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:08:41,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:08:42,841 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.65 vs. limit=15.0 2023-10-03 23:08:44,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-03 23:08:48,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:08:51,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:08:51,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:08:52,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-03 23:08:59,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:09:02,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-03 23:09:08,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:09:08,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:09:10,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-03 23:09:10,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:09:11,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:09:13,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:09:13,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:09:15,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:09:18,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:09:18,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:09:23,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:09:25,637 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.94 vs. limit=15.0 2023-10-03 23:09:26,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-03 23:09:31,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:09:31,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1439506.6666666667, ans=0.0 2023-10-03 23:09:35,195 INFO [train.py:1046] (3/4) Epoch 41, batch 3450, loss[loss=0.1446, simple_loss=0.2271, pruned_loss=0.03109, over 24458.00 frames. ], tot_loss[loss=0.1584, simple_loss=0.2383, pruned_loss=0.03922, over 4708707.48 frames. ], batch size: 63, lr: 2.48e-03, grad_scale: 4.0 2023-10-03 23:09:36,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-03 23:09:40,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-03 23:09:40,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:09:43,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:09:43,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-03 23:09:44,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:09:47,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:09:47,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1439573.3333333333, ans=0.1 2023-10-03 23:09:51,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:09:53,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:09:54,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:09:54,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:09:56,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:10:02,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-03 23:10:06,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-03 23:10:06,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:10:06,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:10:06,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:10:10,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.17 vs. limit=15.0 2023-10-03 23:10:14,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-03 23:10:14,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:10:15,269 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.73 vs. limit=15.0 2023-10-03 23:10:17,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:10:18,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:10:20,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:10:20,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:10:21,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-03 23:10:21,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:10:23,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:10:26,227 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.900e+02 2.129e+02 2.402e+02 3.276e+02, threshold=4.259e+02, percent-clipped=0.0 2023-10-03 23:10:26,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:10:26,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1439773.3333333333, ans=0.0 2023-10-03 23:10:29,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-03 23:10:33,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:10:37,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:10:39,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:10:42,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:10:46,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:10:46,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:10:48,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:10:49,474 INFO [train.py:1046] (3/4) Epoch 41, batch 3500, loss[loss=0.1725, simple_loss=0.2616, pruned_loss=0.04167, over 24443.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2377, pruned_loss=0.03871, over 4714583.31 frames. ], batch size: 69, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:10:49,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:10:52,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:10:56,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:10:56,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-03 23:10:59,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:11:01,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:11:03,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:11:03,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-03 23:11:07,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:11:08,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:11:12,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:11:13,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:11:14,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-03 23:11:14,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:16,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:11:16,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-03 23:11:17,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:18,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:11:18,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:11:23,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:24,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-03 23:11:24,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:11:27,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:11:29,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:11:30,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:32,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:11:33,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:11:35,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-03 23:11:36,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-03 23:11:36,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-03 23:11:37,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:11:37,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:38,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.61 vs. limit=15.0 2023-10-03 23:11:39,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:11:39,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:11:42,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 23:11:42,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.60 vs. limit=22.5 2023-10-03 23:11:43,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:11:47,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1440106.6666666667, ans=0.2 2023-10-03 23:11:48,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:11:49,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-03 23:11:49,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-03 23:11:49,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:11:50,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1440173.3333333333, ans=0.1 2023-10-03 23:11:52,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:11:52,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:11:54,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:11:54,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1440173.3333333333, ans=0.07 2023-10-03 23:11:57,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-03 23:11:57,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:12:00,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:12:00,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-03 23:12:03,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-03 23:12:03,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1440173.3333333333, ans=0.2 2023-10-03 23:12:04,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:12:04,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:12:04,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:06,237 INFO [train.py:1046] (3/4) Epoch 41, batch 3550, loss[loss=0.1577, simple_loss=0.2352, pruned_loss=0.04007, over 24470.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.236, pruned_loss=0.03838, over 4700802.56 frames. ], batch size: 58, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:12:06,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:08,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:12:17,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:19,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-03 23:12:21,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:12:23,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:12:25,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:25,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1440306.6666666667, ans=0.125 2023-10-03 23:12:26,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:12:26,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:12:29,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:12:30,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:12:31,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:31,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-03 23:12:32,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:12:37,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:12:37,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:12:37,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:12:39,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:12:39,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:12:39,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-03 23:12:39,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:39,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:12:40,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-03 23:12:45,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:12:47,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:12:48,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:12:50,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-03 23:12:51,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:12:52,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-03 23:12:52,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:12:55,989 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 1.904e+02 2.052e+02 2.273e+02 3.106e+02, threshold=4.105e+02, percent-clipped=0.0 2023-10-03 23:12:57,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:12:57,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:13:01,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-03 23:13:01,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:13:09,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:13:09,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-03 23:13:09,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:13:14,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:13:14,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-03 23:13:19,411 INFO [train.py:1046] (3/4) Epoch 41, batch 3600, loss[loss=0.1616, simple_loss=0.2436, pruned_loss=0.03977, over 24494.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2356, pruned_loss=0.03802, over 4709701.44 frames. ], batch size: 66, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:13:19,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1440573.3333333333, ans=0.1 2023-10-03 23:13:22,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-03 23:13:22,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:13:24,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:13:25,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:13:25,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=1440573.3333333333, ans=0.1 2023-10-03 23:13:26,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:13:28,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:13:30,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:13:32,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:33,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:13:33,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:13:33,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1440640.0, ans=0.0 2023-10-03 23:13:34,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:34,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-03 23:13:34,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1440640.0, ans=0.125 2023-10-03 23:13:37,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:13:37,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:40,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:13:42,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1440640.0, ans=0.1 2023-10-03 23:13:43,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:13:43,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:13:43,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1440640.0, ans=0.015 2023-10-03 23:13:45,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:13:45,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-03 23:13:45,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:13:45,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1440640.0, ans=0.125 2023-10-03 23:13:47,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:13:49,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:13:49,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:13:52,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:13:53,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:13:53,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-03 23:13:59,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:14:01,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:14:01,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-03 23:14:04,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1440773.3333333333, ans=0.1 2023-10-03 23:14:07,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:14:11,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:14:14,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:14:18,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:14:20,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:14:20,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-03 23:14:22,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-03 23:14:23,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-03 23:14:25,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1440840.0, ans=0.5 2023-10-03 23:14:26,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:14:26,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:14:28,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-03 23:14:28,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:14:29,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:14:29,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:14:29,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-03 23:14:29,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1440840.0, ans=0.0 2023-10-03 23:14:31,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-03 23:14:34,307 INFO [train.py:1046] (3/4) Epoch 41, batch 3650, loss[loss=0.1752, simple_loss=0.2459, pruned_loss=0.05225, over 23978.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.236, pruned_loss=0.03821, over 4715446.80 frames. ], batch size: 196, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:14:34,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:14:34,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-03 23:14:40,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-03 23:14:40,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:14:42,235 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.26 vs. limit=10.0 2023-10-03 23:14:43,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-03 23:14:45,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-03 23:14:46,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1440906.6666666667, ans=0.0 2023-10-03 23:14:50,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:14:50,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:14:51,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:14:53,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-03 23:14:55,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:14:55,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-03 23:14:56,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:14:56,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:14:56,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-03 23:14:57,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 23:14:59,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:14:59,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:14:59,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:15:01,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-03 23:15:01,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1440973.3333333333, ans=0.125 2023-10-03 23:15:04,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-03 23:15:06,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:15:06,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1441040.0, ans=0.125 2023-10-03 23:15:07,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-03 23:15:07,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1441040.0, ans=0.125 2023-10-03 23:15:08,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:15:08,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:15:13,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:15:13,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1441040.0, ans=0.0 2023-10-03 23:15:15,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:15:15,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:15:16,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:15:17,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:15:20,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:15:21,397 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.90 vs. limit=12.0 2023-10-03 23:15:23,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:15:24,912 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.947e+02 2.181e+02 2.506e+02 4.142e+02, threshold=4.361e+02, percent-clipped=1.0 2023-10-03 23:15:25,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:15:25,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:15:27,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:15:29,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:15:29,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:15:35,321 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-03 23:15:38,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:15:38,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:15:40,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:15:41,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:15:41,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1441173.3333333333, ans=0.125 2023-10-03 23:15:42,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:15:42,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:15:44,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-03 23:15:44,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:15:44,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1441173.3333333333, ans=0.125 2023-10-03 23:15:46,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:15:47,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:15:49,084 INFO [train.py:1046] (3/4) Epoch 41, batch 3700, loss[loss=0.1645, simple_loss=0.2504, pruned_loss=0.03935, over 23979.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2373, pruned_loss=0.03878, over 4713385.29 frames. ], batch size: 86, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:15:49,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:15:51,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:15:51,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-03 23:15:51,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:15:51,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:15:51,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:15:56,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:15:59,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:15:59,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:16:00,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:16:00,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:16:01,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 23:16:05,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:16:08,199 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-03 23:16:14,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:16:14,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:16:15,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1441306.6666666667, ans=0.0 2023-10-03 23:16:17,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:16:17,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-03 23:16:17,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:16:21,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:21,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-03 23:16:22,386 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.37 vs. limit=6.0 2023-10-03 23:16:23,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:23,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1441373.3333333333, ans=0.0 2023-10-03 23:16:24,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:16:27,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:28,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:16:30,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:16:33,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1441440.0, ans=0.125 2023-10-03 23:16:34,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:16:34,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-03 23:16:34,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:16:34,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-03 23:16:38,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1441440.0, ans=0.0 2023-10-03 23:16:41,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:16:41,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:16:43,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1441440.0, ans=0.1 2023-10-03 23:16:44,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:16:44,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-03 23:16:45,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:16:45,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:16:47,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:16:47,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:16:50,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:16:51,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-03 23:16:52,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-03 23:16:54,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:16:54,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:16:55,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:16:56,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:16:58,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:16:59,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:17:01,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:17:03,053 INFO [train.py:1046] (3/4) Epoch 41, batch 3750, loss[loss=0.1812, simple_loss=0.2489, pruned_loss=0.0568, over 23414.00 frames. ], tot_loss[loss=0.159, simple_loss=0.2387, pruned_loss=0.03965, over 4701884.48 frames. ], batch size: 285, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:17:04,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-03 23:17:04,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 23:17:07,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:17:07,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-03 23:17:08,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1441573.3333333333, ans=0.125 2023-10-03 23:17:09,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:17:11,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:17:12,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:17:13,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:17:16,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:17:20,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:17:21,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:17:22,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:17:23,747 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-10-03 23:17:25,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:17:25,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-03 23:17:27,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:17:28,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:17:28,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:17:29,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-03 23:17:34,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-03 23:17:34,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:17:35,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.69 vs. limit=6.0 2023-10-03 23:17:36,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:17:38,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:17:42,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:17:45,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-03 23:17:47,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.48 vs. limit=12.0 2023-10-03 23:17:49,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-03 23:17:52,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:17:52,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1441773.3333333333, ans=0.1 2023-10-03 23:17:52,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1441773.3333333333, ans=0.2 2023-10-03 23:17:55,241 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.936e+02 2.132e+02 2.338e+02 3.284e+02, threshold=4.264e+02, percent-clipped=0.0 2023-10-03 23:17:56,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:17:56,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:17:59,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:18:04,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-03 23:18:07,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:18:08,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1441840.0, ans=0.1 2023-10-03 23:18:09,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:18:10,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:18:12,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:18:17,517 INFO [train.py:1046] (3/4) Epoch 41, batch 3800, loss[loss=0.1532, simple_loss=0.2246, pruned_loss=0.04091, over 23803.00 frames. ], tot_loss[loss=0.158, simple_loss=0.238, pruned_loss=0.03902, over 4706752.18 frames. ], batch size: 179, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:18:20,115 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.91 vs. limit=15.0 2023-10-03 23:18:22,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:18:25,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:18:26,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-03 23:18:26,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-03 23:18:27,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:18:29,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:18:30,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1441906.6666666667, ans=15.0 2023-10-03 23:18:30,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-03 23:18:32,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 23:18:32,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:18:33,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:18:35,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:18:35,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:18:35,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:18:36,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-03 23:18:41,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-03 23:18:41,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:18:42,972 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:18:44,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:18:47,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:18:47,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:18:50,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-03 23:18:50,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:18:51,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:18:52,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1442040.0, ans=0.125 2023-10-03 23:18:53,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:18:56,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 23:18:56,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-03 23:19:00,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:19:07,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:19:13,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:19:14,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-03 23:19:16,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-03 23:19:17,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:19:18,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:19:19,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:21,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-03 23:19:23,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-03 23:19:24,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-03 23:19:24,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:25,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:19:30,865 INFO [train.py:1046] (3/4) Epoch 41, batch 3850, loss[loss=0.1482, simple_loss=0.2147, pruned_loss=0.04083, over 23546.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.236, pruned_loss=0.03847, over 4709350.77 frames. ], batch size: 256, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:19:32,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:19:32,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:19:36,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:19:36,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1442240.0, ans=0.0 2023-10-03 23:19:39,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-03 23:19:39,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:19:41,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:44,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:19:44,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1442306.6666666667, ans=0.2 2023-10-03 23:19:46,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:19:46,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1442306.6666666667, ans=0.125 2023-10-03 23:19:48,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-03 23:19:48,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-03 23:19:55,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:19:55,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1442306.6666666667, ans=0.125 2023-10-03 23:19:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:19:59,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:19:59,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:20:02,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:03,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:20:03,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:03,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:20:04,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:05,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:07,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:07,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:20:08,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-03 23:20:08,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-03 23:20:10,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:20:11,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:13,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:14,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:14,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-03 23:20:16,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-03 23:20:17,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:20,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-03 23:20:21,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-03 23:20:23,806 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.88 vs. limit=15.0 2023-10-03 23:20:24,013 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.998e+02 2.145e+02 2.490e+02 4.261e+02, threshold=4.290e+02, percent-clipped=0.0 2023-10-03 23:20:26,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:28,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:20:31,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:31,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1442506.6666666667, ans=0.2 2023-10-03 23:20:32,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-03 23:20:34,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-03 23:20:34,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1442506.6666666667, ans=10.0 2023-10-03 23:20:37,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:38,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:39,073 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.69 vs. limit=15.0 2023-10-03 23:20:42,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:20:42,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:20:42,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:43,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:43,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:20:43,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-03 23:20:45,648 INFO [train.py:1046] (3/4) Epoch 41, batch 3900, loss[loss=0.1439, simple_loss=0.2259, pruned_loss=0.0309, over 24667.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2356, pruned_loss=0.03813, over 4710600.27 frames. ], batch size: 65, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:20:45,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:20:47,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-03 23:20:47,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:47,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:49,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:20:49,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:51,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:20:51,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:20:51,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:20:53,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:20:53,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-03 23:20:54,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:20:56,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:20:57,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:20:59,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:21:00,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:21:03,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:21:03,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:21:03,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.62 vs. limit=10.0 2023-10-03 23:21:04,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:21:06,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-03 23:21:06,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:21:08,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-03 23:21:08,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:21:08,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-03 23:21:10,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-03 23:21:15,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:21:16,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:21:16,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:21:16,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:21:22,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:21:24,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:21:25,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:21:25,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:21:27,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:21:32,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:21:32,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:21:40,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:21:42,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:21:50,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:21:50,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1442840.0, ans=0.09899494936611666 2023-10-03 23:21:51,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:21:53,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-03 23:21:53,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-03 23:21:53,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:21:55,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-03 23:21:56,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:21:58,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-03 23:21:58,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1442906.6666666667, ans=0.1 2023-10-03 23:21:59,419 INFO [train.py:1046] (3/4) Epoch 41, batch 3950, loss[loss=0.1547, simple_loss=0.2276, pruned_loss=0.04091, over 23649.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2347, pruned_loss=0.03796, over 4702958.31 frames. ], batch size: 256, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:22:02,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:22:03,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-03 23:22:03,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:22:07,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:22:07,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.75 vs. limit=6.0 2023-10-03 23:22:08,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:22:09,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.18 vs. limit=15.0 2023-10-03 23:22:11,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1442906.6666666667, ans=0.0 2023-10-03 23:22:12,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1442973.3333333333, ans=0.2 2023-10-03 23:22:14,133 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-03 23:22:16,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:22:16,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-03 23:22:16,090 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-03 23:22:17,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:22:20,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:22:20,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:22:20,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:22:20,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1442973.3333333333, ans=0.0 2023-10-03 23:22:22,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-03 23:22:26,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:22:28,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:22:28,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:22:28,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:22:29,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:22:38,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:22:38,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:22:42,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-03 23:22:45,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1443106.6666666667, ans=0.0 2023-10-03 23:22:47,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-03 23:22:47,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-03 23:22:47,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:22:48,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:22:51,419 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.901e+02 2.096e+02 2.372e+02 3.248e+02, threshold=4.191e+02, percent-clipped=0.0 2023-10-03 23:22:56,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:22:56,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:22:58,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:22:59,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:22:59,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-03 23:23:02,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:23:03,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:23:04,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1443173.3333333333, ans=0.0 2023-10-03 23:23:06,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-03 23:23:07,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1443173.3333333333, ans=0.125 2023-10-03 23:23:11,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1443173.3333333333, ans=0.1 2023-10-03 23:23:13,716 INFO [train.py:1046] (3/4) Epoch 41, batch 4000, loss[loss=0.1594, simple_loss=0.2344, pruned_loss=0.04226, over 23751.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2356, pruned_loss=0.03789, over 4724859.76 frames. ], batch size: 164, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:23:15,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:23:18,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1443240.0, ans=0.0 2023-10-03 23:23:22,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:23:22,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1443240.0, ans=0.2 2023-10-03 23:23:26,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:23:28,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:23:28,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:23:28,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-03 23:23:30,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:23:30,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-03 23:23:30,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:23:30,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-03 23:23:31,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1443306.6666666667, ans=0.125 2023-10-03 23:23:33,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:23:35,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:23:35,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:23:35,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:23:35,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:23:35,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:23:38,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:23:40,373 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-03 23:23:41,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:23:42,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:23:44,988 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-03 23:23:46,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:23:46,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:23:53,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-03 23:23:54,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:23:57,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:23:58,015 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-03 23:24:00,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:24:00,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-03 23:24:00,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:24:02,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:24:02,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1443440.0, ans=0.125 2023-10-03 23:24:03,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:24:04,178 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:24:05,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:24:05,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:24:05,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:24:06,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-03 23:24:06,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:24:08,426 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-03 23:24:09,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1443440.0, ans=0.07 2023-10-03 23:24:14,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:24:15,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-03 23:24:17,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:24:17,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:24:17,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:24:19,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:24:19,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1443506.6666666667, ans=0.1 2023-10-03 23:24:25,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:24:26,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:24:26,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-03 23:24:28,132 INFO [train.py:1046] (3/4) Epoch 41, batch 4050, loss[loss=0.1693, simple_loss=0.2434, pruned_loss=0.04763, over 23677.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2362, pruned_loss=0.03805, over 4707833.55 frames. ], batch size: 232, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:24:28,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1443573.3333333333, ans=0.0 2023-10-03 23:24:30,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:24:30,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:24:32,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:24:32,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:24:34,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:24:35,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:24:36,351 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=15.24 vs. limit=15.0 2023-10-03 23:24:36,816 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.47 vs. limit=22.5 2023-10-03 23:24:39,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:24:40,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-03 23:24:41,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:24:43,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:24:43,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.27 vs. limit=15.0 2023-10-03 23:24:46,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:24:46,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1443640.0, ans=0.2 2023-10-03 23:24:48,108 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.56 vs. limit=15.0 2023-10-03 23:24:48,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:24:51,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-03 23:24:53,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-03 23:24:53,441 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-03 23:24:56,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:25:02,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1443706.6666666667, ans=0.07 2023-10-03 23:25:03,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-03 23:25:03,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:25:08,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:25:12,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:25:12,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:25:12,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:25:17,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:25:21,089 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.958e+02 2.133e+02 2.374e+02 3.448e+02, threshold=4.266e+02, percent-clipped=0.0 2023-10-03 23:25:21,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-03 23:25:21,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 23:25:21,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:25:22,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-03 23:25:26,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:25:33,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-03 23:25:35,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:25:35,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:25:38,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-03 23:25:38,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-03 23:25:38,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:25:38,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1443840.0, ans=0.125 2023-10-03 23:25:39,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:25:40,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:25:40,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:25:42,259 INFO [train.py:1046] (3/4) Epoch 41, batch 4100, loss[loss=0.1432, simple_loss=0.2251, pruned_loss=0.03064, over 24294.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2367, pruned_loss=0.03798, over 4696930.65 frames. ], batch size: 61, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:25:48,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-03 23:25:49,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-03 23:25:52,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-03 23:25:53,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-03 23:25:53,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:25:53,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:25:53,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:25:55,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:25:55,303 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-03 23:25:58,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:25:58,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:25:58,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:26:00,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:26:04,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:26:04,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1443973.3333333333, ans=0.07 2023-10-03 23:26:05,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:26:05,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:26:05,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-03 23:26:07,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:26:07,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:26:07,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:26:07,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:26:07,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-03 23:26:13,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:26:13,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-03 23:26:14,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1444040.0, ans=0.0 2023-10-03 23:26:16,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:26:18,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:26:18,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-03 23:26:18,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:26:20,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:26:20,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:26:23,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-03 23:26:23,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:26:24,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:26:26,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-03 23:26:26,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:26:28,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:26:30,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:26:35,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:26:39,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:26:39,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:26:44,160 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.97 vs. limit=15.0 2023-10-03 23:26:44,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1444173.3333333333, ans=0.1 2023-10-03 23:26:47,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:26:47,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:26:50,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:26:53,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:26:55,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1444240.0, ans=0.125 2023-10-03 23:26:56,963 INFO [train.py:1046] (3/4) Epoch 41, batch 4150, loss[loss=0.1657, simple_loss=0.2491, pruned_loss=0.04117, over 24578.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2363, pruned_loss=0.03801, over 4715786.32 frames. ], batch size: 71, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:26:57,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:26:58,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:26:59,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:26:59,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:27:01,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1444240.0, ans=0.2 2023-10-03 23:27:03,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-03 23:27:03,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:27:03,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-03 23:27:04,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-03 23:27:04,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-03 23:27:06,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:27:07,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1444240.0, ans=0.0 2023-10-03 23:27:11,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:27:11,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:27:15,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:27:15,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:27:17,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:27:19,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:27:20,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:27:20,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:27:23,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:27:26,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:27:28,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-03 23:27:30,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-03 23:27:30,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:27:32,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-03 23:27:32,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:27:32,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:27:35,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:27:35,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1444373.3333333333, ans=0.2 2023-10-03 23:27:36,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:27:40,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-03 23:27:40,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1444440.0, ans=0.0 2023-10-03 23:27:43,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:27:45,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:27:45,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-03 23:27:45,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:27:46,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-03 23:27:48,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:27:49,674 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 1.998e+02 2.194e+02 2.506e+02 4.254e+02, threshold=4.388e+02, percent-clipped=0.0 2023-10-03 23:27:49,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:27:51,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:27:51,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-03 23:27:51,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:27:51,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-03 23:27:55,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:27:56,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-03 23:27:57,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:27:57,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:27:57,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:27:58,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-03 23:27:59,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:27:59,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-03 23:28:00,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:28:02,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:28:02,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-03 23:28:03,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-03 23:28:04,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1444506.6666666667, ans=0.1 2023-10-03 23:28:08,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:28:11,388 INFO [train.py:1046] (3/4) Epoch 41, batch 4200, loss[loss=0.1649, simple_loss=0.2501, pruned_loss=0.03991, over 23999.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2352, pruned_loss=0.03787, over 4704080.65 frames. ], batch size: 86, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:28:11,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-03 23:28:12,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:28:14,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:28:15,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:28:17,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:28:17,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:28:21,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-03 23:28:24,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-03 23:28:24,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:27,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:28:29,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:28:33,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:28:33,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:28:33,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:33,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1444640.0, ans=0.125 2023-10-03 23:28:35,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-03 23:28:35,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:28:36,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:36,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:28:36,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:28:38,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:28:40,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-03 23:28:40,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:28:45,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:28:46,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:28:48,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:28:48,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:28:49,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:28:49,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-03 23:28:49,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:28:51,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:28:56,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:28:58,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:28:59,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1444773.3333333333, ans=0.1 2023-10-03 23:29:03,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:29:06,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-03 23:29:09,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:29:15,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 23:29:15,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:29:16,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-03 23:29:23,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:29:24,832 INFO [train.py:1046] (3/4) Epoch 41, batch 4250, loss[loss=0.1559, simple_loss=0.246, pruned_loss=0.03294, over 24588.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2348, pruned_loss=0.03747, over 4717022.19 frames. ], batch size: 71, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:29:25,237 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:29:27,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1444906.6666666667, ans=0.125 2023-10-03 23:29:27,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1444906.6666666667, ans=0.1 2023-10-03 23:29:28,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:29:28,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:29:29,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:29:35,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:29:36,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-03 23:29:36,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:29:37,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1444906.6666666667, ans=0.125 2023-10-03 23:29:39,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:29:42,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:29:47,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:29:47,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:29:49,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:29:49,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:29:51,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:29:51,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:29:51,818 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.47 vs. limit=6.0 2023-10-03 23:29:53,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:29:53,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:29:56,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:29:56,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-03 23:29:59,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-03 23:29:59,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:29:59,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:30:01,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:30:01,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:30:01,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:30:01,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:30:04,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-03 23:30:04,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:30:08,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:30:09,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:30:09,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1445106.6666666667, ans=0.0 2023-10-03 23:30:10,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-03 23:30:10,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:30:10,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1445106.6666666667, ans=0.1 2023-10-03 23:30:12,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-03 23:30:13,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:30:14,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:30:14,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:30:15,521 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.36 vs. limit=22.5 2023-10-03 23:30:16,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:30:17,405 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.864e+02 1.993e+02 2.259e+02 3.017e+02, threshold=3.986e+02, percent-clipped=0.0 2023-10-03 23:30:17,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-03 23:30:19,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:30:20,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:30:25,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:30:27,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:30:28,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:30:28,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:30:28,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1445173.3333333333, ans=0.05 2023-10-03 23:30:29,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:30:32,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:30:32,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:30:32,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-03 23:30:34,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:30:39,187 INFO [train.py:1046] (3/4) Epoch 41, batch 4300, loss[loss=0.1595, simple_loss=0.2325, pruned_loss=0.04324, over 23818.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2345, pruned_loss=0.03757, over 4714435.99 frames. ], batch size: 213, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:30:39,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:30:39,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1445240.0, ans=0.125 2023-10-03 23:30:40,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:30:43,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1445240.0, ans=15.0 2023-10-03 23:30:43,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:30:49,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1445240.0, ans=0.125 2023-10-03 23:30:50,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:30:50,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-03 23:30:52,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:30:52,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1445306.6666666667, ans=0.1 2023-10-03 23:30:53,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:30:53,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1445306.6666666667, ans=0.125 2023-10-03 23:30:55,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:30:55,431 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-03 23:30:58,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:30:59,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:31:01,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-03 23:31:01,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:31:01,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1445306.6666666667, ans=0.2 2023-10-03 23:31:02,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-03 23:31:05,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:31:06,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:31:09,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:31:09,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:31:10,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:31:11,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=1445373.3333333333, ans=0.05 2023-10-03 23:31:13,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:31:13,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:31:13,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-03 23:31:14,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-03 23:31:15,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:31:18,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:18,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:31:18,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:18,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1445373.3333333333, ans=0.125 2023-10-03 23:31:20,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:31:20,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-03 23:31:20,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-03 23:31:21,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-03 23:31:23,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:31:23,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-03 23:31:23,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-03 23:31:27,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:31:29,175 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-03 23:31:29,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:31:29,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1445440.0, ans=0.125 2023-10-03 23:31:30,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:31:31,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:31:35,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-03 23:31:35,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:31:35,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:35,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:31:36,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:31:36,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:31:39,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:31:42,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:31:43,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:31:43,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:31:46,129 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.55 vs. limit=22.5 2023-10-03 23:31:48,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-03 23:31:49,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:31:52,855 INFO [train.py:1046] (3/4) Epoch 41, batch 4350, loss[loss=0.155, simple_loss=0.2348, pruned_loss=0.03758, over 22339.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2357, pruned_loss=0.0382, over 4696910.88 frames. ], batch size: 49, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:31:54,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:31:56,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:31:58,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:31:58,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:32:05,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:32:10,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:32:13,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:32:13,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:32:14,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:32:16,863 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.09 vs. limit=15.0 2023-10-03 23:32:17,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:32:18,263 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.75 vs. limit=15.0 2023-10-03 23:32:19,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:32:23,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-03 23:32:24,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:32:24,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:29,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:30,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1445706.6666666667, ans=0.1 2023-10-03 23:32:32,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-03 23:32:36,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:32:38,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:32:42,511 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-03 23:32:44,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:32:44,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:32:44,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1445773.3333333333, ans=0.0 2023-10-03 23:32:45,480 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.734e+02 1.921e+02 2.106e+02 2.345e+02 3.480e+02, threshold=4.212e+02, percent-clipped=0.0 2023-10-03 23:32:45,620 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-03 23:32:46,921 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-03 23:32:46,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:32:46,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:32:47,192 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:32:48,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:32:49,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:32:49,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:32:51,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:32:53,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-03 23:32:53,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:53,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:32:55,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:32:55,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-03 23:32:56,642 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-03 23:32:56,646 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-03 23:32:56,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-03 23:33:00,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:33:00,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:33:01,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:01,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:33:02,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-03 23:33:04,106 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-03 23:33:05,331 INFO [train.py:1046] (3/4) Epoch 41, batch 4400, loss[loss=0.1426, simple_loss=0.2175, pruned_loss=0.03388, over 23586.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2358, pruned_loss=0.03801, over 4717826.73 frames. ], batch size: 120, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:33:05,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:08,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:33:08,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:13,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:33:14,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1445906.6666666667, ans=0.125 2023-10-03 23:33:15,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-03 23:33:15,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-03 23:33:17,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-03 23:33:17,411 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-03 23:33:17,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1445906.6666666667, ans=0.0 2023-10-03 23:33:19,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:33:19,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:33:22,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-03 23:33:23,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:23,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:23,765 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-03 23:33:27,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:27,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-03 23:33:27,840 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-03 23:33:31,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-03 23:33:31,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-03 23:33:32,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-03 23:33:33,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:33,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:33:33,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:33:35,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:33:36,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-03 23:33:36,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-03 23:33:36,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1446040.0, ans=0.05 2023-10-03 23:33:37,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:39,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:33:39,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:33:41,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:41,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:33:41,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-03 23:33:42,715 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-03 23:33:46,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:33:54,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:33:56,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-03 23:33:57,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1446106.6666666667, ans=0.0 2023-10-03 23:34:00,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:34:02,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:34:05,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:34:05,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-03 23:34:05,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:34:05,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:34:05,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:34:06,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:34:10,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-03 23:34:13,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-03 23:34:14,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-03 23:34:14,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:34:14,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-03 23:34:14,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:34:17,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:34:18,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-03 23:34:20,053 INFO [train.py:1046] (3/4) Epoch 41, batch 4450, loss[loss=0.1629, simple_loss=0.2375, pruned_loss=0.04416, over 23491.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.236, pruned_loss=0.03786, over 4715736.76 frames. ], batch size: 256, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:34:24,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:34:25,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:34:25,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:34:33,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:34:33,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:34:35,047 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.72 vs. limit=10.0 2023-10-03 23:34:37,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:34:38,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:34:40,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:34:40,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:34:42,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-03 23:34:42,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:34:42,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:34:42,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:34:42,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:34:44,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:34:50,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:34:50,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:34:52,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:34:53,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:34:55,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:35:00,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 23:35:01,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-03 23:35:01,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-03 23:35:01,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:35:02,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:35:04,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-03 23:35:08,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:35:11,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:35:12,364 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 2.032e+02 2.316e+02 2.697e+02 5.602e+02, threshold=4.632e+02, percent-clipped=2.0 2023-10-03 23:35:12,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-03 23:35:12,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:35:12,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:35:12,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:35:13,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:35:13,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1446440.0, ans=0.04949747468305833 2023-10-03 23:35:14,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1446440.0, ans=0.0 2023-10-03 23:35:15,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:35:19,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:35:20,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-03 23:35:20,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1446506.6666666667, ans=0.125 2023-10-03 23:35:21,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:35:25,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:35:25,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:35:26,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:35:26,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:35:28,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:35:31,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-03 23:35:32,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:35:33,933 INFO [train.py:1046] (3/4) Epoch 41, batch 4500, loss[loss=0.1708, simple_loss=0.248, pruned_loss=0.04676, over 23798.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.237, pruned_loss=0.03814, over 4716995.25 frames. ], batch size: 179, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:35:35,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:35:36,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-03 23:35:36,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-03 23:35:38,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:35:42,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:35:43,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:35:43,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:35:44,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:35:44,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:35:44,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1446573.3333333333, ans=0.0 2023-10-03 23:35:46,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:35:57,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:35:59,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:35:59,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1446640.0, ans=0.2 2023-10-03 23:36:01,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:36:02,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:36:03,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:36:08,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:36:12,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:36:13,909 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:36:15,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:36:18,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:36:18,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-03 23:36:19,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:36:19,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:36:22,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:36:24,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:36:24,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1446773.3333333333, ans=0.2 2023-10-03 23:36:25,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:36:25,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-03 23:36:25,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:36:26,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:36:30,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:36:30,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:36:34,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:36:36,296 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.98 vs. limit=22.5 2023-10-03 23:36:37,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:36:37,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:36:38,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-03 23:36:40,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-03 23:36:40,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-03 23:36:42,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-03 23:36:46,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-03 23:36:46,925 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-03 23:36:47,885 INFO [train.py:1046] (3/4) Epoch 41, batch 4550, loss[loss=0.1598, simple_loss=0.2482, pruned_loss=0.03568, over 23724.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.236, pruned_loss=0.03813, over 4713754.97 frames. ], batch size: 85, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:36:49,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:36:51,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:36:52,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:36:55,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:37:01,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:37:02,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:37:04,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:37:04,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:37:04,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:07,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:37:07,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:37:09,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:37:10,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1446973.3333333333, ans=0.125 2023-10-03 23:37:12,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-03 23:37:12,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-03 23:37:14,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:37:17,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-03 23:37:19,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-03 23:37:19,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:37:20,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1447040.0, ans=0.125 2023-10-03 23:37:21,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-03 23:37:25,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:37:29,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:29,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:29,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:37:31,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-03 23:37:32,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:37:34,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:34,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:37:35,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:37:36,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-03 23:37:38,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-03 23:37:38,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:37:39,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-03 23:37:40,780 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 2.077e+02 2.278e+02 2.553e+02 3.904e+02, threshold=4.555e+02, percent-clipped=0.0 2023-10-03 23:37:40,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-03 23:37:40,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:37:43,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:37:43,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:37:45,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:45,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:37:46,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:37:46,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-03 23:37:48,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:37:48,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 23:37:50,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-03 23:37:50,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:37:51,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-03 23:37:53,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:37:53,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:37:56,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:37:56,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:37:56,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:37:59,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:37:59,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:38:02,614 INFO [train.py:1046] (3/4) Epoch 41, batch 4600, loss[loss=0.1524, simple_loss=0.2211, pruned_loss=0.04182, over 23684.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2348, pruned_loss=0.03796, over 4708380.46 frames. ], batch size: 232, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:38:04,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:05,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:38:08,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:38:08,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:38:09,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:38:09,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-03 23:38:09,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1447240.0, ans=0.0 2023-10-03 23:38:10,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:38:13,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:38:13,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:38:15,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1447306.6666666667, ans=0.0 2023-10-03 23:38:17,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:24,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-03 23:38:25,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1447306.6666666667, ans=0.2 2023-10-03 23:38:26,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:29,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:30,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:38:30,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:38:35,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-03 23:38:35,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:38:35,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1447373.3333333333, ans=0.125 2023-10-03 23:38:36,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:38:40,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:38:41,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1447373.3333333333, ans=0.125 2023-10-03 23:38:42,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:38:42,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1447373.3333333333, ans=0.125 2023-10-03 23:38:43,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:38:48,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-03 23:38:49,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:38:53,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1447440.0, ans=0.125 2023-10-03 23:38:53,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1447440.0, ans=0.125 2023-10-03 23:38:54,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:38:56,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:38:58,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:38:58,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-03 23:38:58,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:39:00,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-03 23:39:00,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:02,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:39:02,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:03,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:39:04,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:39:05,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1447506.6666666667, ans=0.0 2023-10-03 23:39:06,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-03 23:39:06,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-03 23:39:06,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-03 23:39:06,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:39:07,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:39:07,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:39:09,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:39:13,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1447506.6666666667, ans=0.125 2023-10-03 23:39:16,068 INFO [train.py:1046] (3/4) Epoch 41, batch 4650, loss[loss=0.1565, simple_loss=0.2472, pruned_loss=0.03286, over 24456.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2336, pruned_loss=0.03755, over 4695797.89 frames. ], batch size: 69, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:39:18,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:39:21,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1447573.3333333333, ans=0.1 2023-10-03 23:39:22,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:39:22,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:39:22,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:39:23,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:39:23,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:39:25,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:39:26,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-03 23:39:31,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:39:32,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-03 23:39:32,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:39:34,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-03 23:39:34,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:39:34,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-03 23:39:34,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-03 23:39:34,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:35,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:39:38,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:39:38,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1447640.0, ans=0.2 2023-10-03 23:39:39,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:39:40,015 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-03 23:39:44,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:39:45,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-03 23:39:46,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:39:47,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:39:48,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-03 23:39:50,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:39:52,350 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.41 vs. limit=15.0 2023-10-03 23:39:52,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:39:58,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:02,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:40:06,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:40:06,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:40:06,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:40:08,750 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 1.906e+02 2.155e+02 2.545e+02 4.224e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-03 23:40:08,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-03 23:40:08,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-03 23:40:10,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-03 23:40:10,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-03 23:40:11,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:40:14,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=1447840.0, ans=0.025 2023-10-03 23:40:17,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:40:17,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:40:17,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-03 23:40:17,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:19,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:40:19,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:40:20,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:40:22,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:40:22,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:40:22,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:40:26,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:40:26,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:40:26,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:40:28,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-03 23:40:29,396 INFO [train.py:1046] (3/4) Epoch 41, batch 4700, loss[loss=0.1751, simple_loss=0.255, pruned_loss=0.04765, over 24012.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.235, pruned_loss=0.03768, over 4708032.09 frames. ], batch size: 80, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:40:29,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:40:30,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-03 23:40:38,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:39,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:40:41,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:40:41,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:40:43,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-03 23:40:47,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-03 23:40:48,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-03 23:40:50,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:51,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:40:51,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:40:53,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:40:59,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:41:00,128 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.50 vs. limit=22.5 2023-10-03 23:41:00,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-03 23:41:04,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:41:09,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-03 23:41:10,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:41:12,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:17,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-03 23:41:19,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:41:21,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.64 vs. limit=22.5 2023-10-03 23:41:22,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:41:24,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-03 23:41:26,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:26,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:41:28,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:41:28,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:41:30,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-03 23:41:32,148 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-03 23:41:33,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:41:36,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:36,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:36,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-03 23:41:37,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:41:37,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1448173.3333333333, ans=0.125 2023-10-03 23:41:39,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-03 23:41:43,430 INFO [train.py:1046] (3/4) Epoch 41, batch 4750, loss[loss=0.1541, simple_loss=0.2351, pruned_loss=0.03651, over 23403.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2356, pruned_loss=0.03814, over 4694466.93 frames. ], batch size: 119, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:41:43,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:41:43,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:41:46,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:41:47,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:41:49,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-03 23:41:49,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:41:53,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-03 23:41:55,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1448240.0, ans=0.0 2023-10-03 23:41:55,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1448240.0, ans=0.125 2023-10-03 23:41:56,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:41:56,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:41:57,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:42:02,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-03 23:42:05,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:42:08,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-03 23:42:08,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:42:10,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1448306.6666666667, ans=0.0 2023-10-03 23:42:11,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:42:11,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:42:12,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:42:14,276 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-03 23:42:14,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-03 23:42:18,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-03 23:42:20,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:42:23,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:42:23,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1448373.3333333333, ans=0.05 2023-10-03 23:42:25,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:42:25,972 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-03 23:42:25,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:42:28,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:42:30,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:42:33,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-03 23:42:33,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-03 23:42:34,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:42:34,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:42:34,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:42:35,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1448440.0, ans=0.125 2023-10-03 23:42:36,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:42:36,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-03 23:42:38,093 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.957e+02 2.175e+02 2.344e+02 3.042e+02, threshold=4.350e+02, percent-clipped=0.0 2023-10-03 23:42:38,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1448440.0, ans=0.05 2023-10-03 23:42:39,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-03 23:42:40,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:42:43,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:42:43,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-03 23:42:44,703 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.95 vs. limit=15.0 2023-10-03 23:42:45,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:42:45,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:42:46,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:42:47,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:42:49,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-03 23:42:52,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:42:53,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-03 23:42:53,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-03 23:42:53,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-03 23:42:55,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:42:55,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1448573.3333333333, ans=0.0 2023-10-03 23:42:56,573 INFO [train.py:1046] (3/4) Epoch 41, batch 4800, loss[loss=0.1586, simple_loss=0.2271, pruned_loss=0.04507, over 23766.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2367, pruned_loss=0.0383, over 4710675.27 frames. ], batch size: 179, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:42:56,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:42:56,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-03 23:43:03,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:04,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1448573.3333333333, ans=0.2 2023-10-03 23:43:05,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:11,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:43:12,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:43:13,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:13,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-03 23:43:14,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:43:14,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:43:16,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:43:20,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:22,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:43:22,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:43:23,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:43:23,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-03 23:43:23,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:24,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:43:27,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:43:28,703 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.56 vs. limit=15.0 2023-10-03 23:43:30,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:30,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:43:30,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:43:31,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-03 23:43:32,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1448706.6666666667, ans=0.125 2023-10-03 23:43:33,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:35,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-03 23:43:35,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-03 23:43:36,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:36,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:43:36,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:43:38,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:43:38,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:43:38,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:43:39,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:43:45,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:43:46,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:48,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:43:48,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1448773.3333333333, ans=0.2 2023-10-03 23:43:52,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-03 23:43:52,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:43:52,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:52,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:43:54,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:43:55,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:43:57,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:43:57,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:43:58,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:43:58,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:43:58,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:44:01,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:44:01,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:44:01,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:44:04,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-03 23:44:04,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1448840.0, ans=0.2 2023-10-03 23:44:06,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-03 23:44:06,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:44:06,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:44:07,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:44:07,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:44:10,451 INFO [train.py:1046] (3/4) Epoch 41, batch 4850, loss[loss=0.161, simple_loss=0.2501, pruned_loss=0.03589, over 24376.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2371, pruned_loss=0.03844, over 4713263.45 frames. ], batch size: 77, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:44:10,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:44:19,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-03 23:44:20,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:44:26,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:44:27,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-03 23:44:27,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:44:30,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:44:31,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:44:33,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:44:33,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-03 23:44:36,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:44:39,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:44:39,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-03 23:44:41,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:44:41,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-03 23:44:45,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:44:45,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:44:48,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:44:48,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-03 23:44:48,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-03 23:44:50,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:44:59,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:44:59,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-03 23:45:00,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:45:01,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:45:02,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1449106.6666666667, ans=0.125 2023-10-03 23:45:03,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:45:04,521 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.005e+02 2.187e+02 2.519e+02 4.301e+02, threshold=4.375e+02, percent-clipped=0.0 2023-10-03 23:45:05,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-03 23:45:05,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:45:05,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-03 23:45:05,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:45:06,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:45:07,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-03 23:45:17,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:45:22,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:45:22,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:45:22,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1449240.0, ans=0.1 2023-10-03 23:45:24,434 INFO [train.py:1046] (3/4) Epoch 41, batch 4900, loss[loss=0.1505, simple_loss=0.217, pruned_loss=0.04197, over 23804.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2355, pruned_loss=0.03814, over 4704398.87 frames. ], batch size: 212, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:45:28,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-03 23:45:28,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:45:34,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:45:34,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:45:34,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:45:37,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-03 23:45:40,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-03 23:45:45,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-03 23:45:47,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-03 23:45:47,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:45:47,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:45:47,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:45:47,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:45:48,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:45:48,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-03 23:45:50,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1449306.6666666667, ans=0.025 2023-10-03 23:45:50,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1449306.6666666667, ans=0.0 2023-10-03 23:45:51,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-03 23:45:51,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:45:53,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:45:53,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:45:57,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:45:57,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:45:58,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:45:58,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-03 23:45:59,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1449373.3333333333, ans=0.0 2023-10-03 23:46:00,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:46:01,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:46:01,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-03 23:46:01,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-03 23:46:01,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1449373.3333333333, ans=0.125 2023-10-03 23:46:06,722 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.18 vs. limit=10.0 2023-10-03 23:46:07,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-03 23:46:10,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:46:11,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:46:11,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:46:11,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:46:13,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-03 23:46:13,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:46:13,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-03 23:46:15,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:46:17,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-03 23:46:17,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.00 vs. limit=12.0 2023-10-03 23:46:18,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:46:21,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-03 23:46:21,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1449440.0, ans=0.125 2023-10-03 23:46:22,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:46:22,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-03 23:46:24,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-03 23:46:30,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:46:31,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:46:33,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-03 23:46:33,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:46:33,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:46:34,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:46:37,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:46:37,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:46:37,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:46:37,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-03 23:46:39,027 INFO [train.py:1046] (3/4) Epoch 41, batch 4950, loss[loss=0.1367, simple_loss=0.213, pruned_loss=0.03017, over 24355.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2341, pruned_loss=0.03807, over 4698953.10 frames. ], batch size: 56, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:46:39,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:46:42,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:46:42,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-03 23:46:45,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-03 23:46:45,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-03 23:46:45,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:46:46,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-03 23:46:46,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:46:46,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:46:48,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-03 23:46:48,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:46:51,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:46:51,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:46:53,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:46:54,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:46:56,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:46:57,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:47:00,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-03 23:47:05,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:47:07,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:47:07,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:47:07,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:08,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:47:11,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-03 23:47:11,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-03 23:47:14,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:17,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:47:17,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:47:19,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:47:19,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:47:20,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-03 23:47:23,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:47:25,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:47:26,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:47:28,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:47:28,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:28,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-03 23:47:28,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:47:30,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.63 vs. limit=15.0 2023-10-03 23:47:31,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:47:33,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=1449773.3333333333, ans=15.0 2023-10-03 23:47:33,936 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.903e+02 2.164e+02 2.599e+02 4.348e+02, threshold=4.328e+02, percent-clipped=0.0 2023-10-03 23:47:34,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:47:36,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:47:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:47:37,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:47:38,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:47:38,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:47:39,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:47:40,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1449840.0, ans=0.015 2023-10-03 23:47:41,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:47:41,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:47:43,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-03 23:47:47,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:47:53,057 INFO [train.py:1046] (3/4) Epoch 41, batch 5000, loss[loss=0.1661, simple_loss=0.2564, pruned_loss=0.03791, over 24467.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.234, pruned_loss=0.03757, over 4702885.23 frames. ], batch size: 69, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:47:53,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-03 23:47:53,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-03 23:47:56,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1449906.6666666667, ans=0.125 2023-10-03 23:48:00,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:48:00,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:48:00,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-03 23:48:02,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-03 23:48:04,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:48:05,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-03 23:48:07,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-03 23:48:07,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:48:07,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-03 23:48:07,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:48:08,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:48:09,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-03 23:48:09,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:48:09,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:48:11,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-03 23:48:11,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-03 23:48:13,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:48:13,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-03 23:48:13,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:48:14,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:15,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:48:15,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-03 23:48:15,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-03 23:48:17,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-03 23:48:17,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:48:18,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:19,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-03 23:48:21,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:48:21,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1450040.0, ans=0.125 2023-10-03 23:48:23,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:23,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:48:24,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-03 23:48:27,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-03 23:48:29,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:48:30,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:48:34,967 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-03 23:48:37,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:48:39,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:48:39,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:48:42,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-03 23:48:43,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:48:43,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:48:43,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:48:45,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-03 23:48:45,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1450106.6666666667, ans=0.125 2023-10-03 23:48:47,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:48:49,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-03 23:48:51,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:48:51,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1450173.3333333333, ans=0.125 2023-10-03 23:48:55,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-03 23:48:59,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:03,765 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.33 vs. limit=22.5 2023-10-03 23:49:07,309 INFO [train.py:1046] (3/4) Epoch 41, batch 5050, loss[loss=0.1509, simple_loss=0.2296, pruned_loss=0.03611, over 23769.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2348, pruned_loss=0.03774, over 4719484.10 frames. ], batch size: 149, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:49:07,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:49:08,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:08,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:49:10,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:49:10,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:49:10,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:49:10,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:12,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1450240.0, ans=0.0 2023-10-03 23:49:13,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1450240.0, ans=0.0 2023-10-03 23:49:13,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1450240.0, ans=0.0 2023-10-03 23:49:16,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:49:16,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-03 23:49:16,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1450240.0, ans=0.0 2023-10-03 23:49:18,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:49:20,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:49:22,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:49:22,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-03 23:49:23,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:49:23,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:49:25,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-03 23:49:25,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.19 vs. limit=15.0 2023-10-03 23:49:26,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-03 23:49:26,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:49:36,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-03 23:49:36,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-03 23:49:38,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:49:38,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-03 23:49:38,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:49:39,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:49:39,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:49:40,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:49:40,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-03 23:49:41,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-03 23:49:42,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:49:44,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:49:48,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:49:48,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-03 23:49:49,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:49:50,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1450440.0, ans=0.1 2023-10-03 23:49:51,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1450440.0, ans=0.1 2023-10-03 23:49:52,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-03 23:49:54,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:49:54,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:49:55,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:49:55,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:49:58,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:49:58,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:49:59,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:01,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:50:01,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:50:01,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-03 23:50:01,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:50:03,169 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 1.933e+02 2.126e+02 2.444e+02 3.244e+02, threshold=4.252e+02, percent-clipped=0.0 2023-10-03 23:50:04,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:50:06,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:50:06,195 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-03 23:50:06,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:50:07,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:50:08,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:08,986 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-03 23:50:12,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:50:12,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-03 23:50:12,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:16,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:50:18,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:18,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-03 23:50:20,657 INFO [train.py:1046] (3/4) Epoch 41, batch 5100, loss[loss=0.1572, simple_loss=0.2321, pruned_loss=0.04109, over 23622.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2352, pruned_loss=0.03777, over 4730313.27 frames. ], batch size: 256, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:50:20,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-03 23:50:23,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:50:23,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:50:23,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-03 23:50:26,325 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-03 23:50:28,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:50:31,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-03 23:50:31,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-03 23:50:31,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:50:33,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:50:35,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:50:35,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-03 23:50:35,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-03 23:50:38,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1450640.0, ans=0.125 2023-10-03 23:50:41,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:50:41,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:50:43,084 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.24 vs. limit=22.5 2023-10-03 23:50:47,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:50:49,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-03 23:50:49,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:50:50,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:50:50,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-03 23:50:53,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:50:53,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:50:53,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-03 23:50:53,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1450706.6666666667, ans=0.125 2023-10-03 23:50:56,007 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-03 23:50:56,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:50:56,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1450706.6666666667, ans=0.125 2023-10-03 23:50:56,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1450706.6666666667, ans=0.0 2023-10-03 23:50:57,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-03 23:50:57,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-03 23:50:59,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1450706.6666666667, ans=0.2 2023-10-03 23:51:00,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:51:02,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1450706.6666666667, ans=0.125 2023-10-03 23:51:05,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.04 vs. limit=15.0 2023-10-03 23:51:08,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1450773.3333333333, ans=0.125 2023-10-03 23:51:09,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:51:11,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-03 23:51:13,172 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-03 23:51:13,182 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-03 23:51:15,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-03 23:51:15,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:51:18,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-03 23:51:22,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-03 23:51:23,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-03 23:51:25,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-03 23:51:26,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-03 23:51:27,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-03 23:51:29,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-03 23:51:33,710 INFO [train.py:1046] (3/4) Epoch 41, batch 5150, loss[loss=0.1441, simple_loss=0.2294, pruned_loss=0.02937, over 24492.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2361, pruned_loss=0.03787, over 4739989.35 frames. ], batch size: 63, lr: 2.47e-03, grad_scale: 8.0 2023-10-03 23:51:33,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:51:33,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:51:33,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:51:35,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:51:35,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-03 23:51:37,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:51:37,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-03 23:51:37,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-03 23:51:38,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-03 23:51:38,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-03 23:51:38,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-03 23:51:40,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:51:40,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-03 23:51:42,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:51:44,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:51:49,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:51:49,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-03 23:51:50,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:51:52,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-03 23:51:53,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-03 23:51:53,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:51:53,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:51:54,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:51:54,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:51:56,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-03 23:51:57,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:51:57,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:51:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-03 23:51:59,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1450973.3333333333, ans=0.125 2023-10-03 23:52:02,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-03 23:52:03,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:52:08,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-03 23:52:08,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-03 23:52:12,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:52:18,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:52:19,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:52:22,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:52:22,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:52:24,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1451106.6666666667, ans=0.0 2023-10-03 23:52:25,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-03 23:52:29,392 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.988e+02 2.282e+02 2.710e+02 3.872e+02, threshold=4.565e+02, percent-clipped=0.0 2023-10-03 23:52:29,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:52:30,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-03 23:52:30,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-03 23:52:34,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:52:35,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:52:36,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-03 23:52:41,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:52:43,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:52:44,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:52:44,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:52:46,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-03 23:52:46,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:52:46,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-03 23:52:46,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:52:48,017 INFO [train.py:1046] (3/4) Epoch 41, batch 5200, loss[loss=0.1359, simple_loss=0.2175, pruned_loss=0.02711, over 24319.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2375, pruned_loss=0.03843, over 4735346.42 frames. ], batch size: 56, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:52:50,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:52:52,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:52:54,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:52:56,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1451240.0, ans=0.125 2023-10-03 23:52:58,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-03 23:52:58,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:52:58,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:53:02,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:53:03,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-03 23:53:03,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:53:04,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-03 23:53:07,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-03 23:53:09,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:53:10,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-03 23:53:13,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-03 23:53:13,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-03 23:53:15,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-03 23:53:16,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-03 23:53:19,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-03 23:53:19,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:53:19,764 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-03 23:53:19,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:53:22,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:22,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:53:23,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-03 23:53:23,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:53:25,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:53:28,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-03 23:53:28,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-03 23:53:28,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-03 23:53:32,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-03 23:53:34,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:53:40,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:53:40,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:53:42,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-03 23:53:42,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:53:42,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-03 23:53:42,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:44,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:53:45,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:53:46,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:53:49,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:53:51,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:53:51,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:55,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:53:57,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-03 23:53:57,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-03 23:53:57,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:53:58,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:53:58,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-03 23:54:00,007 INFO [train.py:1046] (3/4) Epoch 41, batch 5250, loss[loss=0.1729, simple_loss=0.2584, pruned_loss=0.0437, over 24361.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2373, pruned_loss=0.0385, over 4734759.52 frames. ], batch size: 77, lr: 2.47e-03, grad_scale: 16.0 2023-10-03 23:54:00,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:54:02,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1451573.3333333333, ans=0.125 2023-10-03 23:54:03,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:54:05,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:54:05,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:54:07,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:54:12,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:54:13,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:54:16,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:54:18,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-03 23:54:21,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-03 23:54:21,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:54:21,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:54:29,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1451706.6666666667, ans=0.1 2023-10-03 23:54:36,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1451706.6666666667, ans=0.95 2023-10-03 23:54:42,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1451773.3333333333, ans=0.125 2023-10-03 23:54:52,786 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.869e+02 2.005e+02 2.219e+02 3.156e+02, threshold=4.009e+02, percent-clipped=0.0 2023-10-03 23:55:03,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1451840.0, ans=0.0 2023-10-03 23:55:08,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.83 vs. limit=15.0 2023-10-03 23:55:08,764 INFO [train.py:1046] (3/4) Epoch 41, batch 5300, loss[loss=0.1621, simple_loss=0.2414, pruned_loss=0.04136, over 23353.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2359, pruned_loss=0.038, over 4699137.45 frames. ], batch size: 93, lr: 2.46e-03, grad_scale: 16.0 2023-10-03 23:55:23,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:55:23,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-03 23:55:23,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-03 23:55:23,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:55:23,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:23,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:23,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:55:23,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:55:23,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:55:23,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:55:23,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-03 23:55:24,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:55:24,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-03 23:55:24,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-03 23:55:24,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-03 23:55:24,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-03 23:55:24,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-03 23:55:24,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-03 23:55:24,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:55:25,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:55:25,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:55:25,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:55:25,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:55:25,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:55:25,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:55:25,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:25,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:55:25,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:55:25,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-03 23:55:25,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:55:25,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:55:26,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-03 23:55:26,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:55:26,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:55:26,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-03 23:55:26,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-03 23:55:26,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-03 23:55:26,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:55:26,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-03 23:55:27,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-03 23:55:27,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:55:27,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:55:28,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:55:28,259 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-03 23:55:28,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-03 23:55:28,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:55:28,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:55:28,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-03 23:55:28,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-03 23:55:28,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-03 23:55:28,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-03 23:55:32,748 INFO [train.py:1046] (3/4) Epoch 42, batch 0, loss[loss=0.1542, simple_loss=0.2267, pruned_loss=0.04087, over 23735.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2267, pruned_loss=0.04087, over 23735.00 frames. ], batch size: 212, lr: 2.43e-03, grad_scale: 32.0 2023-10-03 23:55:32,749 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-03 23:55:44,911 INFO [train.py:1078] (3/4) Epoch 42, validation: loss=0.3268, simple_loss=0.2729, pruned_loss=0.1903, over 1125622.00 frames. 2023-10-03 23:55:44,911 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-03 23:55:48,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-03 23:55:48,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:55:51,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-03 23:55:54,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1451986.6666666667, ans=0.0 2023-10-03 23:55:55,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:55:55,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-03 23:55:56,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:55:56,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-03 23:55:58,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-03 23:55:59,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:55:59,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:56:02,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:56:02,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:56:03,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-03 23:56:03,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:56:03,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1452053.3333333333, ans=0.125 2023-10-03 23:56:05,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-03 23:56:05,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1452053.3333333333, ans=0.1 2023-10-03 23:56:05,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1452053.3333333333, ans=0.0 2023-10-03 23:56:06,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:56:14,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-03 23:56:15,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:56:15,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1452120.0, ans=0.5 2023-10-03 23:56:17,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-03 23:56:19,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-03 23:56:19,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:56:22,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:56:26,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:56:32,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:56:35,810 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.40 vs. limit=10.0 2023-10-03 23:56:36,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-03 23:56:41,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-03 23:56:41,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:56:41,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:56:43,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-03 23:56:43,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:56:46,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-03 23:56:48,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:56:49,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:56:53,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:56:55,945 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-03 23:56:57,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-03 23:56:58,744 INFO [train.py:1046] (3/4) Epoch 42, batch 50, loss[loss=0.1579, simple_loss=0.2311, pruned_loss=0.04233, over 23444.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2371, pruned_loss=0.03909, over 1075894.26 frames. ], batch size: 285, lr: 2.43e-03, grad_scale: 16.0 2023-10-03 23:57:01,024 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.65 vs. limit=15.0 2023-10-03 23:57:01,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:57:04,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:57:04,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-03 23:57:05,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-03 23:57:05,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:57:08,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:08,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:11,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:57:15,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-03 23:57:15,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:57:20,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:57:21,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-03 23:57:23,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-03 23:57:25,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-03 23:57:26,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:57:26,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:57:26,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1452386.6666666667, ans=0.035 2023-10-03 23:57:27,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:57:28,013 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.29 vs. limit=15.0 2023-10-03 23:57:28,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-03 23:57:28,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-03 23:57:28,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:57:35,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:57:37,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:57:37,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-03 23:57:38,509 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.003e+02 2.232e+02 2.630e+02 3.790e+02, threshold=4.463e+02, percent-clipped=0.0 2023-10-03 23:57:38,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-03 23:57:38,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-03 23:57:40,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:57:40,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-03 23:57:40,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:57:42,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-03 23:57:49,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1452520.0, ans=0.0 2023-10-03 23:57:51,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:57:51,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:57:53,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:54,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:57:54,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:57:56,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1452520.0, ans=0.125 2023-10-03 23:57:57,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-03 23:57:57,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-03 23:57:58,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:57:58,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-03 23:58:00,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:58:00,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:58:00,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-03 23:58:01,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-03 23:58:02,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-03 23:58:04,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:05,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-03 23:58:06,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-03 23:58:06,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-03 23:58:06,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:08,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:58:08,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-03 23:58:09,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:58:12,878 INFO [train.py:1046] (3/4) Epoch 42, batch 100, loss[loss=0.1406, simple_loss=0.2179, pruned_loss=0.03163, over 23727.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2364, pruned_loss=0.03837, over 1875630.89 frames. ], batch size: 149, lr: 2.43e-03, grad_scale: 16.0 2023-10-03 23:58:12,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-03 23:58:15,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:58:19,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:58:21,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-03 23:58:21,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:58:25,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-03 23:58:25,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:58:25,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-03 23:58:25,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-03 23:58:26,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-03 23:58:28,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-03 23:58:31,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-03 23:58:31,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:31,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:58:31,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:58:34,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-03 23:58:35,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:35,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:58:36,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-03 23:58:38,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-03 23:58:41,103 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-03 23:58:41,124 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-03 23:58:42,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:58:42,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-03 23:58:45,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-03 23:58:48,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:58:50,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:58:55,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:58:57,309 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-03 23:58:57,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1452853.3333333333, ans=0.0 2023-10-03 23:59:00,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-03 23:59:02,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:59:04,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:59:06,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:09,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:13,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:59:14,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-03 23:59:16,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:18,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:19,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:19,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-03 23:59:19,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:21,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-03 23:59:21,138 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-03 23:59:21,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:22,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-03 23:59:25,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:25,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:59:25,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-03 23:59:25,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-03 23:59:26,531 INFO [train.py:1046] (3/4) Epoch 42, batch 150, loss[loss=0.1666, simple_loss=0.2362, pruned_loss=0.04848, over 22728.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2368, pruned_loss=0.03854, over 2512184.87 frames. ], batch size: 322, lr: 2.43e-03, grad_scale: 8.0 2023-10-03 23:59:26,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-03 23:59:26,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:26,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:28,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:59:28,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-03 23:59:28,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-03 23:59:29,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1452986.6666666667, ans=0.125 2023-10-03 23:59:30,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-03 23:59:33,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-03 23:59:33,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:59:34,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:35,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-03 23:59:36,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:37,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-03 23:59:39,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:43,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-03 23:59:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-03 23:59:43,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-03 23:59:46,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-03 23:59:46,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-03 23:59:47,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-03 23:59:48,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-03 23:59:48,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:48,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:49,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-03 23:59:49,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1453053.3333333333, ans=0.125 2023-10-03 23:59:50,942 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-03 23:59:51,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1453053.3333333333, ans=0.07 2023-10-03 23:59:52,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-03 23:59:56,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-03 23:59:59,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-03 23:59:59,732 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:00:00,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 00:00:03,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:00:04,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:00:04,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:00:06,213 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.980e+02 2.184e+02 2.445e+02 3.491e+02, threshold=4.367e+02, percent-clipped=0.0 2023-10-04 00:00:07,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:00:07,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1453186.6666666667, ans=0.125 2023-10-04 00:00:08,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:00:10,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:00:11,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:11,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 00:00:16,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:17,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:19,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:00:19,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:00:19,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1453186.6666666667, ans=0.125 2023-10-04 00:00:21,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:23,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.08 vs. limit=22.5 2023-10-04 00:00:23,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 00:00:25,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:00:27,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:00:28,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:00:30,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:00:30,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 00:00:30,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:00:31,482 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 00:00:34,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:00:38,126 INFO [train.py:1046] (3/4) Epoch 42, batch 200, loss[loss=0.1372, simple_loss=0.2145, pruned_loss=0.03, over 21203.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2368, pruned_loss=0.03823, over 3003435.57 frames. ], batch size: 46, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:00:38,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:00:38,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:00:41,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 00:00:41,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:00:42,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:44,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 00:00:45,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:00:46,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:48,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:00:54,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:00:54,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:00:54,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:00:55,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1453386.6666666667, ans=0.125 2023-10-04 00:01:12,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:01:12,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:01:14,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:01:14,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:01:15,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:01:15,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:01:18,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:19,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:01:19,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:01:19,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:01:20,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 00:01:22,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 00:01:22,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:01:26,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:01:33,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:01:40,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:40,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:01:47,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:47,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 00:01:49,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:01:49,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:01:49,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:01:50,406 INFO [train.py:1046] (3/4) Epoch 42, batch 250, loss[loss=0.1509, simple_loss=0.22, pruned_loss=0.04088, over 22681.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2371, pruned_loss=0.03848, over 3394527.56 frames. ], batch size: 322, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:01:50,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:01:52,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 00:01:52,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:01:52,599 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 00:01:55,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:01:57,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:02:00,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:02:00,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:02:03,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:02:04,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:02:04,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:02:07,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:02:13,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1453720.0, ans=0.0 2023-10-04 00:02:14,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:02:14,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1453720.0, ans=0.125 2023-10-04 00:02:17,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:02:17,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:02:18,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1453786.6666666667, ans=0.125 2023-10-04 00:02:23,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:02:24,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:02:24,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:02:24,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:02:26,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:02:26,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:02:28,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:02:31,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:02:32,298 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.009e+02 2.241e+02 2.579e+02 4.202e+02, threshold=4.483e+02, percent-clipped=0.0 2023-10-04 00:02:32,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 00:02:32,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:02:32,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1453786.6666666667, ans=0.125 2023-10-04 00:02:35,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:02:35,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:02:35,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:02:36,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:02:36,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:02:37,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:02:40,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:02:41,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:02:42,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:02:42,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1453853.3333333333, ans=0.125 2023-10-04 00:02:47,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:02:48,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1453920.0, ans=0.125 2023-10-04 00:02:50,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:02:53,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:02:59,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:03:01,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:03:03,960 INFO [train.py:1046] (3/4) Epoch 42, batch 300, loss[loss=0.1636, simple_loss=0.2446, pruned_loss=0.0413, over 24330.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2366, pruned_loss=0.03832, over 3682099.71 frames. ], batch size: 61, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:03:05,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 00:03:05,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:03:05,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:03:06,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 00:03:06,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:03:08,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:03:09,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 00:03:10,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1453986.6666666667, ans=0.1 2023-10-04 00:03:13,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:03:13,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:03:15,387 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:03:17,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:03:17,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 00:03:19,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:03:19,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:03:19,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 00:03:19,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:03:24,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:03:24,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1454053.3333333333, ans=0.125 2023-10-04 00:03:28,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:03:29,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 00:03:32,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 00:03:33,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:03:34,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:03:36,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:03:36,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 00:03:36,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:03:39,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:03:40,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:03:40,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:03:43,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:03:43,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 00:03:44,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:03:47,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:03:48,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 00:03:48,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:03:55,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:03:58,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:03:58,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 00:04:04,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:04:04,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:04:05,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:04:07,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:04:07,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 00:04:07,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:04:08,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:04:10,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 00:04:11,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:04:12,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:14,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:04:14,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:04:15,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:16,925 INFO [train.py:1046] (3/4) Epoch 42, batch 350, loss[loss=0.149, simple_loss=0.2385, pruned_loss=0.02973, over 24632.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2349, pruned_loss=0.03789, over 3894116.24 frames. ], batch size: 68, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:04:18,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:04:18,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 00:04:21,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:21,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1454320.0, ans=0.04949747468305833 2023-10-04 00:04:27,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:04:28,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.16 vs. limit=15.0 2023-10-04 00:04:31,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:04:31,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:32,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 00:04:34,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:04:35,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 00:04:36,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:36,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 00:04:37,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1454386.6666666667, ans=0.1 2023-10-04 00:04:38,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:04:41,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 00:04:42,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1454386.6666666667, ans=0.0 2023-10-04 00:04:43,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:04:45,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:04:46,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:04:46,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:04:47,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:04:47,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:04:47,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:04:48,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:04:50,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:04:50,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:04:52,451 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:04:58,163 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.875e+02 2.208e+02 2.566e+02 3.808e+02, threshold=4.416e+02, percent-clipped=0.0 2023-10-04 00:04:58,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:04:58,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:04:59,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:04:59,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:02,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.09 vs. limit=15.0 2023-10-04 00:05:04,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 00:05:04,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:05:08,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:08,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:05:08,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:05:09,159 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.98 vs. limit=15.0 2023-10-04 00:05:11,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 00:05:12,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:13,907 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 00:05:14,666 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=16.26 vs. limit=15.0 2023-10-04 00:05:15,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 00:05:15,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:05:18,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:05:18,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 00:05:19,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:22,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:05:22,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:05:24,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:24,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:05:26,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:05:29,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:05:30,690 INFO [train.py:1046] (3/4) Epoch 42, batch 400, loss[loss=0.1455, simple_loss=0.219, pruned_loss=0.03602, over 23426.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2343, pruned_loss=0.03796, over 4049757.85 frames. ], batch size: 120, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:05:30,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:05:32,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 00:05:32,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:32,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:05:34,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:05:34,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:38,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:05:39,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:41,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 00:05:42,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 00:05:42,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:05:45,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 00:05:45,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:48,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1454720.0, ans=0.125 2023-10-04 00:05:49,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:05:49,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:49,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 00:05:49,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:05:50,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:05:50,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:05:50,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:05:54,783 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 00:05:54,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 00:05:56,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1454720.0, ans=0.125 2023-10-04 00:05:58,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:05:58,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:06:00,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 00:06:02,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 00:06:04,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:06:08,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:06:10,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=1454786.6666666667, ans=0.05 2023-10-04 00:06:15,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 00:06:17,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:06:18,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 00:06:20,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:06:20,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:06:20,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 00:06:25,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:06:26,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 00:06:28,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:06:32,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:06:32,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 00:06:34,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 00:06:35,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 00:06:37,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:06:37,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:06:37,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1454920.0, ans=0.2 2023-10-04 00:06:40,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 00:06:42,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:06:42,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:06:44,248 INFO [train.py:1046] (3/4) Epoch 42, batch 450, loss[loss=0.1628, simple_loss=0.2482, pruned_loss=0.03865, over 24299.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2357, pruned_loss=0.0386, over 4184544.09 frames. ], batch size: 74, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:06:44,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:06:45,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 00:06:45,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:06:47,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:06:47,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:06:47,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 00:06:48,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:06:48,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:06:50,066 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:06:51,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:06:58,707 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.80 vs. limit=12.0 2023-10-04 00:06:59,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:07:00,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:07:02,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 00:07:02,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 00:07:04,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1455053.3333333333, ans=0.0 2023-10-04 00:07:06,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:07:08,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:07:11,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:07:15,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:07:15,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:07:17,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 00:07:18,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 00:07:21,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 00:07:21,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:07:22,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:07:23,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:07:24,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=1455120.0, ans=0.02 2023-10-04 00:07:25,435 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 00:07:25,443 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 00:07:25,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:07:27,290 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.923e+02 2.068e+02 2.346e+02 3.607e+02, threshold=4.137e+02, percent-clipped=0.0 2023-10-04 00:07:27,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:07:27,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 00:07:30,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:07:31,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:07:31,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 00:07:33,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 00:07:34,185 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.01 vs. limit=22.5 2023-10-04 00:07:34,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:07:36,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:07:36,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:07:39,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 00:07:42,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:07:42,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 00:07:43,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 00:07:45,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:07:50,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:07:51,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:07:53,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:07:54,550 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 00:07:57,761 INFO [train.py:1046] (3/4) Epoch 42, batch 500, loss[loss=0.1397, simple_loss=0.2198, pruned_loss=0.02979, over 24478.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2359, pruned_loss=0.03849, over 4309275.59 frames. ], batch size: 58, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:07:59,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:07:59,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:08:00,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:08:00,564 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 00:08:00,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1455320.0, ans=0.0 2023-10-04 00:08:02,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 00:08:02,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:08:02,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1455320.0, ans=0.1 2023-10-04 00:08:06,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 00:08:06,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1455320.0, ans=0.125 2023-10-04 00:08:09,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 00:08:10,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:08:12,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:08:12,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:08:13,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:24,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:08:26,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:08:27,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 00:08:27,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:08:27,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 00:08:27,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:08:29,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:08:32,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:08:32,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:08:32,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:08:32,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 00:08:37,020 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 00:08:39,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:08:42,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:42,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:42,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:08:43,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:08:44,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.05 vs. limit=22.5 2023-10-04 00:08:45,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 00:08:48,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:08:48,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1455520.0, ans=0.1 2023-10-04 00:08:49,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:08:52,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:08:53,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1455520.0, ans=0.0 2023-10-04 00:08:54,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1455586.6666666667, ans=10.0 2023-10-04 00:08:55,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:09:01,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:09:04,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 00:09:04,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:09:06,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:09:09,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 00:09:10,891 INFO [train.py:1046] (3/4) Epoch 42, batch 550, loss[loss=0.1525, simple_loss=0.233, pruned_loss=0.03596, over 24308.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2363, pruned_loss=0.03867, over 4396426.35 frames. ], batch size: 61, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:09:10,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:09:12,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:09:14,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1455653.3333333333, ans=0.125 2023-10-04 00:09:16,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 00:09:17,271 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.55 vs. limit=15.0 2023-10-04 00:09:17,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 00:09:17,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:09:17,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 00:09:18,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:09:18,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:09:19,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:19,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:19,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:09:20,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:09:22,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:09:22,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 00:09:22,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:09:29,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:09:29,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:32,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:09:33,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:34,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 00:09:36,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 00:09:36,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:09:42,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:09:42,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:09:43,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:09:45,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:09:45,119 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 00:09:46,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:09:48,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:09:52,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:09:52,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:09:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:09:53,655 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.909e+02 2.051e+02 2.343e+02 3.682e+02, threshold=4.101e+02, percent-clipped=0.0 2023-10-04 00:09:53,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:09:55,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 00:09:57,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 00:09:57,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:09:57,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:09:58,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:09:58,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:10:01,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:10:04,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:10:05,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:10:06,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1455853.3333333333, ans=0.0 2023-10-04 00:10:07,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:09,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 00:10:10,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:10:11,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:10:12,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:10:13,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:14,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 00:10:14,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 00:10:21,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 00:10:24,574 INFO [train.py:1046] (3/4) Epoch 42, batch 600, loss[loss=0.1503, simple_loss=0.2303, pruned_loss=0.03515, over 24677.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2372, pruned_loss=0.03873, over 4476145.76 frames. ], batch size: 65, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:10:24,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 00:10:26,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:10:26,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:10:26,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:10:32,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:10:32,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1455986.6666666667, ans=10.0 2023-10-04 00:10:34,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:10:37,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 00:10:40,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:10:40,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:10:41,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:44,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 00:10:44,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:10:51,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 00:10:54,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:10:54,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:10:54,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:10:56,812 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.31 vs. limit=15.0 2023-10-04 00:11:00,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:11:02,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:11:02,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:11:08,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:11:13,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:11:13,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:11:13,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:11:14,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1456186.6666666667, ans=0.1 2023-10-04 00:11:16,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1456186.6666666667, ans=0.0 2023-10-04 00:11:18,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 00:11:24,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:11:24,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:11:27,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 00:11:27,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1456253.3333333333, ans=0.125 2023-10-04 00:11:28,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:11:31,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 00:11:31,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:11:32,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:11:38,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 00:11:38,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=1456320.0, ans=0.1 2023-10-04 00:11:40,231 INFO [train.py:1046] (3/4) Epoch 42, batch 650, loss[loss=0.1531, simple_loss=0.2404, pruned_loss=0.03289, over 24000.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2361, pruned_loss=0.03816, over 4536287.07 frames. ], batch size: 80, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:11:40,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:11:43,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:11:44,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:11:47,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:11:48,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 00:11:48,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:11:53,853 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:11:56,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:11:56,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:11:57,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1456386.6666666667, ans=0.0 2023-10-04 00:11:59,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:02,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 00:12:05,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:12:05,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:12:09,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:12:09,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 00:12:13,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:14,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:14,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:12:14,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:16,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:12:18,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:12:18,733 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 00:12:18,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:18,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:12:22,750 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 1.909e+02 2.145e+02 2.381e+02 3.459e+02, threshold=4.289e+02, percent-clipped=0.0 2023-10-04 00:12:22,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:22,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:12:22,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:12:24,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:12:25,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1456520.0, ans=0.1 2023-10-04 00:12:26,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 00:12:26,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:12:26,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:12:27,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:12:27,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:12:29,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:12:31,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 00:12:32,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 00:12:32,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:32,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:12:32,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:12:32,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:12:35,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:12:40,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1456586.6666666667, ans=0.125 2023-10-04 00:12:41,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:12:43,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:12:43,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:12:46,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:12:46,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:12:46,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:12:53,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:12:53,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:12:53,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:12:54,879 INFO [train.py:1046] (3/4) Epoch 42, batch 700, loss[loss=0.1607, simple_loss=0.2513, pruned_loss=0.03505, over 24684.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2351, pruned_loss=0.03755, over 4586370.67 frames. ], batch size: 73, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:12:54,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:12:59,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 00:13:00,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 00:13:04,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 00:13:05,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:13:05,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:13:06,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 00:13:12,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:13:15,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:13:16,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:13:17,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:13:19,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:13:20,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:13:23,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 00:13:23,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:13:23,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1456786.6666666667, ans=0.125 2023-10-04 00:13:24,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 00:13:28,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 00:13:32,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:13:32,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:13:35,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:13:38,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:13:38,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1456853.3333333333, ans=0.05 2023-10-04 00:13:40,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 00:13:43,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:13:45,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:13:45,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 00:13:49,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:13:50,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:13:52,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:13:57,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:13:57,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 00:14:01,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 00:14:01,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 00:14:03,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:14:05,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:14:06,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:14:06,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:14:06,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 00:14:10,010 INFO [train.py:1046] (3/4) Epoch 42, batch 750, loss[loss=0.1612, simple_loss=0.2283, pruned_loss=0.04698, over 22822.00 frames. ], tot_loss[loss=0.155, simple_loss=0.235, pruned_loss=0.03752, over 4623585.60 frames. ], batch size: 322, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:14:11,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 00:14:11,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 00:14:11,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 00:14:11,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 00:14:11,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 00:14:13,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:14:13,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 00:14:14,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:14:16,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:14:16,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:14:17,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1456986.6666666667, ans=0.0 2023-10-04 00:14:18,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:14:19,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:14:20,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:14:23,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:14:23,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:14:25,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:14:26,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:14:28,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:14:28,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 00:14:29,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1457053.3333333333, ans=0.09899494936611666 2023-10-04 00:14:31,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:14:31,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:14:32,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:14:32,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1457053.3333333333, ans=0.1 2023-10-04 00:14:34,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:14:34,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 00:14:34,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:14:37,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 00:14:37,576 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 00:14:38,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 00:14:38,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:14:38,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:14:40,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:14:43,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1457120.0, ans=0.2 2023-10-04 00:14:46,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:14:46,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:14:47,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:14:48,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:14:50,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:14:50,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 00:14:51,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1457120.0, ans=0.2 2023-10-04 00:14:52,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:14:52,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 00:14:53,510 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 1.922e+02 2.060e+02 2.309e+02 3.255e+02, threshold=4.120e+02, percent-clipped=0.0 2023-10-04 00:14:53,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:14:55,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:14:55,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 00:14:56,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:15:02,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:02,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:15:03,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:06,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:15:09,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 00:15:09,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:15:11,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:15:13,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:15:13,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:15:16,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:15:16,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:15:18,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1457253.3333333333, ans=0.07 2023-10-04 00:15:24,721 INFO [train.py:1046] (3/4) Epoch 42, batch 800, loss[loss=0.1482, simple_loss=0.2389, pruned_loss=0.02876, over 23704.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2356, pruned_loss=0.03764, over 4643760.35 frames. ], batch size: 85, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:15:26,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:15:26,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:29,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:15:29,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:15:30,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:31,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:31,694 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.11 vs. limit=22.5 2023-10-04 00:15:32,938 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.25 vs. limit=15.0 2023-10-04 00:15:33,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:36,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:36,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:15:40,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 00:15:40,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:40,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1457386.6666666667, ans=0.125 2023-10-04 00:15:42,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:15:42,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:15:42,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:15:44,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 00:15:44,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:44,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 00:15:47,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:15:49,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:15:49,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1457386.6666666667, ans=0.125 2023-10-04 00:15:51,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:15:51,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1457386.6666666667, ans=0.125 2023-10-04 00:15:52,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:15:54,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1457453.3333333333, ans=0.2 2023-10-04 00:15:56,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:56,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:15:58,069 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.71 vs. limit=15.0 2023-10-04 00:16:00,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:16:01,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:16:01,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 00:16:03,680 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 00:16:04,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 00:16:04,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:16:04,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:16:06,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:16:06,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:16:11,960 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 00:16:13,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 00:16:14,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1457520.0, ans=0.09899494936611666 2023-10-04 00:16:15,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:16:16,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:16:21,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:16:22,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1457586.6666666667, ans=0.0 2023-10-04 00:16:23,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:16:25,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 00:16:27,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:16:29,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 00:16:35,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:16:36,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:16:38,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 00:16:38,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:16:38,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:16:39,411 INFO [train.py:1046] (3/4) Epoch 42, batch 850, loss[loss=0.1544, simple_loss=0.236, pruned_loss=0.03636, over 23159.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2359, pruned_loss=0.03776, over 4663469.80 frames. ], batch size: 93, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:16:39,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 00:16:39,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:16:39,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1457653.3333333333, ans=0.125 2023-10-04 00:16:40,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:16:41,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:16:44,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:16:45,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:16:45,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 00:16:46,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 00:16:46,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 00:16:46,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:16:48,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:16:49,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:16:50,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:16:51,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:16:52,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1457720.0, ans=0.0 2023-10-04 00:16:54,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1457720.0, ans=0.1 2023-10-04 00:16:55,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:16:56,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:16:57,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 00:17:00,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 00:17:01,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:17:03,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 00:17:05,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 00:17:07,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 00:17:10,015 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 00:17:10,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:17:10,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:17:10,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:17:12,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:17:14,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:17:14,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 00:17:15,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1457786.6666666667, ans=0.125 2023-10-04 00:17:17,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:17:19,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:17:20,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:17:21,767 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.961e+02 2.256e+02 2.503e+02 3.661e+02, threshold=4.513e+02, percent-clipped=0.0 2023-10-04 00:17:21,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:17:23,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:17:23,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1457853.3333333333, ans=0.0 2023-10-04 00:17:24,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:17:24,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 00:17:28,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:17:28,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:17:28,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1457853.3333333333, ans=0.125 2023-10-04 00:17:29,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:17:29,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:17:29,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:17:31,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:17:34,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:17:35,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:17:35,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:17:37,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:17:45,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:17:46,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1457920.0, ans=0.0 2023-10-04 00:17:47,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:17:47,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 00:17:47,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:17:47,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:17:49,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1457920.0, ans=0.0 2023-10-04 00:17:50,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 00:17:52,887 INFO [train.py:1046] (3/4) Epoch 42, batch 900, loss[loss=0.1587, simple_loss=0.2446, pruned_loss=0.03637, over 24502.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2369, pruned_loss=0.03818, over 4683217.16 frames. ], batch size: 63, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:17:54,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1457986.6666666667, ans=0.2 2023-10-04 00:17:54,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1457986.6666666667, ans=0.125 2023-10-04 00:17:57,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:18:00,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:18:00,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 00:18:03,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:18:05,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 00:18:05,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 00:18:06,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:18:06,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:18:08,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:18:08,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:18:14,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1458053.3333333333, ans=0.0 2023-10-04 00:18:19,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:18:19,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:18:19,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:18:19,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1458053.3333333333, ans=0.0 2023-10-04 00:18:21,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:18:25,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 00:18:28,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:18:32,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:18:34,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:18:36,044 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 00:18:36,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 00:18:42,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:18:42,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:18:44,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:18:48,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1458186.6666666667, ans=0.125 2023-10-04 00:18:49,281 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.73 vs. limit=15.0 2023-10-04 00:18:49,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:18:49,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:18:51,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 00:18:51,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:18:53,913 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.13 vs. limit=15.0 2023-10-04 00:18:54,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 00:18:55,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:18:55,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:18:57,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:18:57,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:00,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 00:19:00,191 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 00:19:03,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 00:19:03,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 00:19:06,348 INFO [train.py:1046] (3/4) Epoch 42, batch 950, loss[loss=0.1607, simple_loss=0.2376, pruned_loss=0.0419, over 23281.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2368, pruned_loss=0.0382, over 4692010.48 frames. ], batch size: 105, lr: 2.43e-03, grad_scale: 4.0 2023-10-04 00:19:06,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:19:09,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 00:19:09,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1458320.0, ans=0.125 2023-10-04 00:19:14,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:19:16,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:19:16,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:19:16,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 00:19:19,634 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 00:19:21,727 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.09 vs. limit=12.0 2023-10-04 00:19:23,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:19:25,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:19:25,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:19:25,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:19:25,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 00:19:26,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 00:19:28,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:29,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 00:19:30,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:19:36,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:36,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:19:36,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1458453.3333333333, ans=0.125 2023-10-04 00:19:37,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:19:37,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 00:19:40,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 00:19:41,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:19:42,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:19:47,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:19:47,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:19:50,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 00:19:51,441 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.895e+02 2.094e+02 2.430e+02 3.415e+02, threshold=4.187e+02, percent-clipped=0.0 2023-10-04 00:19:52,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 00:19:52,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:19:52,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:19:54,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:19:54,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:19:57,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 00:19:59,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:20:02,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:20:02,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:20:02,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 00:20:02,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:20:02,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:20:03,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 00:20:04,092 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.01 vs. limit=22.5 2023-10-04 00:20:08,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:20:10,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:20:13,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:20:15,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 00:20:15,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 00:20:17,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:20:19,154 INFO [train.py:1046] (3/4) Epoch 42, batch 1000, loss[loss=0.1419, simple_loss=0.2079, pruned_loss=0.038, over 23569.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2363, pruned_loss=0.03782, over 4715919.13 frames. ], batch size: 256, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:20:22,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 00:20:23,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:20:28,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:20:29,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 00:20:29,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 00:20:31,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1458653.3333333333, ans=0.125 2023-10-04 00:20:31,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1458653.3333333333, ans=0.2 2023-10-04 00:20:34,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:20:34,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:20:34,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1458720.0, ans=0.1 2023-10-04 00:20:35,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:20:38,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 00:20:39,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 00:20:42,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 00:20:42,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:20:44,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 00:20:45,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 00:20:45,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 00:20:46,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:20:48,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:20:57,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:20:58,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:21:00,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:00,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:21:00,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 00:21:00,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:21:02,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:21:02,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:21:02,733 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 00:21:06,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 00:21:06,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1458853.3333333333, ans=0.125 2023-10-04 00:21:08,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 00:21:10,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 00:21:11,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:21:13,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1458853.3333333333, ans=0.0 2023-10-04 00:21:17,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:17,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:21:17,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:18,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1458920.0, ans=0.125 2023-10-04 00:21:19,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:21:21,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 00:21:22,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:21:22,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 00:21:23,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 00:21:24,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:21:24,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:21:26,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:21:28,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:21:30,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:21:32,049 INFO [train.py:1046] (3/4) Epoch 42, batch 1050, loss[loss=0.1518, simple_loss=0.2263, pruned_loss=0.03867, over 23784.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2351, pruned_loss=0.03764, over 4709729.65 frames. ], batch size: 232, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:21:32,988 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=14.31 vs. limit=15.0 2023-10-04 00:21:35,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:21:35,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:21:38,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 00:21:39,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:41,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:21:42,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:21:44,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:21:46,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:21:48,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:21:48,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:21:49,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:21:49,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 00:21:49,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1459053.3333333333, ans=0.125 2023-10-04 00:21:51,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:21:51,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 00:21:52,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:21:53,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 00:21:53,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:21:58,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:21:58,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:21:59,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:22:01,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 00:22:01,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 00:22:01,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:22:02,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1459120.0, ans=0.0 2023-10-04 00:22:05,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 00:22:07,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1459120.0, ans=0.125 2023-10-04 00:22:08,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1459120.0, ans=0.025 2023-10-04 00:22:09,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 00:22:10,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:22:12,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 00:22:15,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 00:22:15,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:22:15,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:22:18,167 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.961e+02 2.126e+02 2.339e+02 3.298e+02, threshold=4.252e+02, percent-clipped=0.0 2023-10-04 00:22:19,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:22:20,152 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.90 vs. limit=22.5 2023-10-04 00:22:22,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 00:22:24,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 00:22:24,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 00:22:25,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:22:25,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:22:26,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 00:22:28,103 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.03 vs. limit=6.0 2023-10-04 00:22:28,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:22:30,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:22:30,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:22:31,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:22:32,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:22:37,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:22:37,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 00:22:37,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1459253.3333333333, ans=0.1 2023-10-04 00:22:38,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:22:38,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 00:22:40,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 00:22:40,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:22:43,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1459253.3333333333, ans=0.2 2023-10-04 00:22:44,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:22:46,935 INFO [train.py:1046] (3/4) Epoch 42, batch 1100, loss[loss=0.1574, simple_loss=0.2445, pruned_loss=0.03518, over 24512.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2338, pruned_loss=0.03738, over 4702822.18 frames. ], batch size: 66, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:22:48,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:22:48,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1459320.0, ans=0.1 2023-10-04 00:22:52,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:22:53,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:22:53,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:22:54,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 00:22:55,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:22:59,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:23:00,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:23:03,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:23:03,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 00:23:03,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:23:04,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:23:04,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:23:08,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:23:11,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:23:12,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1459386.6666666667, ans=0.2 2023-10-04 00:23:15,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:23:18,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 00:23:18,162 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 00:23:19,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:19,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:21,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:23:21,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:23:22,496 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.63 vs. limit=10.0 2023-10-04 00:23:22,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 00:23:24,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:23:24,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:23:24,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:23:24,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:24,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 00:23:24,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1459453.3333333333, ans=0.0 2023-10-04 00:23:28,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1459453.3333333333, ans=0.125 2023-10-04 00:23:31,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:23:33,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 00:23:34,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:23:40,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:23:43,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 00:23:43,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 00:23:45,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:23:45,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1459586.6666666667, ans=0.125 2023-10-04 00:23:47,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:23:47,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:23:49,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 00:23:50,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:23:50,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:23:52,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 00:23:52,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:23:52,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 00:23:52,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1459586.6666666667, ans=0.0 2023-10-04 00:23:54,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:23:54,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:23:54,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1459586.6666666667, ans=0.1 2023-10-04 00:23:55,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:24:00,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:24:01,478 INFO [train.py:1046] (3/4) Epoch 42, batch 1150, loss[loss=0.145, simple_loss=0.2223, pruned_loss=0.03387, over 24499.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2346, pruned_loss=0.0378, over 4699463.35 frames. ], batch size: 58, lr: 2.43e-03, grad_scale: 4.0 2023-10-04 00:24:01,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:24:03,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:24:03,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:24:04,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 00:24:04,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:24:07,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 00:24:09,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:24:09,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:24:15,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 00:24:16,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:24:20,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:24:20,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:24:22,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 00:24:22,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:24:22,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:24:25,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 00:24:25,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:24:25,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1459720.0, ans=0.025 2023-10-04 00:24:27,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:24:36,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:24:42,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:24:42,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 00:24:42,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:24:42,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1459786.6666666667, ans=0.125 2023-10-04 00:24:43,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:24:46,332 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 00:24:48,977 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.006e+02 2.296e+02 2.643e+02 4.791e+02, threshold=4.591e+02, percent-clipped=2.0 2023-10-04 00:24:49,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:24:55,670 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 00:24:59,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:01,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:25:01,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:25:02,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:25:06,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:25:11,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:25:13,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:25:14,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:25:14,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:14,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:25:15,933 INFO [train.py:1046] (3/4) Epoch 42, batch 1200, loss[loss=0.1545, simple_loss=0.2252, pruned_loss=0.04194, over 23769.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2354, pruned_loss=0.03791, over 4710274.35 frames. ], batch size: 135, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:25:17,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:25:19,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:25:20,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:25:20,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:25:23,320 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 00:25:24,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 00:25:26,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:25:30,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:25:32,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:25:35,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:25:35,534 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 00:25:36,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:43,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:25:43,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:25:44,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 00:25:45,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:25:45,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1460120.0, ans=0.125 2023-10-04 00:25:47,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 00:25:51,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 00:25:51,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:25:53,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:25:54,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:25:56,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:25:58,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:25:58,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:25:58,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:25:58,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 00:26:01,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:26:01,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:26:01,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:26:02,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:26:02,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:26:08,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 00:26:09,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:26:11,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 00:26:13,602 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.53 vs. limit=22.5 2023-10-04 00:26:15,719 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 00:26:15,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1460253.3333333333, ans=0.125 2023-10-04 00:26:18,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:26:18,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1460253.3333333333, ans=0.125 2023-10-04 00:26:21,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:26:21,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1460253.3333333333, ans=0.0 2023-10-04 00:26:22,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:26:24,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:26:26,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 00:26:29,721 INFO [train.py:1046] (3/4) Epoch 42, batch 1250, loss[loss=0.2091, simple_loss=0.279, pruned_loss=0.06962, over 19787.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2363, pruned_loss=0.03805, over 4712053.88 frames. ], batch size: 389, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:26:31,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:26:33,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:26:35,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 00:26:36,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:26:36,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:26:40,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 00:26:42,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:26:44,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:26:44,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:26:46,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:26:46,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1460386.6666666667, ans=0.125 2023-10-04 00:26:48,031 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.90 vs. limit=12.0 2023-10-04 00:26:48,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 00:26:48,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:26:48,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:26:51,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:26:51,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:26:55,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:26:55,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:27:01,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 00:27:01,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:27:03,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1460453.3333333333, ans=0.125 2023-10-04 00:27:04,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:27:06,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 00:27:06,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:27:06,605 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 00:27:06,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:06,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:09,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:27:12,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:27:14,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:27:15,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 00:27:17,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 00:27:17,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 00:27:17,983 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.83 vs. limit=15.0 2023-10-04 00:27:18,736 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.951e+02 2.115e+02 2.289e+02 3.132e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-04 00:27:21,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:27:23,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 00:27:23,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:25,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 00:27:25,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:27:27,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 00:27:27,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 00:27:28,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:27:28,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 00:27:28,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:27:30,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 00:27:33,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:27:33,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:27:34,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:27:37,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:27:40,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:27:42,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 00:27:45,170 INFO [train.py:1046] (3/4) Epoch 42, batch 1300, loss[loss=0.1538, simple_loss=0.2272, pruned_loss=0.04026, over 23848.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2365, pruned_loss=0.0382, over 4709005.12 frames. ], batch size: 195, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:27:46,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:27:46,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1460653.3333333333, ans=0.1 2023-10-04 00:27:47,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 00:27:48,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:27:51,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:27:52,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:27:54,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 00:27:57,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:27:58,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:27:58,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1460720.0, ans=0.125 2023-10-04 00:27:59,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 00:28:04,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:28:05,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1460720.0, ans=0.0 2023-10-04 00:28:06,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.08 vs. limit=15.0 2023-10-04 00:28:08,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:28:10,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:28:10,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1460720.0, ans=0.2 2023-10-04 00:28:11,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:28:11,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:28:12,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:28:12,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 00:28:14,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 00:28:20,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:28:21,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:28:22,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 00:28:24,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 00:28:24,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:28:27,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:28:27,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1460853.3333333333, ans=0.0 2023-10-04 00:28:28,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 00:28:29,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:28:30,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 00:28:31,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:28:35,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:28:35,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:28:37,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 00:28:39,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 00:28:39,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1460853.3333333333, ans=0.125 2023-10-04 00:28:39,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.31 vs. limit=15.0 2023-10-04 00:28:42,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 00:28:45,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:28:48,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 00:28:51,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:28:56,078 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.89 vs. limit=15.0 2023-10-04 00:28:56,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 00:28:57,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1460986.6666666667, ans=0.125 2023-10-04 00:28:58,617 INFO [train.py:1046] (3/4) Epoch 42, batch 1350, loss[loss=0.1536, simple_loss=0.236, pruned_loss=0.03556, over 19412.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2357, pruned_loss=0.03761, over 4720296.95 frames. ], batch size: 42, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:29:00,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:29:02,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:05,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:29:05,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:29:08,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:29:09,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:29:15,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:29:17,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 00:29:17,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:29:19,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:29:20,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 00:29:20,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:29:22,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:29:22,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 00:29:23,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 00:29:26,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 00:29:27,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:27,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 00:29:39,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:46,271 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.991e+02 2.237e+02 2.537e+02 4.042e+02, threshold=4.474e+02, percent-clipped=0.0 2023-10-04 00:29:47,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:29:49,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:29:49,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 00:29:51,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:29:51,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 00:29:51,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:29:52,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1461186.6666666667, ans=0.1 2023-10-04 00:29:52,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1461186.6666666667, ans=0.125 2023-10-04 00:29:53,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:29:54,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:29:56,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 00:29:57,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:30:00,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1461253.3333333333, ans=0.125 2023-10-04 00:30:03,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 00:30:04,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 00:30:09,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 00:30:11,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:30:12,372 INFO [train.py:1046] (3/4) Epoch 42, batch 1400, loss[loss=0.1523, simple_loss=0.2419, pruned_loss=0.03133, over 24600.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2354, pruned_loss=0.03728, over 4725802.52 frames. ], batch size: 68, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:30:16,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:30:16,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:30:19,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 00:30:19,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1461320.0, ans=0.125 2023-10-04 00:30:20,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 00:30:29,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:30:30,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:30:33,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:30:33,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:30:38,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:30:38,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 00:30:40,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1461386.6666666667, ans=0.125 2023-10-04 00:30:45,248 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-10-04 00:30:47,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:30:47,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:30:51,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 00:30:52,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:30:52,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:30:54,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:30:54,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:30:55,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:30:55,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:30:57,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:30:57,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 00:30:58,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:31:00,505 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:31:03,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:07,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:31:07,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1461520.0, ans=0.0 2023-10-04 00:31:10,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1461586.6666666667, ans=0.07 2023-10-04 00:31:11,396 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.62 vs. limit=15.0 2023-10-04 00:31:14,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 00:31:15,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 00:31:17,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1461586.6666666667, ans=0.0 2023-10-04 00:31:18,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:31:19,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 00:31:21,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:31:23,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:31:26,683 INFO [train.py:1046] (3/4) Epoch 42, batch 1450, loss[loss=0.1325, simple_loss=0.2114, pruned_loss=0.02679, over 24405.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2351, pruned_loss=0.03726, over 4725942.34 frames. ], batch size: 58, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:31:26,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:31:28,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:31:28,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:28,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 00:31:33,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:31:34,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:31:35,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:31:35,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 00:31:37,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:31:37,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 00:31:38,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:40,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:40,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 00:31:41,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:31:41,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:31:43,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 00:31:43,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:45,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:31:46,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:48,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:48,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1461720.0, ans=0.0 2023-10-04 00:31:53,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:31:53,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:31:55,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:31:55,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:58,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:31:58,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:31:58,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:31:58,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:32:02,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 00:32:03,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:32:09,622 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 00:32:11,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:32:13,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:32:14,444 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.045e+02 2.411e+02 2.943e+02 4.436e+02, threshold=4.821e+02, percent-clipped=0.0 2023-10-04 00:32:14,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:32:15,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 00:32:19,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:32:20,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 00:32:22,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 00:32:23,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:32:26,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:32:26,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:32:27,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 00:32:30,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 00:32:30,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 00:32:30,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:32:31,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:32:37,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1461920.0, ans=0.0 2023-10-04 00:32:40,520 INFO [train.py:1046] (3/4) Epoch 42, batch 1500, loss[loss=0.1375, simple_loss=0.2205, pruned_loss=0.0272, over 24325.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2358, pruned_loss=0.03758, over 4728501.80 frames. ], batch size: 61, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:32:43,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 00:32:44,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:32:44,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:32:46,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:32:46,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:32:48,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:32:48,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 00:32:49,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:32:49,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:32:49,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:32:51,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:32:54,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:32:55,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:02,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:02,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 00:33:02,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:33:02,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:33:03,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:33:06,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 00:33:12,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 00:33:14,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:33:14,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 00:33:16,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:33:18,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:33:19,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:33:19,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:33:22,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 00:33:22,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:33:22,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:33:23,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 00:33:23,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:33:29,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:33:29,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 00:33:34,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:33:36,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:33:40,394 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 00:33:40,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:40,453 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 00:33:41,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:33:43,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:33:43,694 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 00:33:45,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:33:47,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 00:33:48,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:52,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:52,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:52,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:33:53,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:33:54,248 INFO [train.py:1046] (3/4) Epoch 42, batch 1550, loss[loss=0.1571, simple_loss=0.2271, pruned_loss=0.04361, over 23737.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.236, pruned_loss=0.0377, over 4735489.82 frames. ], batch size: 164, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:33:54,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:33:55,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 00:33:55,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 00:33:57,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:33:57,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 00:33:59,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 00:34:00,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:34:02,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:34:02,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:34:02,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:34:02,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:34:04,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:34:07,111 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 00:34:07,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:07,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:34:08,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:34:09,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:34:09,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 00:34:11,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:34:11,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 00:34:13,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 00:34:13,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 00:34:14,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:14,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:34:20,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:34:21,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 00:34:21,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 00:34:30,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:34:30,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1462453.3333333333, ans=0.0 2023-10-04 00:34:34,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:34:34,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:34:34,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:34:35,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 00:34:41,000 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.990e+02 2.197e+02 2.410e+02 4.079e+02, threshold=4.394e+02, percent-clipped=0.0 2023-10-04 00:34:42,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:34:42,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:46,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:34:47,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:34:47,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1462520.0, ans=0.125 2023-10-04 00:34:48,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:34:48,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 00:34:50,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:34:50,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1462520.0, ans=0.0 2023-10-04 00:34:52,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:34:52,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:34:53,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 00:34:53,518 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 00:34:56,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:35:01,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 00:35:05,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:35:06,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:35:06,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 00:35:08,286 INFO [train.py:1046] (3/4) Epoch 42, batch 1600, loss[loss=0.1525, simple_loss=0.2327, pruned_loss=0.03613, over 23334.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2372, pruned_loss=0.03831, over 4730424.82 frames. ], batch size: 105, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:35:08,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:35:09,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:35:09,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:35:09,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:35:11,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:35:15,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:35:15,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 00:35:17,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 00:35:19,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 00:35:22,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:35:23,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 00:35:23,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:35:27,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:35:27,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1462720.0, ans=0.1 2023-10-04 00:35:31,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:35:35,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 00:35:35,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1462720.0, ans=0.125 2023-10-04 00:35:36,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1462720.0, ans=0.1 2023-10-04 00:35:38,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:35:38,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 00:35:39,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:35:39,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 00:35:42,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1462786.6666666667, ans=0.125 2023-10-04 00:35:45,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 00:35:51,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:35:51,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 00:35:53,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:35:53,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:35:53,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:35:53,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1462853.3333333333, ans=0.125 2023-10-04 00:35:55,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 00:35:55,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1462853.3333333333, ans=0.1 2023-10-04 00:36:01,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 00:36:02,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:36:02,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:04,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:04,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:36:07,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:36:07,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:36:08,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:36:14,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:15,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:36:16,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 00:36:16,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:36:18,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 00:36:22,924 INFO [train.py:1046] (3/4) Epoch 42, batch 1650, loss[loss=0.198, simple_loss=0.271, pruned_loss=0.06246, over 19760.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.237, pruned_loss=0.03815, over 4722344.22 frames. ], batch size: 388, lr: 2.43e-03, grad_scale: 16.0 2023-10-04 00:36:23,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:36:25,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:36:26,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:36:26,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 00:36:26,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 00:36:26,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 00:36:26,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 00:36:30,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:36:32,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:36:34,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:36:34,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:36:35,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:36:36,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 00:36:39,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:36:39,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:36:39,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:36:39,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:36:40,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 00:36:41,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 00:36:44,664 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.61 vs. limit=15.0 2023-10-04 00:36:46,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:36:49,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:36:51,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1463120.0, ans=0.1 2023-10-04 00:36:57,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 00:36:58,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:00,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 00:37:04,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:37:06,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:37:06,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:37:06,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:37:08,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:37:08,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:11,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:37:11,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:12,337 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 1.974e+02 2.211e+02 2.480e+02 3.454e+02, threshold=4.423e+02, percent-clipped=0.0 2023-10-04 00:37:12,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:37:12,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:37:13,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:37:15,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:37:16,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:37:16,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 00:37:18,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:37:20,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 00:37:21,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 00:37:21,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 00:37:21,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:37:21,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:37:23,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:37:24,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:37:24,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 00:37:28,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:37:29,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.35 vs. limit=15.0 2023-10-04 00:37:30,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:37:30,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:37:32,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 00:37:38,123 INFO [train.py:1046] (3/4) Epoch 42, batch 1700, loss[loss=0.1448, simple_loss=0.2067, pruned_loss=0.04143, over 22844.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2358, pruned_loss=0.03792, over 4736228.11 frames. ], batch size: 322, lr: 2.43e-03, grad_scale: 8.0 2023-10-04 00:37:38,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:37:38,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:37:38,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 00:37:39,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:37:39,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:37:39,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:37:43,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:37:43,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:37:43,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 00:37:45,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:37:51,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:37:54,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:37:55,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1463386.6666666667, ans=0.1 2023-10-04 00:38:00,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:38:00,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1463386.6666666667, ans=0.125 2023-10-04 00:38:01,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:38:01,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:38:01,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:38:04,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 00:38:06,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:38:06,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:06,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1463453.3333333333, ans=0.125 2023-10-04 00:38:07,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:38:07,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:38:10,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 00:38:10,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 00:38:12,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:13,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 00:38:14,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:38:21,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:38:21,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:38:22,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:38:25,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:38:25,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 00:38:25,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:38:28,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:28,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 00:38:29,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:38:29,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:38:29,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:38:29,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:38:34,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:38:34,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:38:35,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:38:37,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:38:37,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:38:41,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:38:42,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 00:38:44,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:38:45,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:38:47,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 00:38:52,290 INFO [train.py:1046] (3/4) Epoch 42, batch 1750, loss[loss=0.1524, simple_loss=0.2244, pruned_loss=0.04019, over 23583.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2351, pruned_loss=0.03768, over 4730247.78 frames. ], batch size: 256, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:38:55,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:38:56,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:38:56,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:38:56,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1463653.3333333333, ans=0.1 2023-10-04 00:38:58,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 00:38:59,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:39:00,086 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.22 vs. limit=22.5 2023-10-04 00:39:02,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:39:02,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:06,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 00:39:08,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:39:09,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 00:39:09,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:39:09,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1463720.0, ans=0.125 2023-10-04 00:39:11,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:39:13,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 00:39:15,719 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.24 vs. limit=22.5 2023-10-04 00:39:16,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 00:39:17,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:39:18,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 00:39:27,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:39:29,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:39:30,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:39:32,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:33,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:39:35,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:39:36,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1463853.3333333333, ans=0.1 2023-10-04 00:39:37,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:39,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:39:41,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:39:42,570 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.890e+02 2.169e+02 2.400e+02 4.108e+02, threshold=4.338e+02, percent-clipped=0.0 2023-10-04 00:39:42,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 00:39:43,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:39:45,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 00:39:46,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:39:48,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:39:48,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1463853.3333333333, ans=0.125 2023-10-04 00:39:49,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:39:53,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:39:53,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 00:39:53,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:39:54,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:39:56,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:39:59,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:39:59,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:40:00,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 00:40:00,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:40:02,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:40:04,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:04,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:40:04,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:40:04,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:40:06,913 INFO [train.py:1046] (3/4) Epoch 42, batch 1800, loss[loss=0.1646, simple_loss=0.2397, pruned_loss=0.04476, over 23774.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2344, pruned_loss=0.03758, over 4732380.91 frames. ], batch size: 179, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:40:07,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:40:07,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1463986.6666666667, ans=0.125 2023-10-04 00:40:08,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:40:10,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:40:12,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:40:14,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 00:40:15,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:40:17,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:40:20,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:20,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:21,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:40:22,126 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:40:25,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:40:25,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 00:40:25,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:40:25,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1464053.3333333333, ans=0.09899494936611666 2023-10-04 00:40:26,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1464053.3333333333, ans=0.5 2023-10-04 00:40:27,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:40:32,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 00:40:33,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 00:40:33,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 00:40:33,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:40:36,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:40:36,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:40:36,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:40:42,885 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 00:40:42,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:40:45,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:40:48,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 00:40:50,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 00:40:50,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:40:51,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:40:53,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:40:55,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 00:40:58,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1464186.6666666667, ans=0.125 2023-10-04 00:41:04,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:41:04,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1464186.6666666667, ans=0.09899494936611666 2023-10-04 00:41:05,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 00:41:05,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1464253.3333333333, ans=0.2 2023-10-04 00:41:06,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:41:06,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:41:08,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:41:08,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 00:41:11,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:41:11,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:41:12,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 00:41:12,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:41:15,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:41:15,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:41:15,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:41:16,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1464253.3333333333, ans=0.1 2023-10-04 00:41:17,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:41:19,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:41:21,128 INFO [train.py:1046] (3/4) Epoch 42, batch 1850, loss[loss=0.1642, simple_loss=0.2492, pruned_loss=0.03962, over 24309.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2351, pruned_loss=0.03765, over 4725333.27 frames. ], batch size: 77, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:41:21,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:41:21,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:41:23,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:41:24,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:41:26,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1464320.0, ans=0.125 2023-10-04 00:41:31,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:41:31,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 00:41:31,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1464320.0, ans=0.1 2023-10-04 00:41:32,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1464320.0, ans=0.125 2023-10-04 00:41:38,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 00:41:40,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 00:41:41,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1464386.6666666667, ans=0.125 2023-10-04 00:41:43,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:41:43,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 00:41:43,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 00:41:50,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1464453.3333333333, ans=0.0 2023-10-04 00:41:53,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:41:55,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 00:41:55,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1464453.3333333333, ans=0.0 2023-10-04 00:41:58,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:41:58,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:41:59,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1464453.3333333333, ans=0.125 2023-10-04 00:42:02,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 00:42:03,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:03,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:42:05,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:42:07,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:42:10,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:42:12,390 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.933e+02 2.158e+02 2.386e+02 3.653e+02, threshold=4.316e+02, percent-clipped=0.0 2023-10-04 00:42:13,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:42:13,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:14,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1464520.0, ans=0.0 2023-10-04 00:42:15,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 00:42:15,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:42:16,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:42:18,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:42:20,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 00:42:22,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:42:25,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:42:26,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:42:26,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 00:42:26,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 00:42:28,248 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 00:42:29,609 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 00:42:30,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:42:30,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:42:31,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:42:32,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:32,441 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 00:42:32,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:42:32,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:33,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:42:35,123 INFO [train.py:1046] (3/4) Epoch 42, batch 1900, loss[loss=0.1438, simple_loss=0.2237, pruned_loss=0.03197, over 23736.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2357, pruned_loss=0.03743, over 4729388.65 frames. ], batch size: 149, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:42:36,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:42:36,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:42:36,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 00:42:36,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1464653.3333333333, ans=0.2 2023-10-04 00:42:37,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:42:37,986 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 00:42:39,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:42:39,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:42:40,559 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.55 vs. limit=15.0 2023-10-04 00:42:45,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:42:47,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1464653.3333333333, ans=0.0 2023-10-04 00:42:48,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:42:48,792 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 00:42:48,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 00:42:50,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:42:51,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:42:51,560 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 00:42:51,603 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 00:42:56,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 00:42:56,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:42:58,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 00:43:00,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 00:43:09,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1464786.6666666667, ans=0.125 2023-10-04 00:43:13,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 00:43:17,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 00:43:17,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:43:17,206 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 00:43:17,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 00:43:17,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 00:43:18,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 00:43:18,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:43:18,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1464853.3333333333, ans=0.09899494936611666 2023-10-04 00:43:22,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 00:43:27,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:43:28,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:43:28,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 00:43:30,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:43:34,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 00:43:35,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:43:40,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:43:40,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:43:40,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:43:40,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1464920.0, ans=0.2 2023-10-04 00:43:41,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:43:43,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 00:43:43,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:43:45,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:43:47,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:43:47,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:43:49,770 INFO [train.py:1046] (3/4) Epoch 42, batch 1950, loss[loss=0.1302, simple_loss=0.2116, pruned_loss=0.02443, over 24594.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2358, pruned_loss=0.03766, over 4725787.42 frames. ], batch size: 60, lr: 2.42e-03, grad_scale: 8.0 2023-10-04 00:43:49,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:43:49,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:43:49,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 00:43:52,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:43:55,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:43:57,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:43:57,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:43:57,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:43:58,163 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.04 vs. limit=22.5 2023-10-04 00:43:58,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 00:44:00,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:44:00,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:02,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:05,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:44:05,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:06,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:06,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:44:09,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:44:09,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:44:09,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 00:44:09,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:14,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:18,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:44:18,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:18,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:44:18,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 00:44:19,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:44:19,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:44:19,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:22,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:25,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:44:28,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:44:31,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:44:31,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:44:31,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 00:44:32,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:44:36,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:44:38,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:44:38,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1465186.6666666667, ans=0.0 2023-10-04 00:44:39,322 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.656e+02 1.957e+02 2.205e+02 2.513e+02 3.289e+02, threshold=4.410e+02, percent-clipped=0.0 2023-10-04 00:44:39,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:44:46,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1465186.6666666667, ans=0.125 2023-10-04 00:44:47,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:49,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:50,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:44:52,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:53,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:44:55,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:44:56,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 00:44:56,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:44:58,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:44:59,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 00:45:01,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:45:03,803 INFO [train.py:1046] (3/4) Epoch 42, batch 2000, loss[loss=0.1505, simple_loss=0.2209, pruned_loss=0.04009, over 23775.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2365, pruned_loss=0.03823, over 4724940.94 frames. ], batch size: 164, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:45:03,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:45:05,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:45:05,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:45:06,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:45:09,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:12,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 00:45:12,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 00:45:17,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:45:20,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 00:45:20,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 00:45:20,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:45:24,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:45:25,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 00:45:25,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:27,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:27,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:28,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 00:45:28,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:45:32,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 00:45:32,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:45:34,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:45:36,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 00:45:36,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:37,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:45:37,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:45:37,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 00:45:42,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 00:45:42,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:45:42,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:45:48,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:48,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:45:48,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:45:50,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:45:51,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:45:51,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:52,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:45:52,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:45:54,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:45:57,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:45:57,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 00:45:57,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1465520.0, ans=0.0 2023-10-04 00:46:01,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:46:04,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:06,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1465586.6666666667, ans=0.09899494936611666 2023-10-04 00:46:07,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:07,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:46:07,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1465586.6666666667, ans=0.2 2023-10-04 00:46:10,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:11,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:46:11,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:13,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:46:14,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:46:16,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:17,972 INFO [train.py:1046] (3/4) Epoch 42, batch 2050, loss[loss=0.151, simple_loss=0.23, pruned_loss=0.03603, over 23591.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2365, pruned_loss=0.03828, over 4726970.40 frames. ], batch size: 149, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:46:18,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:18,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1465653.3333333333, ans=0.07 2023-10-04 00:46:21,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:46:21,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:27,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:46:28,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:46:29,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:46:29,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:46:31,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 00:46:31,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:46:34,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:46:34,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:46:40,816 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.88 vs. limit=15.0 2023-10-04 00:46:41,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1465720.0, ans=0.0 2023-10-04 00:46:44,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:46:44,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:48,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 00:46:49,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:46:51,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 00:46:51,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:46:52,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:46:55,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:46:57,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:46:57,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:46:58,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:47:00,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:47:00,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:47:00,992 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.40 vs. limit=15.0 2023-10-04 00:47:03,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:47:06,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:47:08,812 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.968e+02 2.128e+02 2.407e+02 4.254e+02, threshold=4.257e+02, percent-clipped=0.0 2023-10-04 00:47:08,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:47:09,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1465853.3333333333, ans=0.125 2023-10-04 00:47:10,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:47:13,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:47:17,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:47:17,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 00:47:24,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:47:24,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:47:26,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1465920.0, ans=0.2 2023-10-04 00:47:27,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:47:28,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 00:47:33,507 INFO [train.py:1046] (3/4) Epoch 42, batch 2100, loss[loss=0.1637, simple_loss=0.2506, pruned_loss=0.03839, over 24352.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2351, pruned_loss=0.03801, over 4720380.87 frames. ], batch size: 77, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:47:33,556 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 00:47:33,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:47:33,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:47:34,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:47:36,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:47:36,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 00:47:36,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 00:47:37,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:47:40,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:47:41,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:47:44,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:47:46,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:47:46,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 00:47:46,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 00:47:46,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 00:47:46,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 00:47:48,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:47:48,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:47:48,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 00:47:48,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 00:47:53,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 00:47:53,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:47:58,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:47:58,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:48:01,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:48:03,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 00:48:03,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:03,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 00:48:04,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 00:48:04,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:05,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 00:48:05,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 00:48:05,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 00:48:08,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:48:10,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:48:11,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:48:13,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 00:48:14,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:17,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:17,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 00:48:17,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:18,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:20,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:20,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 00:48:22,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 00:48:22,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 00:48:26,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:48:27,706 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.27 vs. limit=22.5 2023-10-04 00:48:28,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:48:28,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 00:48:34,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:37,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:48:37,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:48:37,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:48:37,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 00:48:38,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:48:39,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:48:39,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:48:39,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:48:40,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1466253.3333333333, ans=0.0 2023-10-04 00:48:41,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:42,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 00:48:44,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 00:48:44,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:48:46,753 INFO [train.py:1046] (3/4) Epoch 42, batch 2150, loss[loss=0.1481, simple_loss=0.2289, pruned_loss=0.03363, over 24495.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2345, pruned_loss=0.03764, over 4728627.50 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:48:46,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:48:46,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:48:46,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:48:46,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:48:54,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 00:48:54,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:48:55,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:48:57,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:48:57,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:48:58,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:49:03,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:49:03,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:49:03,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:49:03,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1466386.6666666667, ans=0.09899494936611666 2023-10-04 00:49:07,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:07,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 00:49:11,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:49:13,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:49:14,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:14,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:49:15,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:15,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:49:15,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:49:15,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:49:17,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:49:17,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 00:49:21,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:49:22,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:49:22,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:49:23,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:49:24,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:49:26,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:49:27,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:49:28,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1466453.3333333333, ans=0.0 2023-10-04 00:49:29,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:49:29,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 00:49:29,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 00:49:32,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:49:34,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:34,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:49:35,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:49:35,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:37,168 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.733e+02 1.971e+02 2.124e+02 2.467e+02 3.717e+02, threshold=4.249e+02, percent-clipped=0.0 2023-10-04 00:49:37,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:37,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 00:49:40,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 00:49:40,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 00:49:40,178 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 00:49:40,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:41,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:49:42,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 00:49:42,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:49:42,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 00:49:42,883 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 00:49:42,884 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 00:49:44,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 00:49:45,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:45,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:49:45,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:49:45,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:46,224 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.43 vs. limit=6.0 2023-10-04 00:49:47,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 00:49:48,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:49:48,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:49:57,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:49:57,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 00:50:00,808 INFO [train.py:1046] (3/4) Epoch 42, batch 2200, loss[loss=0.1541, simple_loss=0.2339, pruned_loss=0.03717, over 23633.00 frames. ], tot_loss[loss=0.155, simple_loss=0.235, pruned_loss=0.0375, over 4723620.21 frames. ], batch size: 149, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:50:02,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:50:07,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:50:07,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:50:09,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:09,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 00:50:09,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1466653.3333333333, ans=0.04949747468305833 2023-10-04 00:50:10,121 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.42 vs. limit=12.0 2023-10-04 00:50:10,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:50:10,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:50:10,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 00:50:12,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1466653.3333333333, ans=0.125 2023-10-04 00:50:17,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 00:50:19,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1466720.0, ans=0.0 2023-10-04 00:50:20,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 00:50:25,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 00:50:27,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:50:28,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:50:28,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 00:50:30,483 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:50:32,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:50:33,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 00:50:37,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 00:50:38,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:50:40,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 00:50:41,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:50:44,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:50:45,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:50:47,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:48,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 00:50:50,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:50:50,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 00:50:52,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:52,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 00:50:52,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:50:56,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:50:56,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:50:56,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:50:56,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:50:58,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1466853.3333333333, ans=0.125 2023-10-04 00:50:59,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:51:01,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:51:01,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.78 vs. limit=15.0 2023-10-04 00:51:02,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 00:51:07,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 00:51:07,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:51:09,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:51:10,739 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.44 vs. limit=15.0 2023-10-04 00:51:11,248 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 00:51:11,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:51:12,711 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 00:51:12,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 00:51:12,840 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 00:51:15,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:51:15,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:51:15,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1466986.6666666667, ans=0.125 2023-10-04 00:51:16,806 INFO [train.py:1046] (3/4) Epoch 42, batch 2250, loss[loss=0.1684, simple_loss=0.2427, pruned_loss=0.04699, over 23594.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2349, pruned_loss=0.03715, over 4729060.23 frames. ], batch size: 256, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:51:18,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:51:19,585 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 00:51:20,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:51:21,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1466986.6666666667, ans=0.0 2023-10-04 00:51:23,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:51:25,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1466986.6666666667, ans=0.125 2023-10-04 00:51:29,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:51:31,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:51:34,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:51:34,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:51:35,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 00:51:38,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 00:51:38,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:51:38,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:51:40,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1467053.3333333333, ans=0.0 2023-10-04 00:51:41,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 00:51:42,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:51:42,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:51:43,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 00:51:48,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:51:48,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 00:51:48,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 00:51:50,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 00:51:51,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:51:53,643 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.51 vs. limit=15.0 2023-10-04 00:51:54,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:51:54,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1467120.0, ans=0.125 2023-10-04 00:52:00,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:52:02,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:52:03,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:52:03,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:52:05,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:52:07,087 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.956e+02 2.105e+02 2.368e+02 2.905e+02, threshold=4.210e+02, percent-clipped=0.0 2023-10-04 00:52:08,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:52:10,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1467186.6666666667, ans=0.125 2023-10-04 00:52:11,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:52:13,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 00:52:17,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:52:17,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 00:52:17,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:52:23,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 00:52:25,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:52:25,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 00:52:25,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:52:25,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 00:52:26,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1467253.3333333333, ans=0.125 2023-10-04 00:52:28,089 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.67 vs. limit=15.0 2023-10-04 00:52:28,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 00:52:30,574 INFO [train.py:1046] (3/4) Epoch 42, batch 2300, loss[loss=0.1512, simple_loss=0.2295, pruned_loss=0.0365, over 23500.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2358, pruned_loss=0.03735, over 4720949.93 frames. ], batch size: 134, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:52:33,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:52:33,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1467320.0, ans=0.125 2023-10-04 00:52:35,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:52:40,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:52:41,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:52:43,197 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 00:52:44,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:52:50,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:52:51,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:52:51,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:52:51,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:52:51,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 00:52:52,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:52:54,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:52:54,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:52:57,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 00:52:59,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:53:03,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:53:04,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1467453.3333333333, ans=0.125 2023-10-04 00:53:07,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:53:07,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:53:11,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:53:15,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:53:17,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:53:19,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:53:20,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:53:20,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 00:53:23,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 00:53:23,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:53:24,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:53:24,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:53:26,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:53:27,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 00:53:27,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 00:53:27,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 00:53:27,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:53:27,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:53:28,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 00:53:35,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:53:38,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:53:42,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:53:42,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:53:42,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 00:53:44,432 INFO [train.py:1046] (3/4) Epoch 42, batch 2350, loss[loss=0.1503, simple_loss=0.2371, pruned_loss=0.03179, over 24643.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.237, pruned_loss=0.03793, over 4715700.64 frames. ], batch size: 65, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:53:44,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 00:53:44,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:53:44,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 00:53:45,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 00:53:46,111 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:53:52,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:53:52,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 00:53:57,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 00:54:00,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:54:03,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:54:03,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:54:03,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:54:03,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:54:04,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 00:54:08,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:54:14,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1467786.6666666667, ans=0.2 2023-10-04 00:54:16,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 00:54:16,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:54:19,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:54:19,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:54:20,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 00:54:21,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 00:54:23,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 00:54:23,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:54:24,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:54:24,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:54:27,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 00:54:28,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1467853.3333333333, ans=0.1 2023-10-04 00:54:31,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 00:54:31,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:54:31,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1467853.3333333333, ans=0.125 2023-10-04 00:54:33,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:54:33,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:54:34,606 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 2.014e+02 2.206e+02 2.508e+02 4.663e+02, threshold=4.412e+02, percent-clipped=1.0 2023-10-04 00:54:36,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 00:54:37,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:54:40,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 00:54:40,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 00:54:43,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 00:54:47,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 00:54:48,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:54:48,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 00:54:48,702 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 00:54:48,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 00:54:51,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 00:54:54,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:54:55,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.26 vs. limit=22.5 2023-10-04 00:54:57,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:54:58,382 INFO [train.py:1046] (3/4) Epoch 42, batch 2400, loss[loss=0.1574, simple_loss=0.2392, pruned_loss=0.03785, over 23680.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2364, pruned_loss=0.03779, over 4708857.78 frames. ], batch size: 149, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 00:55:00,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1467986.6666666667, ans=0.1 2023-10-04 00:55:01,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:55:03,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:55:04,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 00:55:04,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 00:55:06,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1467986.6666666667, ans=0.125 2023-10-04 00:55:11,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 00:55:11,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:55:12,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 00:55:12,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:55:14,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:55:14,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 00:55:14,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1468053.3333333333, ans=0.125 2023-10-04 00:55:18,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:55:20,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 00:55:20,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1468053.3333333333, ans=0.0 2023-10-04 00:55:24,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:55:24,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1468053.3333333333, ans=0.125 2023-10-04 00:55:30,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 00:55:32,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:55:33,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1468120.0, ans=0.1 2023-10-04 00:55:34,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:55:38,903 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.32 vs. limit=6.0 2023-10-04 00:55:39,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:55:39,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 00:55:39,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 00:55:47,147 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.66 vs. limit=12.0 2023-10-04 00:55:48,770 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:55:49,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:55:51,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:55:54,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:55:55,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 00:55:55,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 00:55:55,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:55:55,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:55:55,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:55:55,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 00:55:57,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1468253.3333333333, ans=0.0 2023-10-04 00:56:01,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:56:01,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 00:56:02,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 00:56:04,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 00:56:05,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:56:05,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:56:06,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 00:56:07,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 00:56:07,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 00:56:07,755 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 00:56:07,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 00:56:09,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:56:11,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:56:11,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:56:12,386 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 00:56:12,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:56:13,723 INFO [train.py:1046] (3/4) Epoch 42, batch 2450, loss[loss=0.1549, simple_loss=0.2248, pruned_loss=0.04254, over 23974.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.234, pruned_loss=0.03742, over 4697828.62 frames. ], batch size: 196, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 00:56:13,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 00:56:16,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 00:56:16,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:56:21,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:21,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:56:22,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 00:56:28,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:56:28,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:30,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 00:56:30,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:56:32,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 00:56:32,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 00:56:35,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:38,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 00:56:38,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:56:41,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 00:56:42,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:56:44,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:56:44,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:56:45,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 00:56:47,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 00:56:53,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1468453.3333333333, ans=0.0 2023-10-04 00:56:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:56:55,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:56:55,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:56:57,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 00:56:57,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:56:57,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:56:58,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 00:57:02,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 00:57:02,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1468520.0, ans=0.1 2023-10-04 00:57:03,208 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 2.034e+02 2.290e+02 2.639e+02 4.932e+02, threshold=4.579e+02, percent-clipped=1.0 2023-10-04 00:57:03,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 00:57:06,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1468520.0, ans=0.0 2023-10-04 00:57:06,884 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.47 vs. limit=15.0 2023-10-04 00:57:07,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:57:07,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:57:12,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 00:57:12,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 00:57:12,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1468586.6666666667, ans=0.125 2023-10-04 00:57:14,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:57:15,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:57:15,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 00:57:15,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:57:15,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 00:57:19,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 00:57:20,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:57:21,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 00:57:25,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 00:57:26,968 INFO [train.py:1046] (3/4) Epoch 42, batch 2500, loss[loss=0.1479, simple_loss=0.2213, pruned_loss=0.03727, over 23385.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2333, pruned_loss=0.03719, over 4683613.40 frames. ], batch size: 285, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 00:57:27,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 00:57:32,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:57:32,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1468653.3333333333, ans=0.0 2023-10-04 00:57:37,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1468653.3333333333, ans=0.1 2023-10-04 00:57:42,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 00:57:43,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:57:45,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:57:45,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 00:57:48,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1468720.0, ans=0.125 2023-10-04 00:57:51,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 00:57:51,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:57:52,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 00:57:52,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 00:57:53,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 00:57:53,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:57:55,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:57:55,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 00:57:56,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:57:56,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 00:57:56,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:01,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 00:58:02,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:58:04,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 00:58:04,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 00:58:04,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1468786.6666666667, ans=0.125 2023-10-04 00:58:05,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:58:07,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:58:10,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:13,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1468853.3333333333, ans=0.0 2023-10-04 00:58:15,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:18,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:58:21,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1468853.3333333333, ans=0.125 2023-10-04 00:58:24,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 00:58:25,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 00:58:27,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:58:27,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:58:28,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 00:58:28,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 00:58:29,135 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 00:58:30,274 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 00:58:30,275 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 00:58:30,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 00:58:34,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:58:36,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 00:58:36,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 00:58:36,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 00:58:37,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 00:58:40,300 INFO [train.py:1046] (3/4) Epoch 42, batch 2550, loss[loss=0.1576, simple_loss=0.2339, pruned_loss=0.04061, over 23715.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2344, pruned_loss=0.03728, over 4691160.76 frames. ], batch size: 232, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:58:40,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 00:58:43,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:58:46,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:58:46,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 00:58:49,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 00:58:49,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 00:58:51,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 00:58:51,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1468986.6666666667, ans=0.125 2023-10-04 00:58:54,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 00:58:55,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 00:58:56,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:58:58,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:58:58,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 00:58:58,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:58:59,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:59:00,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:59:03,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 00:59:03,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 00:59:03,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 00:59:04,013 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.35 vs. limit=10.0 2023-10-04 00:59:04,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:04,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 00:59:12,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1469120.0, ans=0.125 2023-10-04 00:59:16,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 00:59:19,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:59:19,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:19,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 00:59:20,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 00:59:23,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1469186.6666666667, ans=0.0 2023-10-04 00:59:27,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 00:59:31,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 00:59:31,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 00:59:33,036 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 1.986e+02 2.220e+02 2.495e+02 3.870e+02, threshold=4.440e+02, percent-clipped=0.0 2023-10-04 00:59:33,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 00:59:33,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 00:59:33,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 00:59:35,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 00:59:36,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:41,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 00:59:41,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 00:59:41,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 00:59:41,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 00:59:43,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 00:59:46,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 00:59:48,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:59:53,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 00:59:54,965 INFO [train.py:1046] (3/4) Epoch 42, batch 2600, loss[loss=0.1993, simple_loss=0.2742, pruned_loss=0.06215, over 19610.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2351, pruned_loss=0.0374, over 4682042.71 frames. ], batch size: 388, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 00:59:55,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1469320.0, ans=0.125 2023-10-04 00:59:56,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 00:59:58,118 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 01:00:00,887 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 01:00:00,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:00:00,940 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 01:00:01,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1469320.0, ans=0.0 2023-10-04 01:00:02,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 01:00:02,806 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 01:00:05,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:00:05,966 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 01:00:07,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 01:00:08,769 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 01:00:10,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:00:11,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 01:00:12,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 01:00:14,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 01:00:16,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 01:00:17,807 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 01:00:19,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 01:00:26,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:00:26,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:00:26,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:00:26,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 01:00:28,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:00:34,416 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 01:00:37,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:00:39,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:00:39,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 01:00:41,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:00:41,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:00:41,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 01:00:41,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1469520.0, ans=0.2 2023-10-04 01:00:44,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:00:44,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:00:46,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:00:48,783 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 01:00:48,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:00:48,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:00:54,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:00:54,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:00:54,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 01:00:57,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:00:58,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:00:59,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:01:06,471 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.59 vs. limit=22.5 2023-10-04 01:01:07,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 01:01:07,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:01:08,418 INFO [train.py:1046] (3/4) Epoch 42, batch 2650, loss[loss=0.1756, simple_loss=0.2603, pruned_loss=0.04546, over 23383.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2368, pruned_loss=0.03844, over 4685026.35 frames. ], batch size: 93, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:01:09,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:01:13,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 01:01:13,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:01:15,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:01:17,067 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 01:01:17,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:01:18,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:01:19,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:01:21,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:01:22,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:01:25,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 01:01:25,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:01:25,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:01:29,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 01:01:30,842 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 01:01:33,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:01:35,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 01:01:36,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:01:36,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 01:01:42,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:01:42,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:01:42,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:01:42,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:01:47,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 01:01:47,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 01:01:49,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:01:53,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 01:01:54,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:01:54,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:01:54,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:01:56,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:01:56,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:01:57,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:01:58,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:01:58,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:02:00,144 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.901e+02 2.089e+02 2.276e+02 3.340e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-04 01:02:00,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:02:00,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:02:01,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:03,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:02:04,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:05,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:02:07,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:02:09,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1469920.0, ans=0.0 2023-10-04 01:02:10,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:12,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:02:12,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:12,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 01:02:16,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:02:16,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1469920.0, ans=0.125 2023-10-04 01:02:18,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:20,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:21,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:21,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:02:22,897 INFO [train.py:1046] (3/4) Epoch 42, batch 2700, loss[loss=0.1613, simple_loss=0.2335, pruned_loss=0.0446, over 23758.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2373, pruned_loss=0.03875, over 4682214.60 frames. ], batch size: 212, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:02:22,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:24,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:02:24,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 01:02:24,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1469986.6666666667, ans=0.125 2023-10-04 01:02:27,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:02:28,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 01:02:29,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:02:30,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:30,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:02:31,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:02:31,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:02:31,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:02:31,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1469986.6666666667, ans=0.1 2023-10-04 01:02:32,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 01:02:32,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 01:02:34,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:02:35,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:02:35,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:02:36,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:02:40,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:02:40,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1470053.3333333333, ans=0.0 2023-10-04 01:02:41,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 01:02:42,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:02:46,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:02:46,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:02:46,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1470053.3333333333, ans=0.1 2023-10-04 01:02:53,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:02:53,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:02:53,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:02:53,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:02:54,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1470120.0, ans=0.125 2023-10-04 01:02:55,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1470120.0, ans=0.125 2023-10-04 01:02:56,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:02:59,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:02:59,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:02:59,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:03:02,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:03:02,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:03:09,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:03:11,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:03:14,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:03:14,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:17,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:03:19,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:03:19,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:03:19,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1470186.6666666667, ans=0.125 2023-10-04 01:03:20,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:20,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:03:22,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:03:23,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1470253.3333333333, ans=0.125 2023-10-04 01:03:23,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1470253.3333333333, ans=0.0 2023-10-04 01:03:24,412 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.71 vs. limit=6.0 2023-10-04 01:03:24,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:03:26,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:03:26,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:03:27,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 01:03:29,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:29,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1470253.3333333333, ans=0.0 2023-10-04 01:03:31,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:03:31,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 01:03:33,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 01:03:33,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1470253.3333333333, ans=0.0 2023-10-04 01:03:34,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:36,329 INFO [train.py:1046] (3/4) Epoch 42, batch 2750, loss[loss=0.1603, simple_loss=0.2483, pruned_loss=0.03618, over 24323.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2377, pruned_loss=0.03853, over 4694164.02 frames. ], batch size: 74, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:03:37,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:03:37,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:03:40,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:40,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:03:40,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:43,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:03:43,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 01:03:43,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:03:43,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:43,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 01:03:45,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:03:45,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:03:50,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 01:03:50,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1470386.6666666667, ans=0.1 2023-10-04 01:03:50,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1470386.6666666667, ans=0.1 2023-10-04 01:03:52,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:03:52,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:03:54,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:03:54,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:03:55,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:03:57,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:03:57,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:03:57,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1470386.6666666667, ans=0.0 2023-10-04 01:03:58,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:04:01,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:04:01,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:04:03,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:04:03,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:04:03,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:04:04,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1470453.3333333333, ans=0.0 2023-10-04 01:04:05,613 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.91 vs. limit=10.0 2023-10-04 01:04:10,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:04:13,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:04:13,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:20,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:04:20,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:04:21,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:04:27,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:04:27,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:04:27,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 01:04:28,470 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 1.995e+02 2.181e+02 2.407e+02 3.708e+02, threshold=4.362e+02, percent-clipped=0.0 2023-10-04 01:04:30,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1470520.0, ans=0.125 2023-10-04 01:04:32,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:34,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 01:04:38,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 01:04:41,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:04:41,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 01:04:42,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:04:44,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:04:44,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 01:04:44,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:04:49,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 01:04:50,535 INFO [train.py:1046] (3/4) Epoch 42, batch 2800, loss[loss=0.1584, simple_loss=0.2446, pruned_loss=0.03607, over 24650.00 frames. ], tot_loss[loss=0.1569, simple_loss=0.2368, pruned_loss=0.03848, over 4695595.89 frames. ], batch size: 65, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 01:04:50,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:04:50,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:04:51,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.31 vs. limit=15.0 2023-10-04 01:04:51,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 01:04:51,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:04:51,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:52,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1470653.3333333333, ans=0.1 2023-10-04 01:04:54,552 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.66 vs. limit=15.0 2023-10-04 01:04:55,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:04:55,456 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 01:04:55,457 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 01:04:55,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1470653.3333333333, ans=0.035 2023-10-04 01:04:58,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:04:59,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:04:59,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:05:03,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:05:05,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 01:05:07,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 01:05:08,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 01:05:08,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:05:10,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:05:10,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:05:15,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:05:15,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:05:15,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 01:05:16,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:05:19,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1470786.6666666667, ans=0.125 2023-10-04 01:05:23,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:05:24,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:05:27,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:05:29,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:05:30,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:05:33,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:05:34,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 01:05:34,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:05:34,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:05:34,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:05:39,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:05:40,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:05:40,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1470853.3333333333, ans=0.0 2023-10-04 01:05:43,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:05:46,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:05:46,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:05:46,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:05:46,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:05:48,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:05:48,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:05:48,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 01:05:49,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:05:49,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:05:50,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:05:52,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 01:05:53,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:05:53,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:05:53,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:05:55,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 01:06:02,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:06:02,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:06:02,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:06:04,307 INFO [train.py:1046] (3/4) Epoch 42, batch 2850, loss[loss=0.1524, simple_loss=0.2222, pruned_loss=0.04127, over 23662.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2362, pruned_loss=0.03828, over 4702994.38 frames. ], batch size: 256, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 01:06:04,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:07,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:06:07,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:06:07,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:06:09,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:06:11,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:06:12,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:06:12,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 01:06:19,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 01:06:19,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:20,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 01:06:22,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:23,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 01:06:24,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 01:06:25,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:37,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:06:38,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:06:38,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:06:38,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:06:38,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:06:38,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:06:40,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:06:41,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 01:06:44,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:06:44,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:06:45,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:06:46,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:46,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1471120.0, ans=0.0 2023-10-04 01:06:48,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:06:48,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:06:50,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:52,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:06:53,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:06:53,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:06:54,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:06:56,319 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.878e+02 2.070e+02 2.200e+02 2.793e+02, threshold=4.141e+02, percent-clipped=0.0 2023-10-04 01:06:56,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:07:01,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:07:03,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 01:07:03,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 01:07:05,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:07:05,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:07:05,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 01:07:05,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1471253.3333333333, ans=0.125 2023-10-04 01:07:07,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:07:07,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:07:08,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:07:08,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:07:08,436 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 01:07:08,468 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 01:07:08,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:07:08,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:07:12,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1471253.3333333333, ans=0.1 2023-10-04 01:07:14,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1471253.3333333333, ans=0.0 2023-10-04 01:07:15,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 01:07:15,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:07:17,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:07:19,098 INFO [train.py:1046] (3/4) Epoch 42, batch 2900, loss[loss=0.1456, simple_loss=0.2232, pruned_loss=0.03398, over 24607.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2359, pruned_loss=0.03791, over 4709102.78 frames. ], batch size: 60, lr: 2.42e-03, grad_scale: 32.0 2023-10-04 01:07:19,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 01:07:23,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:07:23,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 01:07:23,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1471320.0, ans=0.125 2023-10-04 01:07:24,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 01:07:26,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:07:26,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:07:28,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:07:30,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:07:32,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:07:34,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:07:36,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:07:38,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 01:07:39,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:07:39,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:07:41,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1471386.6666666667, ans=0.1 2023-10-04 01:07:42,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 01:07:42,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 01:07:45,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:07:45,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 01:07:45,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:07:45,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1471386.6666666667, ans=0.2 2023-10-04 01:07:47,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:07:47,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 01:07:50,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:07:50,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:07:53,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:07:54,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:07:54,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.45 vs. limit=15.0 2023-10-04 01:07:57,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 01:07:57,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 01:07:57,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:08:00,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:08:03,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 01:08:03,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:08:08,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:08:13,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1471520.0, ans=0.125 2023-10-04 01:08:16,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:08:17,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:08:17,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1471586.6666666667, ans=0.125 2023-10-04 01:08:19,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 01:08:22,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:22,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 01:08:22,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:08:22,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:08:31,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:08:32,943 INFO [train.py:1046] (3/4) Epoch 42, batch 2950, loss[loss=0.1588, simple_loss=0.2412, pruned_loss=0.03821, over 24448.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2369, pruned_loss=0.03807, over 4712587.13 frames. ], batch size: 63, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:08:33,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 01:08:34,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:08:34,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:35,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:08:37,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:08:37,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 01:08:39,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 01:08:39,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:08:39,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:08:43,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:08:45,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:08:45,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1471653.3333333333, ans=0.2 2023-10-04 01:08:46,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:08:48,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:08:51,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:08:51,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:08:54,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:54,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:08:54,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:08:55,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 01:08:58,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1471720.0, ans=0.1 2023-10-04 01:09:00,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 01:09:00,228 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 01:09:01,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:09:04,717 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 01:09:04,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 01:09:04,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:09:05,395 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.38 vs. limit=15.0 2023-10-04 01:09:06,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:09:06,080 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 01:09:06,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:09:10,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 01:09:10,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:09:12,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:09:15,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:09:17,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1471853.3333333333, ans=0.0 2023-10-04 01:09:18,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:09:18,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:09:19,699 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 01:09:19,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:09:19,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 01:09:24,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:09:26,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:09:26,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 01:09:26,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:09:27,394 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.891e+02 2.131e+02 2.333e+02 3.581e+02, threshold=4.262e+02, percent-clipped=0.0 2023-10-04 01:09:28,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 01:09:31,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:09:34,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:09:34,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:09:35,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:09:35,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 01:09:36,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1471920.0, ans=0.04949747468305833 2023-10-04 01:09:37,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:09:39,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:09:39,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:09:39,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:09:39,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:09:40,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:09:41,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:09:41,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 01:09:43,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:09:45,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:09:46,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:09:47,912 INFO [train.py:1046] (3/4) Epoch 42, batch 3000, loss[loss=0.1679, simple_loss=0.2339, pruned_loss=0.05099, over 23747.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2373, pruned_loss=0.03843, over 4712258.61 frames. ], batch size: 164, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:09:47,913 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 01:09:53,490 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.4213, 2.6715, 3.5353, 2.4003], device='cuda:3') 2023-10-04 01:09:59,538 INFO [train.py:1078] (3/4) Epoch 42, validation: loss=0.3457, simple_loss=0.2797, pruned_loss=0.2058, over 1125622.00 frames. 2023-10-04 01:09:59,538 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-04 01:09:59,715 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 01:10:01,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 01:10:04,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:10:05,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:10:05,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 01:10:07,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:10:13,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:10:14,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1472053.3333333333, ans=0.1 2023-10-04 01:10:14,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1472053.3333333333, ans=0.125 2023-10-04 01:10:22,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:10:27,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 01:10:28,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:10:31,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:10:32,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:10:32,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:10:35,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:10:35,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 01:10:38,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 01:10:40,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:10:40,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:10:42,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:10:42,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1472120.0, ans=0.125 2023-10-04 01:10:43,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:10:43,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:10:43,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:10:43,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1472186.6666666667, ans=0.125 2023-10-04 01:10:47,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:10:48,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:10:48,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:10:50,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:10:52,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 01:10:53,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:10:53,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:10:53,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:10:56,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:10:56,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:10:58,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 01:10:59,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 01:10:59,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:10:59,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 01:11:00,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:11:02,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 01:11:05,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:11:05,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:11:05,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 01:11:07,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 01:11:07,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 01:11:08,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:11:09,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:11:09,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:11:09,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:11,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:11:12,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 01:11:14,091 INFO [train.py:1046] (3/4) Epoch 42, batch 3050, loss[loss=0.1506, simple_loss=0.2251, pruned_loss=0.03804, over 24501.00 frames. ], tot_loss[loss=0.1578, simple_loss=0.2377, pruned_loss=0.03896, over 4716813.50 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:11:15,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:11:18,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:11:18,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:11:18,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1472320.0, ans=0.125 2023-10-04 01:11:23,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:26,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 01:11:31,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 01:11:31,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 01:11:31,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:11:33,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:11:37,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:37,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:11:38,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:11:39,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:11:40,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:11:40,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:11:41,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:11:41,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:11:43,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:45,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:11:49,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:11:49,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 01:11:49,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:11:51,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:11:54,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:11:55,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:11:55,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:11:56,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:11:58,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1472520.0, ans=0.0 2023-10-04 01:12:02,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:12:02,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:06,777 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.976e+02 2.144e+02 2.378e+02 3.256e+02, threshold=4.288e+02, percent-clipped=0.0 2023-10-04 01:12:06,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:08,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:12:08,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:12:10,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:12:10,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:12:11,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:12:12,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 01:12:12,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:12:14,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:14,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 01:12:16,103 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.89 vs. limit=22.5 2023-10-04 01:12:16,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:17,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1472586.6666666667, ans=0.125 2023-10-04 01:12:23,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:12:25,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:12:27,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:12:28,410 INFO [train.py:1046] (3/4) Epoch 42, batch 3100, loss[loss=0.16, simple_loss=0.2499, pruned_loss=0.03506, over 24562.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.2376, pruned_loss=0.03889, over 4711685.35 frames. ], batch size: 71, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:12:28,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 01:12:28,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1472653.3333333333, ans=0.09899494936611666 2023-10-04 01:12:31,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 01:12:32,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 01:12:34,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:12:35,075 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.22 vs. limit=15.0 2023-10-04 01:12:38,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:12:39,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:42,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 01:12:45,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:12:48,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1472720.0, ans=0.09899494936611666 2023-10-04 01:12:51,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 01:12:54,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:12:56,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:12:57,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:12:57,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:12:57,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 01:13:00,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:13:00,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 01:13:00,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:13:01,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:13:01,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 01:13:04,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:13:08,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:13:08,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 01:13:09,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 01:13:10,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:11,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:13:14,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:14,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:14,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:13:14,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1472853.3333333333, ans=0.125 2023-10-04 01:13:16,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:13:16,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:13:17,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:13:18,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:13:18,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:18,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:13:21,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:13:24,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 01:13:26,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:13:26,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 01:13:28,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:28,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:28,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 01:13:39,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 01:13:40,624 INFO [train.py:1046] (3/4) Epoch 42, batch 3150, loss[loss=0.1544, simple_loss=0.2461, pruned_loss=0.0313, over 24462.00 frames. ], tot_loss[loss=0.157, simple_loss=0.237, pruned_loss=0.03848, over 4713594.67 frames. ], batch size: 69, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:13:40,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:13:42,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:13:43,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:13:43,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:13:45,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 01:13:46,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:13:46,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 01:13:46,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1472986.6666666667, ans=0.125 2023-10-04 01:13:48,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 01:13:49,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:51,033 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 01:13:53,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 01:13:53,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:13:53,917 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 01:13:55,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 01:13:57,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 01:13:58,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 01:13:58,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 01:13:58,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:13:58,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:13:59,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:14:01,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 01:14:02,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:14:03,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:14:03,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:14:06,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 01:14:08,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1473120.0, ans=0.125 2023-10-04 01:14:10,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 01:14:11,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:14:12,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:14:14,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:14:14,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 01:14:18,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 01:14:18,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:14:19,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:14:19,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:14:19,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:14:19,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:14:21,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:14:21,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:14:22,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 01:14:22,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:14:22,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:24,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:14:24,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:14:24,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 01:14:26,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:14:28,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 01:14:28,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:29,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 01:14:30,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 01:14:32,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:14:32,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:14:33,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 01:14:34,960 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.982e+02 2.210e+02 2.426e+02 4.214e+02, threshold=4.421e+02, percent-clipped=0.0 2023-10-04 01:14:36,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 01:14:36,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:14:39,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:14:41,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:41,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:14:48,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:14:48,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:14:49,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 01:14:53,768 INFO [train.py:1046] (3/4) Epoch 42, batch 3200, loss[loss=0.1584, simple_loss=0.2281, pruned_loss=0.04433, over 20374.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2357, pruned_loss=0.03807, over 4711619.23 frames. ], batch size: 44, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:14:55,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:14:55,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 01:15:00,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:15:02,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:15:02,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 01:15:04,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:15:08,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:15:13,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:15:20,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:15:24,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1473453.3333333333, ans=0.125 2023-10-04 01:15:30,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 01:15:30,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:15:34,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 01:15:34,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:15:37,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:15:37,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:15:38,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:15:41,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 01:15:43,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 01:15:45,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 01:15:48,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 01:15:51,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:15:51,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1473586.6666666667, ans=0.1 2023-10-04 01:15:57,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:15:57,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:15:57,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:15:58,580 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 01:15:58,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:16:02,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:16:05,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 01:16:05,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 01:16:05,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1473586.6666666667, ans=0.125 2023-10-04 01:16:06,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 01:16:06,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 01:16:07,916 INFO [train.py:1046] (3/4) Epoch 42, batch 3250, loss[loss=0.155, simple_loss=0.2268, pruned_loss=0.04157, over 23657.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2354, pruned_loss=0.03776, over 4715929.27 frames. ], batch size: 232, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:16:08,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:16:10,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:16:10,065 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 01:16:10,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:16:11,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:12,762 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 01:16:16,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:16:18,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:16:26,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:16:26,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 01:16:26,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:16:27,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:16:27,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:16:29,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:16:29,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:16:33,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1473720.0, ans=0.125 2023-10-04 01:16:34,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:34,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:16:34,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:16:34,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:34,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:36,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:16:37,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:16:37,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:16:41,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:16:41,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:16:42,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:16:42,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:16:42,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:16:43,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1473786.6666666667, ans=0.0 2023-10-04 01:16:44,604 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.85 vs. limit=15.0 2023-10-04 01:16:45,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.99 vs. limit=15.0 2023-10-04 01:16:46,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 01:16:47,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:16:47,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:16:50,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:16:50,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:16:50,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1473853.3333333333, ans=0.0 2023-10-04 01:16:55,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:17:02,401 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 1.958e+02 2.109e+02 2.405e+02 3.560e+02, threshold=4.218e+02, percent-clipped=0.0 2023-10-04 01:17:02,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:17:02,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:02,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 01:17:02,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:17:02,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 01:17:03,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:06,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 01:17:06,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 01:17:06,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:17:08,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:17:08,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:17:09,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 01:17:09,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1473920.0, ans=0.2 2023-10-04 01:17:10,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:17:12,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1473920.0, ans=0.125 2023-10-04 01:17:12,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1473920.0, ans=0.025 2023-10-04 01:17:13,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:17:13,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:17:15,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 01:17:15,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:17:18,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:17:18,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 01:17:21,313 INFO [train.py:1046] (3/4) Epoch 42, batch 3300, loss[loss=0.1437, simple_loss=0.2205, pruned_loss=0.03341, over 24444.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2358, pruned_loss=0.03824, over 4714858.10 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:17:22,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:17:22,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 01:17:22,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 01:17:24,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 01:17:24,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:17:27,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:17:27,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1473986.6666666667, ans=0.1 2023-10-04 01:17:28,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:17:30,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:32,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 01:17:32,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:17:36,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:17:37,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:17:42,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 01:17:43,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:17:43,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:17:45,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:45,197 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 01:17:45,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1474053.3333333333, ans=0.125 2023-10-04 01:17:46,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:17:46,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:17:48,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:17:48,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:17:48,073 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 01:17:50,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:17:50,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:17:52,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1474120.0, ans=0.2 2023-10-04 01:17:53,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:53,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 01:17:55,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 01:17:55,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:17:56,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:17:57,935 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 01:17:59,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 01:18:00,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:18:00,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1474120.0, ans=0.125 2023-10-04 01:18:02,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 01:18:05,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:18:08,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:18:08,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:18:13,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:13,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:18:13,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:18:13,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:18:16,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:18:16,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:18:17,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:18:18,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1474186.6666666667, ans=0.07 2023-10-04 01:18:19,192 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 01:18:20,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 01:18:23,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:18:23,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:18:23,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:18:24,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:18:24,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:18:26,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:18:26,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:18:26,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:18:27,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:18:29,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:18:30,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 01:18:30,831 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:18:31,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:18:32,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:18:34,491 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.59 vs. limit=22.5 2023-10-04 01:18:35,198 INFO [train.py:1046] (3/4) Epoch 42, batch 3350, loss[loss=0.1647, simple_loss=0.2547, pruned_loss=0.03739, over 24445.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2371, pruned_loss=0.03821, over 4729619.00 frames. ], batch size: 69, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:18:36,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:18:37,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:18:38,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:39,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:18:39,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:18:41,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:18:42,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:18:44,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:18:47,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:18:47,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:18:50,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:50,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:18:51,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 01:18:52,985 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 01:18:54,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:18:56,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 01:18:57,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 01:18:58,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:18:59,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:19:01,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:01,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 01:19:01,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:01,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:19:03,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:04,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:04,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:05,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:19:08,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1474453.3333333333, ans=0.0 2023-10-04 01:19:09,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:10,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:10,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:14,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:19:16,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:19:19,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:19,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:21,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:23,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 01:19:23,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:19:23,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 01:19:25,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:19:25,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 01:19:26,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:28,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:19:29,256 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.949e+02 2.124e+02 2.464e+02 3.729e+02, threshold=4.249e+02, percent-clipped=0.0 2023-10-04 01:19:35,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:35,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 01:19:36,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:19:38,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:19:38,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:19:44,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:19:45,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 01:19:45,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:19:47,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:19:49,111 INFO [train.py:1046] (3/4) Epoch 42, batch 3400, loss[loss=0.1439, simple_loss=0.2238, pruned_loss=0.03198, over 24602.00 frames. ], tot_loss[loss=0.157, simple_loss=0.2376, pruned_loss=0.03826, over 4727294.20 frames. ], batch size: 60, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:19:49,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:19:49,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1474653.3333333333, ans=0.0 2023-10-04 01:19:50,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 01:19:50,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:19:50,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 01:19:52,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:19:53,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:19:55,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:19:56,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:19:56,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 01:20:00,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 01:20:00,761 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 01:20:02,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:20:06,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:20:06,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:20:06,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:20:06,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:20:11,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:20:14,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 01:20:17,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:20:18,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:20:20,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:20:22,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 01:20:26,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:20:29,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 01:20:36,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:20:38,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:20:38,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 01:20:39,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:20:39,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:20:39,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:20:41,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:20:44,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:20:45,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1474853.3333333333, ans=0.125 2023-10-04 01:20:47,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:20:47,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:20:50,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:20:53,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.21 vs. limit=22.5 2023-10-04 01:20:53,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 01:21:00,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:21:03,414 INFO [train.py:1046] (3/4) Epoch 42, batch 3450, loss[loss=0.1591, simple_loss=0.2299, pruned_loss=0.04413, over 23852.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2368, pruned_loss=0.03795, over 4733526.41 frames. ], batch size: 195, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:21:03,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 01:21:08,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 01:21:08,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:21:09,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:21:09,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 01:21:10,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:21:13,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:21:20,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:21:20,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:21:22,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:21:22,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:21:23,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:21:29,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 01:21:29,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1475053.3333333333, ans=0.2 2023-10-04 01:21:31,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.42 vs. limit=6.0 2023-10-04 01:21:33,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 01:21:33,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:21:35,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:21:36,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:21:40,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 01:21:42,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:21:46,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:21:46,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:21:49,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:21:51,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:21:51,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 01:21:51,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:21:53,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:21:56,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:21:58,676 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.992e+02 2.174e+02 2.512e+02 3.685e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-04 01:21:58,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 01:22:00,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:22:06,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:22:07,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:09,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:22:10,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1475253.3333333333, ans=0.2 2023-10-04 01:22:13,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:13,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:22:15,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:22:16,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:22:17,646 INFO [train.py:1046] (3/4) Epoch 42, batch 3500, loss[loss=0.1393, simple_loss=0.1911, pruned_loss=0.04375, over 19472.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2353, pruned_loss=0.03768, over 4724612.67 frames. ], batch size: 388, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:22:21,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:22:22,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:22:24,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 01:22:24,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=1475320.0, ans=0.05 2023-10-04 01:22:26,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:22:28,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 01:22:29,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:22:29,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 01:22:33,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:22:35,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:22:37,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:22:37,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:22:37,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:22:38,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:38,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:22:38,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 01:22:40,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:41,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 01:22:44,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:22:48,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:49,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 01:22:49,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:22:52,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:22:52,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1475453.3333333333, ans=0.125 2023-10-04 01:22:53,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:22:55,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:22:56,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:22:56,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:22:59,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 01:23:00,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 01:23:00,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 01:23:00,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:23:02,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:23:02,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:23:03,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:23:05,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 01:23:06,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:23:11,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:23:12,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 01:23:12,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 01:23:12,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:23:14,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:23:15,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:23:17,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:23:19,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 01:23:20,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:23:23,045 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.24 vs. limit=15.0 2023-10-04 01:23:23,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:23:24,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 01:23:25,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1475586.6666666667, ans=0.1 2023-10-04 01:23:26,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 01:23:27,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:23:29,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:23:29,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:23:29,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:23:31,969 INFO [train.py:1046] (3/4) Epoch 42, batch 3550, loss[loss=0.1487, simple_loss=0.231, pruned_loss=0.03319, over 24450.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2342, pruned_loss=0.03717, over 4715343.58 frames. ], batch size: 58, lr: 2.42e-03, grad_scale: 16.0 2023-10-04 01:23:34,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:23:37,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1475653.3333333333, ans=0.0 2023-10-04 01:23:39,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:23:41,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 01:23:44,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:23:45,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:23:45,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:23:47,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:23:47,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:23:52,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:23:52,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:23:53,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:23:53,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 01:23:55,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:24:02,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:24:02,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:24:04,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:24:04,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:24:04,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:24:04,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 01:24:04,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:24:08,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:24:08,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 01:24:13,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:24:15,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:24:15,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:24:16,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 01:24:17,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:24:19,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 01:24:21,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:24:22,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:24:22,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:24:25,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 01:24:27,020 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.930e+02 2.101e+02 2.469e+02 3.261e+02, threshold=4.203e+02, percent-clipped=0.0 2023-10-04 01:24:27,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:24:28,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1475853.3333333333, ans=0.125 2023-10-04 01:24:32,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:24:32,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 01:24:32,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:24:38,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:24:40,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 01:24:43,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1475920.0, ans=0.1 2023-10-04 01:24:45,550 INFO [train.py:1046] (3/4) Epoch 42, batch 3600, loss[loss=0.1692, simple_loss=0.255, pruned_loss=0.04167, over 23974.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2342, pruned_loss=0.03701, over 4724396.28 frames. ], batch size: 80, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:24:45,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1475986.6666666667, ans=0.125 2023-10-04 01:24:47,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 01:24:47,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:24:48,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:24:50,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:24:50,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:24:52,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:24:55,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:24:58,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:24:59,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:25:00,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:25:00,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:25:00,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 01:25:03,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:25:05,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:25:09,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:25:09,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1476053.3333333333, ans=0.125 2023-10-04 01:25:12,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:25:13,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:25:13,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:25:13,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 01:25:13,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:25:18,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:25:19,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:25:19,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:25:20,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:25:22,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:25:22,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 01:25:30,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:25:30,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:25:32,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 01:25:36,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:25:42,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:25:44,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:25:47,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:25:47,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:25:47,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 01:25:49,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 01:25:49,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 01:25:50,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:25:50,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:25:51,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1476253.3333333333, ans=0.1 2023-10-04 01:25:54,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 01:25:54,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:25:54,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:25:54,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:25:55,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 01:25:57,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 01:25:58,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1476320.0, ans=0.0 2023-10-04 01:26:00,057 INFO [train.py:1046] (3/4) Epoch 42, batch 3650, loss[loss=0.159, simple_loss=0.2396, pruned_loss=0.03919, over 24502.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.235, pruned_loss=0.03735, over 4730085.34 frames. ], batch size: 63, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:26:00,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:26:01,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 01:26:01,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1476320.0, ans=0.2 2023-10-04 01:26:04,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 01:26:05,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:26:08,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 01:26:10,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 01:26:13,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:26:13,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:26:13,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:26:16,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 01:26:17,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:26:17,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 01:26:17,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:26:19,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:26:19,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 01:26:19,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:26:20,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:26:20,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:26:22,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:26:25,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 01:26:27,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 01:26:28,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:26:30,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 01:26:30,969 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.91 vs. limit=15.0 2023-10-04 01:26:31,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:26:31,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:26:32,509 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.25 vs. limit=8.0 2023-10-04 01:26:35,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:26:38,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:26:38,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:26:39,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:26:40,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:26:44,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:26:46,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:26:48,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:26:48,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:26:49,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:26:50,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:26:52,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:26:55,667 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.914e+02 2.089e+02 2.349e+02 3.091e+02, threshold=4.178e+02, percent-clipped=0.0 2023-10-04 01:26:58,996 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 01:27:01,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:27:01,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:01,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:27:03,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:27:04,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:27:06,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:27:08,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 01:27:08,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:27:09,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1476586.6666666667, ans=0.125 2023-10-04 01:27:10,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:27:12,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:27:13,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:27:14,732 INFO [train.py:1046] (3/4) Epoch 42, batch 3700, loss[loss=0.1729, simple_loss=0.2495, pruned_loss=0.04816, over 23768.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2364, pruned_loss=0.03829, over 4719802.73 frames. ], batch size: 232, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:27:17,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:27:17,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 01:27:17,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:27:17,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:27:18,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:27:22,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:27:25,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:27:25,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:27:26,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:27:26,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:27:26,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:27:30,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:27:31,752 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 01:27:37,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:27:37,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:27:40,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:27:40,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 01:27:40,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:27:43,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:43,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 01:27:44,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:48,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:27:49,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:27:49,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:27:52,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 01:27:52,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1476786.6666666667, ans=0.125 2023-10-04 01:27:56,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:27:56,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 01:27:56,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1476786.6666666667, ans=10.0 2023-10-04 01:27:58,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:27:58,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 01:28:05,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:28:05,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:28:08,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:28:08,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 01:28:09,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:28:10,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:28:10,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:28:11,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:28:15,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:28:15,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 01:28:17,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 01:28:18,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:28:18,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:28:20,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:28:20,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:28:24,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:28:24,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:28:26,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:28:27,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 01:28:28,716 INFO [train.py:1046] (3/4) Epoch 42, batch 3750, loss[loss=0.1578, simple_loss=0.2288, pruned_loss=0.04341, over 23386.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2377, pruned_loss=0.03843, over 4723560.71 frames. ], batch size: 119, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:28:28,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 01:28:29,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1476986.6666666667, ans=0.0 2023-10-04 01:28:33,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 01:28:33,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 01:28:35,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:28:36,068 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.91 vs. limit=15.0 2023-10-04 01:28:36,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:28:37,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:28:39,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:28:40,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:28:43,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:28:46,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:28:49,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:28:51,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1477053.3333333333, ans=0.1 2023-10-04 01:28:52,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:28:52,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 01:28:54,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:28:55,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:28:55,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:28:59,104 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.07 vs. limit=12.0 2023-10-04 01:28:59,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 01:29:04,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 01:29:04,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:29:06,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:29:07,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:29:10,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:29:11,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1477120.0, ans=0.07 2023-10-04 01:29:12,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 01:29:16,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 01:29:17,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:29:20,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1477186.6666666667, ans=0.0 2023-10-04 01:29:22,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:29:23,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.35 vs. limit=15.0 2023-10-04 01:29:24,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:29:25,561 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.079e+02 2.348e+02 2.758e+02 4.520e+02, threshold=4.696e+02, percent-clipped=4.0 2023-10-04 01:29:26,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:29:30,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1477253.3333333333, ans=0.125 2023-10-04 01:29:31,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 01:29:33,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:29:34,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:29:36,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:29:37,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:29:43,151 INFO [train.py:1046] (3/4) Epoch 42, batch 3800, loss[loss=0.1535, simple_loss=0.2374, pruned_loss=0.03482, over 24470.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2377, pruned_loss=0.0383, over 4733105.86 frames. ], batch size: 63, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:29:46,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:29:50,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:29:51,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 01:29:51,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 01:29:52,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1477320.0, ans=0.05 2023-10-04 01:29:53,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:29:57,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:29:57,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 01:29:59,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 01:29:59,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:01,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:30:03,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:30:04,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:30:04,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:30:04,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1477386.6666666667, ans=0.1 2023-10-04 01:30:06,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 01:30:09,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 01:30:09,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:30:10,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:30:12,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1477453.3333333333, ans=0.125 2023-10-04 01:30:13,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:30:14,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:30:16,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:30:16,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:30:20,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:20,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:30:24,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:30:24,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 01:30:26,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:30:31,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1477520.0, ans=0.125 2023-10-04 01:30:32,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:30:36,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:30:39,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.27 vs. limit=15.0 2023-10-04 01:30:39,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 01:30:41,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 01:30:41,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1477586.6666666667, ans=0.0 2023-10-04 01:30:42,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:30:43,537 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.09 vs. limit=15.0 2023-10-04 01:30:44,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:30:45,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:47,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 01:30:51,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 01:30:51,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 01:30:51,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:30:51,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:30:54,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1477586.6666666667, ans=0.125 2023-10-04 01:30:57,788 INFO [train.py:1046] (3/4) Epoch 42, batch 3850, loss[loss=0.1608, simple_loss=0.2272, pruned_loss=0.04726, over 24001.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2364, pruned_loss=0.03824, over 4719780.45 frames. ], batch size: 196, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:30:57,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:30:59,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:31:03,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:31:03,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 01:31:04,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1477653.3333333333, ans=0.05 2023-10-04 01:31:05,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:31:05,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:31:09,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:31:10,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:31:14,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:31:14,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 01:31:15,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1477720.0, ans=0.5 2023-10-04 01:31:21,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:21,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1477720.0, ans=0.0 2023-10-04 01:31:21,938 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.85 vs. limit=15.0 2023-10-04 01:31:22,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:31:24,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:31:24,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:31:26,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:27,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:31:29,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:31:29,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:31:29,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:31:29,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1477786.6666666667, ans=0.5 2023-10-04 01:31:32,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:31:33,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:33,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:31:35,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 01:31:35,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 01:31:36,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:31:36,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:39,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:39,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:40,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 01:31:43,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 01:31:44,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:46,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 01:31:46,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 01:31:50,461 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.64 vs. limit=10.0 2023-10-04 01:31:52,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:52,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:31:53,982 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.866e+02 2.015e+02 2.383e+02 4.192e+02, threshold=4.030e+02, percent-clipped=0.0 2023-10-04 01:31:56,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:31:56,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 01:32:00,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 01:32:03,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:03,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:06,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:32:06,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:32:06,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:07,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:07,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:32:07,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 01:32:08,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:32:10,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 01:32:10,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:10,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:10,605 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:32:12,086 INFO [train.py:1046] (3/4) Epoch 42, batch 3900, loss[loss=0.1564, simple_loss=0.245, pruned_loss=0.03385, over 24361.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2351, pruned_loss=0.03772, over 4731369.49 frames. ], batch size: 77, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:32:12,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:32:13,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:13,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:32:14,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:32:14,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:32:14,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:32:14,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 01:32:16,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:19,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:32:20,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:32:20,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:32:21,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:32:22,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1477986.6666666667, ans=0.2 2023-10-04 01:32:23,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:32:24,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:26,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:32:29,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 01:32:29,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:32:31,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 01:32:33,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:32:34,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 01:32:34,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 01:32:38,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:32:40,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:32:40,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:32:41,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:32:43,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1478120.0, ans=0.0 2023-10-04 01:32:44,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:32:45,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1478120.0, ans=0.125 2023-10-04 01:32:46,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:32:49,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:32:49,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:32:49,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:32:56,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:32:56,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:33:02,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 01:33:03,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:33:13,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:33:16,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:33:16,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 01:33:16,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 01:33:16,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:33:18,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 01:33:20,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:33:20,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 01:33:27,076 INFO [train.py:1046] (3/4) Epoch 42, batch 3950, loss[loss=0.1582, simple_loss=0.2531, pruned_loss=0.03166, over 24561.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2346, pruned_loss=0.0378, over 4707496.11 frames. ], batch size: 71, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:33:27,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:33:29,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 01:33:29,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:33:32,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:33:34,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:33:37,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.01 vs. limit=15.0 2023-10-04 01:33:40,014 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 01:33:41,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:33:41,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 01:33:42,701 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 01:33:42,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:33:45,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:33:45,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:33:45,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:33:46,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.34 vs. limit=15.0 2023-10-04 01:33:48,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 01:33:51,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:33:51,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:33:52,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:33:52,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:33:54,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:34:01,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1478453.3333333333, ans=0.1 2023-10-04 01:34:04,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:34:04,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:34:10,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 01:34:14,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 01:34:14,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 01:34:16,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:34:17,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:34:21,630 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.959e+02 2.268e+02 2.678e+02 3.550e+02, threshold=4.535e+02, percent-clipped=0.0 2023-10-04 01:34:24,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:34:24,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:34:25,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:34:26,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:34:26,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 01:34:30,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:34:33,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:34:35,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 01:34:40,811 INFO [train.py:1046] (3/4) Epoch 42, batch 4000, loss[loss=0.1563, simple_loss=0.2392, pruned_loss=0.03667, over 24011.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2357, pruned_loss=0.03808, over 4701151.05 frames. ], batch size: 86, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:34:44,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1478653.3333333333, ans=0.0 2023-10-04 01:34:44,463 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.02 vs. limit=15.0 2023-10-04 01:34:45,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:34:50,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:34:57,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:34:57,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:34:57,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:34:57,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 01:34:59,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:34:59,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 01:34:59,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:34:59,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 01:35:01,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:35:04,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:35:04,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:35:04,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:35:06,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:35:06,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 01:35:06,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1478720.0, ans=0.07 2023-10-04 01:35:08,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:35:08,897 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 01:35:09,584 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.18 vs. limit=22.5 2023-10-04 01:35:10,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:35:10,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:35:13,664 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 01:35:13,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:35:13,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1478786.6666666667, ans=0.125 2023-10-04 01:35:14,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:35:15,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1478786.6666666667, ans=0.1 2023-10-04 01:35:20,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 01:35:20,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:35:23,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:35:25,225 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 01:35:26,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:35:26,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 01:35:27,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:35:27,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:35:29,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:35:32,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:35:32,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:35:34,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:35:35,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 01:35:35,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:35:38,445 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 01:35:44,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:35:44,771 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:35:45,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 01:35:47,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1478920.0, ans=0.125 2023-10-04 01:35:48,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:35:48,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:35:50,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:35:51,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:35:54,039 INFO [train.py:1046] (3/4) Epoch 42, batch 4050, loss[loss=0.1663, simple_loss=0.2525, pruned_loss=0.04005, over 24018.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2355, pruned_loss=0.03748, over 4721812.24 frames. ], batch size: 80, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:35:55,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:35:58,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 01:35:58,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 01:36:01,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:36:01,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:36:02,102 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.70 vs. limit=15.0 2023-10-04 01:36:02,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:36:04,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:36:05,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:36:08,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1479053.3333333333, ans=0.1 2023-10-04 01:36:08,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:36:09,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1479053.3333333333, ans=10.0 2023-10-04 01:36:10,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:36:10,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 01:36:11,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:36:13,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:36:16,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:36:19,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:36:21,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1479053.3333333333, ans=0.125 2023-10-04 01:36:22,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 01:36:23,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 01:36:23,806 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 01:36:26,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:36:29,611 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:36:31,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 01:36:32,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:36:35,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1479120.0, ans=0.1 2023-10-04 01:36:36,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:36:39,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:36:39,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:36:39,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:36:39,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1479186.6666666667, ans=0.0 2023-10-04 01:36:41,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:36:46,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 01:36:46,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:36:48,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:36:48,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1479186.6666666667, ans=0.125 2023-10-04 01:36:49,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 01:36:50,640 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.894e+02 2.127e+02 2.287e+02 3.562e+02, threshold=4.254e+02, percent-clipped=0.0 2023-10-04 01:36:53,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:37:00,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 01:37:02,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:37:02,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:37:03,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 01:37:03,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 01:37:03,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:37:05,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:37:07,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:07,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:37:08,900 INFO [train.py:1046] (3/4) Epoch 42, batch 4100, loss[loss=0.155, simple_loss=0.2428, pruned_loss=0.03364, over 24667.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2362, pruned_loss=0.03759, over 4725975.10 frames. ], batch size: 68, lr: 2.41e-03, grad_scale: 32.0 2023-10-04 01:37:16,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 01:37:16,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 01:37:17,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 01:37:19,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 01:37:19,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:37:19,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:20,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:20,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:37:21,725 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 01:37:24,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:37:27,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:37:27,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:37:27,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:37:31,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:37:32,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1479386.6666666667, ans=0.125 2023-10-04 01:37:33,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:37:33,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:37:33,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1479386.6666666667, ans=0.125 2023-10-04 01:37:34,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 01:37:34,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:34,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:37:34,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1479386.6666666667, ans=0.2 2023-10-04 01:37:35,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:37:35,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:37:37,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 01:37:39,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:37:40,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 01:37:41,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:37:45,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:37:45,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 01:37:46,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:37:48,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:37:48,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:37:49,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 01:37:51,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:37:51,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:37:51,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1479520.0, ans=0.1 2023-10-04 01:37:53,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 01:37:54,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:37:54,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:37:56,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:38:00,425 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.38 vs. limit=15.0 2023-10-04 01:38:01,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:38:04,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:38:06,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:38:13,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:38:13,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:38:18,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:38:19,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:38:22,316 INFO [train.py:1046] (3/4) Epoch 42, batch 4150, loss[loss=0.147, simple_loss=0.2162, pruned_loss=0.03889, over 23546.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2363, pruned_loss=0.03798, over 4714030.98 frames. ], batch size: 256, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:38:23,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:38:25,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:38:25,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:38:25,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:38:29,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 01:38:29,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:38:30,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 01:38:30,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 01:38:30,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 01:38:33,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:38:36,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:38:36,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:38:41,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:38:42,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:38:42,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:38:43,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1479720.0, ans=0.1 2023-10-04 01:38:45,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 01:38:45,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:38:47,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 01:38:50,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:38:53,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:38:54,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 01:38:56,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 01:38:56,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:38:57,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 01:38:57,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:38:57,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:39:01,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:01,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:39:04,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 01:39:06,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1479853.3333333333, ans=0.125 2023-10-04 01:39:06,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1479853.3333333333, ans=0.125 2023-10-04 01:39:08,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:39:10,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:39:10,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 01:39:11,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:39:12,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 01:39:15,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:39:16,447 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.02 vs. limit=15.0 2023-10-04 01:39:17,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:39:18,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:19,991 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.917e+02 2.149e+02 2.587e+02 4.183e+02, threshold=4.298e+02, percent-clipped=0.0 2023-10-04 01:39:20,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 01:39:20,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:39:20,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 01:39:22,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 01:39:24,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 01:39:25,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:25,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:39:25,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:39:26,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 01:39:27,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:39:27,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 01:39:27,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=1479920.0, ans=0.025 2023-10-04 01:39:28,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:39:29,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:39:29,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 01:39:31,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 01:39:35,916 INFO [train.py:1046] (3/4) Epoch 42, batch 4200, loss[loss=0.1262, simple_loss=0.204, pruned_loss=0.02422, over 21120.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2353, pruned_loss=0.03786, over 4712691.24 frames. ], batch size: 46, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:39:36,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1479986.6666666667, ans=0.125 2023-10-04 01:39:37,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:39:37,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 01:39:39,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:39:42,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:39:42,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:39:44,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:39:44,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:39:45,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 01:39:48,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 01:39:50,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:39:51,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:39:53,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:39:55,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 01:39:58,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:39:58,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:39:59,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 01:39:59,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:40:01,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:40:01,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:40:01,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:40:04,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:40:06,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 01:40:06,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:40:11,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 01:40:11,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:40:11,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1480120.0, ans=0.125 2023-10-04 01:40:14,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:40:15,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:40:16,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:40:16,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 01:40:16,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:40:18,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:40:19,292 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.31 vs. limit=10.0 2023-10-04 01:40:24,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:40:25,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:40:32,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:40:33,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 01:40:37,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:40:41,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:40:43,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:40:45,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 01:40:49,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 01:40:49,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1480320.0, ans=0.0 2023-10-04 01:40:50,721 INFO [train.py:1046] (3/4) Epoch 42, batch 4250, loss[loss=0.154, simple_loss=0.2364, pruned_loss=0.03574, over 24443.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2338, pruned_loss=0.0375, over 4700497.65 frames. ], batch size: 63, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:40:51,264 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.66 vs. limit=15.0 2023-10-04 01:40:52,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:40:52,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 01:40:54,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.45 vs. limit=15.0 2023-10-04 01:40:56,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:40:57,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1480320.0, ans=0.5 2023-10-04 01:40:59,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:40:59,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 01:40:59,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:41:03,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:08,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:41:13,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:13,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:14,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:41:14,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:41:17,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:17,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:19,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:22,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:41:23,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:41:24,070 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:41:25,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 01:41:29,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 01:41:29,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:29,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:41:29,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:41:30,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:41:30,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:31,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:41:33,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1480520.0, ans=0.0 2023-10-04 01:41:34,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 01:41:36,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:41:39,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:41:41,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:41:43,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 01:41:43,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:41:43,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 01:41:44,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:41:45,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:41:48,629 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 2.009e+02 2.213e+02 2.557e+02 3.155e+02, threshold=4.427e+02, percent-clipped=0.0 2023-10-04 01:41:48,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:48,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:41:51,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 01:41:52,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:41:52,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:41:53,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1480586.6666666667, ans=0.0 2023-10-04 01:41:57,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:41:58,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:42:00,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:42:01,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:42:03,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:42:04,514 INFO [train.py:1046] (3/4) Epoch 42, batch 4300, loss[loss=0.1649, simple_loss=0.2533, pruned_loss=0.03824, over 24034.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2338, pruned_loss=0.03757, over 4697363.33 frames. ], batch size: 80, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:42:04,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:42:05,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:42:05,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 01:42:07,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:42:10,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1480653.3333333333, ans=0.125 2023-10-04 01:42:12,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:42:12,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:42:17,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:42:24,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:42:24,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 01:42:24,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:42:26,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:42:26,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:42:26,290 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 01:42:29,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:42:31,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:42:34,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 01:42:34,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:42:34,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 01:42:37,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:42:39,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:42:42,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:42:42,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:42:43,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:42:44,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:42:46,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:42:46,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 01:42:46,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 01:42:49,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:42:52,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:42:52,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:42:52,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:42:54,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:42:54,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 01:42:54,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 01:42:54,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 01:42:54,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:42:54,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 01:42:55,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 01:42:55,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1480853.3333333333, ans=0.125 2023-10-04 01:42:59,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:43:00,960 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 01:43:02,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:43:03,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:03,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:43:03,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1480920.0, ans=0.1 2023-10-04 01:43:06,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 01:43:06,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:43:06,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:43:07,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:43:07,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:43:07,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:43:11,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:43:14,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:16,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:43:16,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:43:19,051 INFO [train.py:1046] (3/4) Epoch 42, batch 4350, loss[loss=0.1459, simple_loss=0.2269, pruned_loss=0.03246, over 23821.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2349, pruned_loss=0.03755, over 4716923.53 frames. ], batch size: 212, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:43:21,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 01:43:21,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 01:43:26,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:43:27,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:29,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:43:29,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:43:32,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:43:37,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:43:40,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.65 vs. limit=6.0 2023-10-04 01:43:41,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:43:41,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:43:45,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:43:47,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:43:48,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:43:52,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 01:43:52,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:43:53,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:43:59,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:02,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 01:44:04,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:44:05,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:44:09,749 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 01:44:11,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:11,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1481186.6666666667, ans=0.125 2023-10-04 01:44:12,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:44:14,179 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 01:44:15,571 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 01:44:15,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:44:15,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:44:15,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1481186.6666666667, ans=0.2 2023-10-04 01:44:16,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:44:17,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:18,232 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.906e+02 2.107e+02 2.374e+02 3.775e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-04 01:44:18,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:44:18,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:44:21,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 01:44:21,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:21,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:44:22,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:24,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 01:44:24,404 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 01:44:24,408 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 01:44:24,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 01:44:27,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:44:29,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:44:29,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:44:29,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:44:30,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 01:44:31,936 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 01:44:31,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:33,260 INFO [train.py:1046] (3/4) Epoch 42, batch 4400, loss[loss=0.1444, simple_loss=0.2286, pruned_loss=0.03013, over 24311.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2354, pruned_loss=0.03785, over 4714407.66 frames. ], batch size: 56, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:44:36,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:44:36,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:37,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1481320.0, ans=0.125 2023-10-04 01:44:37,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1481320.0, ans=0.0 2023-10-04 01:44:38,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:44:40,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 01:44:41,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 01:44:41,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 01:44:41,056 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 01:44:42,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 01:44:42,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:44:44,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 01:44:46,023 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:44:47,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:44:48,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:44:48,429 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 01:44:51,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:44:51,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 01:44:51,692 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 01:44:54,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 01:44:54,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 01:44:55,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 01:44:55,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:44:56,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:57,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:44:59,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:45:00,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 01:45:00,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 01:45:01,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:45:04,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:45:05,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:45:07,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:45:07,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:45:07,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 01:45:07,210 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 01:45:13,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:45:14,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1481453.3333333333, ans=0.0 2023-10-04 01:45:19,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:45:20,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 01:45:23,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:45:26,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:45:26,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1481520.0, ans=0.1 2023-10-04 01:45:28,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:45:29,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 01:45:29,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:45:29,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:45:29,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:45:31,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:45:34,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 01:45:36,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 01:45:38,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 01:45:38,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:45:38,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 01:45:38,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:45:41,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:45:44,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 01:45:47,218 INFO [train.py:1046] (3/4) Epoch 42, batch 4450, loss[loss=0.1697, simple_loss=0.2392, pruned_loss=0.05007, over 23772.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2372, pruned_loss=0.03876, over 4708650.00 frames. ], batch size: 179, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:45:47,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:45:50,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:45:50,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:45:55,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1481653.3333333333, ans=0.0 2023-10-04 01:45:55,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=1481653.3333333333, ans=15.0 2023-10-04 01:45:57,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:45:57,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:45:59,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:02,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:46:04,679 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.03 vs. limit=15.0 2023-10-04 01:46:05,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:46:06,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:46:06,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 01:46:06,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:46:08,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:08,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:46:08,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 01:46:10,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1481720.0, ans=0.125 2023-10-04 01:46:11,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:46:15,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:15,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:17,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:46:17,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:46:18,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:46:23,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 01:46:23,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 01:46:25,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 01:46:25,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:46:29,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:46:29,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 01:46:31,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:46:36,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:36,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 01:46:36,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:36,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:46:36,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:46:36,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:46:38,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:46:41,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 01:46:42,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 01:46:44,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:46:45,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:46:46,652 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 2.039e+02 2.233e+02 2.622e+02 3.908e+02, threshold=4.466e+02, percent-clipped=0.0 2023-10-04 01:46:46,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:46:49,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:46:49,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:46:49,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1481920.0, ans=0.125 2023-10-04 01:46:54,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:46:56,503 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.30 vs. limit=15.0 2023-10-04 01:46:56,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 01:46:58,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:47:02,288 INFO [train.py:1046] (3/4) Epoch 42, batch 4500, loss[loss=0.1563, simple_loss=0.2454, pruned_loss=0.03361, over 24648.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.237, pruned_loss=0.03878, over 4701343.18 frames. ], batch size: 68, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:47:02,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:47:03,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 01:47:03,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 01:47:05,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:47:09,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:47:11,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:47:12,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 01:47:12,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:47:12,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:47:12,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1481986.6666666667, ans=0.125 2023-10-04 01:47:13,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:47:25,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:47:26,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:47:29,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:47:31,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:47:31,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:47:31,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1482120.0, ans=0.0 2023-10-04 01:47:33,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1482120.0, ans=0.125 2023-10-04 01:47:38,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:47:42,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:47:46,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:47:49,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:47:49,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 01:47:50,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:47:50,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:47:53,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:47:53,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:47:55,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:47:56,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 01:47:56,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 01:47:56,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:00,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:48:00,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 01:48:01,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:03,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1482253.3333333333, ans=0.1 2023-10-04 01:48:04,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:48:04,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:48:07,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 01:48:08,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 01:48:08,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 01:48:13,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 01:48:15,479 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.66 vs. limit=22.5 2023-10-04 01:48:16,378 INFO [train.py:1046] (3/4) Epoch 42, batch 4550, loss[loss=0.1525, simple_loss=0.2398, pruned_loss=0.03258, over 24486.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.236, pruned_loss=0.03841, over 4707022.56 frames. ], batch size: 66, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:48:16,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 01:48:16,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:48:20,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:48:20,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:48:22,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:48:25,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1482320.0, ans=0.125 2023-10-04 01:48:27,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:48:31,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:48:31,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:48:31,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:48:31,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:34,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:48:34,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:48:38,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:48:40,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1482386.6666666667, ans=0.1 2023-10-04 01:48:41,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 01:48:41,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 01:48:42,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:48:44,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 01:48:45,609 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.58 vs. limit=22.5 2023-10-04 01:48:47,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 01:48:47,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:48:47,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1482453.3333333333, ans=0.2 2023-10-04 01:48:50,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 01:48:52,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:48:55,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:55,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:48:56,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:48:57,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 01:48:58,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1482453.3333333333, ans=0.0 2023-10-04 01:49:01,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:49:04,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:49:05,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:49:06,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:49:06,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 01:49:08,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 01:49:08,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:49:09,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 01:49:10,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1482520.0, ans=0.07 2023-10-04 01:49:10,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.48 vs. limit=15.0 2023-10-04 01:49:12,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 01:49:12,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:49:14,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:14,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:49:15,348 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 1.972e+02 2.159e+02 2.425e+02 3.623e+02, threshold=4.318e+02, percent-clipped=0.0 2023-10-04 01:49:15,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:49:15,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:49:17,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:49:18,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 01:49:20,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:49:20,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 01:49:20,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 01:49:20,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:49:20,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 01:49:22,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1482586.6666666667, ans=0.125 2023-10-04 01:49:23,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:49:23,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:49:25,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:49:25,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:49:25,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 01:49:27,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:49:30,393 INFO [train.py:1046] (3/4) Epoch 42, batch 4600, loss[loss=0.1513, simple_loss=0.2218, pruned_loss=0.04039, over 23779.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2347, pruned_loss=0.03817, over 4713348.71 frames. ], batch size: 212, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:49:30,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 01:49:33,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:34,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:49:36,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:49:36,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:49:37,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:49:38,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 01:49:40,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:49:41,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1482653.3333333333, ans=0.1 2023-10-04 01:49:42,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:49:43,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1482653.3333333333, ans=0.2 2023-10-04 01:49:44,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:49:47,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:53,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 01:49:54,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:49:54,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1482720.0, ans=0.1 2023-10-04 01:49:56,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:00,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:50:00,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:50:04,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.73 vs. limit=15.0 2023-10-04 01:50:05,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 01:50:05,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:50:05,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:50:12,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:12,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:50:14,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:50:18,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 01:50:20,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 01:50:24,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:25,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:50:27,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1482853.3333333333, ans=0.0 2023-10-04 01:50:29,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:29,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 01:50:30,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:30,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 01:50:30,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:30,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:50:33,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:50:34,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:50:35,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:50:35,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 01:50:36,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 01:50:37,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 01:50:37,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:50:37,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:50:39,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:50:41,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:50:42,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1482920.0, ans=0.07 2023-10-04 01:50:45,196 INFO [train.py:1046] (3/4) Epoch 42, batch 4650, loss[loss=0.132, simple_loss=0.2088, pruned_loss=0.02755, over 24462.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2341, pruned_loss=0.03774, over 4720347.27 frames. ], batch size: 58, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:50:51,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:50:52,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:50:52,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:52,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:50:52,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:50:52,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:50:54,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:50:56,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 01:51:01,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:51:02,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 01:51:02,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:51:03,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 01:51:03,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:51:03,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 01:51:03,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 01:51:03,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:04,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:51:09,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:51:11,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:11,299 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 01:51:13,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.45 vs. limit=15.0 2023-10-04 01:51:14,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:15,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 01:51:16,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1483120.0, ans=0.125 2023-10-04 01:51:17,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:17,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:51:18,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 01:51:19,422 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.54 vs. limit=15.0 2023-10-04 01:51:20,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:51:22,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:51:25,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1483120.0, ans=10.0 2023-10-04 01:51:26,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:51:30,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:32,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:32,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:51:33,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:51:37,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 01:51:37,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 01:51:37,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 01:51:37,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 01:51:40,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:51:45,017 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.878e+02 2.086e+02 2.466e+02 3.529e+02, threshold=4.172e+02, percent-clipped=0.0 2023-10-04 01:51:45,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1483253.3333333333, ans=0.125 2023-10-04 01:51:47,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:51:47,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:51:49,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 01:51:49,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:51:51,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:51:51,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:51:52,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:51:55,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:51:55,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:51:56,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:51:57,871 INFO [train.py:1046] (3/4) Epoch 42, batch 4700, loss[loss=0.151, simple_loss=0.2371, pruned_loss=0.03247, over 24631.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2349, pruned_loss=0.03748, over 4727932.41 frames. ], batch size: 68, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:51:59,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:52:00,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1483320.0, ans=0.1 2023-10-04 01:52:01,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:52:01,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:52:01,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 01:52:01,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1483320.0, ans=0.0 2023-10-04 01:52:02,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 01:52:04,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 01:52:09,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1483320.0, ans=0.0 2023-10-04 01:52:11,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:13,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:52:14,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:52:14,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:52:14,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 01:52:18,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 01:52:20,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 01:52:22,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:22,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1483386.6666666667, ans=0.125 2023-10-04 01:52:23,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:52:23,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:52:25,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:30,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:52:32,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 01:52:32,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:52:39,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 01:52:39,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:52:42,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:52:46,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 01:52:48,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:52:52,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:52:54,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 01:52:54,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1483520.0, ans=0.2 2023-10-04 01:52:56,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:52:56,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:52:58,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:52:58,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:52:58,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 01:53:00,273 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 01:53:01,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:53:04,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:53:04,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:53:04,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 01:53:05,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:53:08,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 01:53:11,540 INFO [train.py:1046] (3/4) Epoch 42, batch 4750, loss[loss=0.1602, simple_loss=0.2315, pruned_loss=0.04449, over 23722.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2357, pruned_loss=0.03735, over 4741695.66 frames. ], batch size: 232, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:53:11,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:53:12,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:53:17,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:53:17,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:53:20,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 01:53:20,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:53:23,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 01:53:26,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:53:26,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:53:28,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:53:33,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 01:53:37,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 01:53:39,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 01:53:40,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:53:43,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:53:43,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:53:43,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:53:44,588 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 01:53:44,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 01:53:50,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 01:53:52,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:53:54,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:53:56,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:53:56,813 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 01:53:56,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:01,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:54:04,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 01:54:05,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 01:54:06,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 01:54:06,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:54:06,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:54:07,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:54:08,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 01:54:08,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 01:54:09,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 01:54:11,196 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.906e+02 2.067e+02 2.489e+02 4.239e+02, threshold=4.133e+02, percent-clipped=1.0 2023-10-04 01:54:11,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:54:12,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:54:12,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 01:54:14,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:54:14,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1483920.0, ans=0.2 2023-10-04 01:54:15,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:15,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 01:54:15,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:54:17,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 01:54:21,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:54:21,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 01:54:23,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 01:54:23,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 01:54:24,967 INFO [train.py:1046] (3/4) Epoch 42, batch 4800, loss[loss=0.1679, simple_loss=0.2424, pruned_loss=0.04664, over 23794.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2363, pruned_loss=0.03796, over 4739993.47 frames. ], batch size: 195, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:54:28,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:54:29,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:54:29,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1483986.6666666667, ans=0.125 2023-10-04 01:54:31,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 01:54:35,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:54:35,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:54:40,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 01:54:42,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:54:42,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:54:42,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 01:54:43,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:54:43,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:54:45,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 01:54:48,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:54:50,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:50,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 01:54:50,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1484053.3333333333, ans=0.2 2023-10-04 01:54:51,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:51,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 01:54:51,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:54:53,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:54:56,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:54:59,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:55:00,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:55:00,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:55:00,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 01:55:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:03,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 01:55:03,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 01:55:06,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:06,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:55:06,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:55:06,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:55:06,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:55:10,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:55:10,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:55:11,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1484186.6666666667, ans=0.125 2023-10-04 01:55:13,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:55:16,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:17,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:17,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff3.min_abs, batch_count=1484186.6666666667, ans=0.2 2023-10-04 01:55:20,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 01:55:20,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:55:21,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:21,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:55:21,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:26,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:55:27,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:55:27,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:28,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:55:28,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 01:55:30,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 01:55:33,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:33,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:34,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:55:34,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 01:55:37,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 01:55:37,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:55:37,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:55:38,762 INFO [train.py:1046] (3/4) Epoch 42, batch 4850, loss[loss=0.1479, simple_loss=0.2376, pruned_loss=0.02905, over 24475.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2371, pruned_loss=0.03803, over 4742962.25 frames. ], batch size: 66, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 01:55:38,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:55:38,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:40,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:55:44,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1484320.0, ans=0.125 2023-10-04 01:55:48,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 01:55:50,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:53,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:55:53,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 01:55:53,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:55:57,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:55:59,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:56:01,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:56:01,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 01:56:04,894 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.21 vs. limit=22.5 2023-10-04 01:56:05,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1484386.6666666667, ans=0.0 2023-10-04 01:56:06,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:56:06,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1484453.3333333333, ans=0.0 2023-10-04 01:56:09,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:56:09,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 01:56:09,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 01:56:09,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 01:56:11,817 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.01 vs. limit=10.0 2023-10-04 01:56:12,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 01:56:13,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:16,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:16,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 01:56:16,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 01:56:18,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:56:21,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1484520.0, ans=0.0 2023-10-04 01:56:22,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1484520.0, ans=0.125 2023-10-04 01:56:26,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:56:27,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 01:56:27,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:56:28,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 01:56:30,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:56:31,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 01:56:31,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:33,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 01:56:33,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:56:33,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:56:35,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 01:56:40,322 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.976e+02 2.279e+02 2.618e+02 4.353e+02, threshold=4.559e+02, percent-clipped=2.0 2023-10-04 01:56:42,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1484586.6666666667, ans=0.0 2023-10-04 01:56:42,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1484586.6666666667, ans=0.125 2023-10-04 01:56:44,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:56:48,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 01:56:48,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:56:51,963 INFO [train.py:1046] (3/4) Epoch 42, batch 4900, loss[loss=0.1505, simple_loss=0.2294, pruned_loss=0.03584, over 23904.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.236, pruned_loss=0.03843, over 4727245.59 frames. ], batch size: 195, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:56:54,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 01:56:54,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:56:59,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:56:59,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:57:00,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1484653.3333333333, ans=0.2 2023-10-04 01:57:01,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:57:04,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 01:57:08,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 01:57:12,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 01:57:14,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 01:57:14,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:57:15,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:57:15,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:57:15,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:57:15,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 01:57:16,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 01:57:18,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 01:57:19,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:57:19,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1484786.6666666667, ans=0.125 2023-10-04 01:57:21,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 01:57:21,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 01:57:24,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:57:25,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:57:27,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:57:27,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 01:57:27,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1484786.6666666667, ans=0.125 2023-10-04 01:57:28,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 01:57:28,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:57:28,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 01:57:28,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 01:57:28,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1484786.6666666667, ans=0.1 2023-10-04 01:57:33,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 01:57:34,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 01:57:36,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:57:36,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:57:37,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:57:37,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 01:57:37,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:57:37,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 01:57:40,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:57:41,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 01:57:43,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:57:45,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 01:57:47,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 01:57:49,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 01:57:49,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 01:57:54,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:57:56,332 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.33 vs. limit=15.0 2023-10-04 01:57:56,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:57:58,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 01:57:58,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1484920.0, ans=0.1 2023-10-04 01:57:58,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1484920.0, ans=0.0 2023-10-04 01:57:59,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:57:59,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:58:01,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:58:05,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:58:05,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1484986.6666666667, ans=0.0 2023-10-04 01:58:07,051 INFO [train.py:1046] (3/4) Epoch 42, batch 4950, loss[loss=0.1459, simple_loss=0.2212, pruned_loss=0.03535, over 20247.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2348, pruned_loss=0.03796, over 4717021.59 frames. ], batch size: 44, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:58:07,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 01:58:07,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:58:07,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 01:58:08,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 01:58:12,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:58:12,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 01:58:14,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 01:58:14,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 01:58:14,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 01:58:15,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 01:58:16,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:16,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 01:58:16,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 01:58:16,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:19,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:58:20,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 01:58:22,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 01:58:22,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:58:24,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:24,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 01:58:28,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 01:58:33,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:34,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 01:58:35,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:36,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:36,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 01:58:39,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 01:58:39,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 01:58:42,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:43,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 01:58:43,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 01:58:43,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:58:45,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:58:46,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 01:58:46,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:58:49,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 01:58:51,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 01:58:54,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:58:54,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:58:55,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 01:58:55,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 01:58:57,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 01:59:00,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:59:03,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 01:59:03,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 01:59:03,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:59:03,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 01:59:05,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 01:59:05,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1485253.3333333333, ans=0.125 2023-10-04 01:59:06,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 01:59:07,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 01:59:07,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 01:59:09,058 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.943e+02 2.204e+02 2.524e+02 4.078e+02, threshold=4.408e+02, percent-clipped=0.0 2023-10-04 01:59:09,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 01:59:12,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:59:16,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 01:59:16,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 01:59:20,895 INFO [train.py:1046] (3/4) Epoch 42, batch 5000, loss[loss=0.1463, simple_loss=0.2216, pruned_loss=0.03552, over 22789.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2349, pruned_loss=0.03766, over 4723431.83 frames. ], batch size: 322, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 01:59:21,279 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 01:59:24,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 01:59:24,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:59:25,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 01:59:27,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 01:59:28,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 01:59:30,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 01:59:31,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 01:59:31,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 01:59:31,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 01:59:33,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:59:33,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 01:59:35,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1485386.6666666667, ans=0.2 2023-10-04 01:59:36,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 01:59:36,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:59:36,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 01:59:37,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 01:59:37,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 01:59:39,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 01:59:39,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 01:59:40,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 01:59:40,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:59:40,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 01:59:40,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 01:59:40,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 01:59:43,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 01:59:43,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 01:59:44,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:59:44,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 01:59:45,326 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.67 vs. limit=15.0 2023-10-04 01:59:46,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 01:59:49,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 01:59:49,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 01:59:49,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 01:59:52,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 01:59:52,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 01:59:53,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 01:59:54,307 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.49 vs. limit=15.0 2023-10-04 01:59:57,900 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 02:00:02,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:00:02,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:00:02,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:08,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 02:00:08,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:00:08,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:00:08,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:00:10,610 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.78 vs. limit=6.0 2023-10-04 02:00:11,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 02:00:11,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:00:14,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:00:15,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:00:19,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 02:00:23,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:27,103 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.80 vs. limit=12.0 2023-10-04 02:00:31,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:00:34,920 INFO [train.py:1046] (3/4) Epoch 42, batch 5050, loss[loss=0.144, simple_loss=0.225, pruned_loss=0.03154, over 24443.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2354, pruned_loss=0.03761, over 4722166.61 frames. ], batch size: 58, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:00:34,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:34,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:00:34,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:00:35,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:00:35,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:00:36,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:41,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:00:41,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 02:00:42,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:00:43,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:00:45,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:00:46,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 02:00:46,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:00:47,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:00:50,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:00:50,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1485720.0, ans=0.125 2023-10-04 02:00:52,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:00:53,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:01:00,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 02:01:01,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:01:03,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:01:04,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 02:01:04,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:01:06,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:01:06,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:01:06,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:01:06,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 02:01:07,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 02:01:09,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:01:10,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:01:13,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:01:13,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 02:01:16,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:01:18,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 02:01:18,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:01:19,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:01:19,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:01:19,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:01:22,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:01:24,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:01:25,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:26,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:01:26,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:01:26,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 02:01:28,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:01:28,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1485853.3333333333, ans=0.125 2023-10-04 02:01:30,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:01:34,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:01:36,277 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 02:01:36,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:01:37,596 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 1.948e+02 2.169e+02 2.465e+02 3.458e+02, threshold=4.337e+02, percent-clipped=0.0 2023-10-04 02:01:37,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:01:37,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:37,770 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 02:01:40,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:01:40,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 02:01:40,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:43,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:01:44,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=10.66 vs. limit=22.5 2023-10-04 02:01:45,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:01:45,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 02:01:46,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 02:01:48,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:01:48,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:01:49,449 INFO [train.py:1046] (3/4) Epoch 42, batch 5100, loss[loss=0.2079, simple_loss=0.2835, pruned_loss=0.06615, over 19214.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2363, pruned_loss=0.03802, over 4702168.62 frames. ], batch size: 388, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:01:49,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:01:52,246 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 02:01:53,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:01:56,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 02:01:56,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 02:01:58,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:01:59,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:02:03,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:02:03,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 02:02:04,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 02:02:08,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:02:09,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:02:11,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:02:16,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 02:02:16,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:02:17,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:02:18,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 02:02:19,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:20,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:20,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 02:02:22,273 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 02:02:22,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:22,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.10 vs. limit=10.0 2023-10-04 02:02:24,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 02:02:24,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 02:02:26,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1486120.0, ans=0.0 2023-10-04 02:02:28,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:02:34,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:02:36,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 02:02:37,503 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 02:02:37,510 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 02:02:38,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 02:02:38,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:02:40,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 02:02:43,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 02:02:46,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:02:47,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:02:49,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 02:02:50,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:02:52,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 02:02:57,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:02:57,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:02:57,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:02:59,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:02:59,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:02:59,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:03:01,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 02:03:01,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 02:03:03,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 02:03:03,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:03:03,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 02:03:03,786 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.97 vs. limit=15.0 2023-10-04 02:03:04,532 INFO [train.py:1046] (3/4) Epoch 42, batch 5150, loss[loss=0.1451, simple_loss=0.2295, pruned_loss=0.03034, over 24481.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2368, pruned_loss=0.03817, over 4705111.85 frames. ], batch size: 66, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:03:05,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:03:05,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 02:03:08,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:03:10,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:03:14,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:03:14,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 02:03:15,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1486320.0, ans=0.125 2023-10-04 02:03:16,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.51 vs. limit=22.5 2023-10-04 02:03:17,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:03:17,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:03:20,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:03:20,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:03:20,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:03:21,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:03:21,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:03:21,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 02:03:23,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:03:23,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:03:23,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 02:03:25,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 02:03:25,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1486386.6666666667, ans=0.125 2023-10-04 02:03:26,056 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.78 vs. limit=22.5 2023-10-04 02:03:27,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:03:29,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1486386.6666666667, ans=0.125 2023-10-04 02:03:33,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:03:34,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 02:03:39,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:03:44,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:03:44,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:03:44,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1486453.3333333333, ans=0.1 2023-10-04 02:03:45,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.17 vs. limit=22.5 2023-10-04 02:03:48,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:03:48,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:03:48,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1486520.0, ans=0.125 2023-10-04 02:03:49,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1486520.0, ans=0.125 2023-10-04 02:03:50,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 02:03:55,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:03:56,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:03:56,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:03:59,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:04:01,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:04:02,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 02:04:05,033 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.977e+02 2.120e+02 2.430e+02 3.829e+02, threshold=4.240e+02, percent-clipped=0.0 2023-10-04 02:04:08,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:04:09,352 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.79 vs. limit=22.5 2023-10-04 02:04:10,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 02:04:11,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:04:11,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:04:13,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:04:13,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:04:13,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:04:13,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:04:16,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:04:16,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:04:17,989 INFO [train.py:1046] (3/4) Epoch 42, batch 5200, loss[loss=0.1505, simple_loss=0.2379, pruned_loss=0.0316, over 24543.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2369, pruned_loss=0.03793, over 4715627.06 frames. ], batch size: 71, lr: 2.41e-03, grad_scale: 16.0 2023-10-04 02:04:19,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:04:19,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1486653.3333333333, ans=0.0 2023-10-04 02:04:23,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 02:04:23,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:04:25,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:04:28,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:04:28,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:04:29,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:04:29,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 02:04:31,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1486720.0, ans=0.125 2023-10-04 02:04:32,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:04:33,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:04:35,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 02:04:37,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:04:37,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1486720.0, ans=0.1 2023-10-04 02:04:38,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:04:39,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 02:04:40,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 02:04:42,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 02:04:43,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:04:43,788 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 02:04:43,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:04:45,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:04:45,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:04:46,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 02:04:46,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:04:48,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1486786.6666666667, ans=0.1 2023-10-04 02:04:49,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:04:49,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1486786.6666666667, ans=0.125 2023-10-04 02:04:52,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 02:04:52,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 02:04:52,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 02:04:58,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 02:04:59,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:05:06,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:05:06,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:05:09,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 02:05:09,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:05:09,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:05:09,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:05:11,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:05:14,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:05:15,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:05:18,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:05:19,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:05:19,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:05:23,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:05:24,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 02:05:24,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:05:24,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:05:27,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:05:29,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:05:30,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:05:31,692 INFO [train.py:1046] (3/4) Epoch 42, batch 5250, loss[loss=0.1497, simple_loss=0.2333, pruned_loss=0.03305, over 21139.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2362, pruned_loss=0.03802, over 4716438.00 frames. ], batch size: 46, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:05:33,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:05:34,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1486986.6666666667, ans=0.0 2023-10-04 02:05:37,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:05:37,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:05:38,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:05:43,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:05:45,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:05:46,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:05:47,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:05:50,658 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=9.31 vs. limit=12.0 2023-10-04 02:05:51,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 02:05:51,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:05:51,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:06:24,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1487186.6666666667, ans=0.125 2023-10-04 02:06:24,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1487186.6666666667, ans=0.1 2023-10-04 02:06:25,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1487253.3333333333, ans=0.0 2023-10-04 02:06:30,435 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 1.997e+02 2.174e+02 2.692e+02 4.160e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-04 02:06:39,932 INFO [train.py:1046] (3/4) Epoch 42, batch 5300, loss[loss=0.1539, simple_loss=0.2338, pruned_loss=0.03703, over 23777.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2352, pruned_loss=0.0376, over 4721279.69 frames. ], batch size: 135, lr: 2.41e-03, grad_scale: 8.0 2023-10-04 02:06:45,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.11 vs. limit=10.0 2023-10-04 02:06:54,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:06:54,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 02:06:54,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 02:06:54,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:06:54,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:54,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:54,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:54,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:06:54,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:06:54,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:06:54,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:06:55,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:06:55,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 02:06:55,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 02:06:55,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 02:06:55,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:06:55,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 02:06:55,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 02:06:55,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:55,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:06:55,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:06:56,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:06:56,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:06:56,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:06:56,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:06:56,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:56,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:06:56,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:06:57,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:06:57,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:57,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:06:57,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 02:06:57,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:06:58,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:06:58,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 02:06:58,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 02:06:58,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:06:58,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:06:58,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 02:06:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 02:06:58,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:06:58,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:06:59,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:06:59,486 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 02:06:59,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 02:06:59,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:06:59,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:06:59,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 02:06:59,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 02:06:59,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 02:07:00,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:07:06,278 INFO [train.py:1046] (3/4) Epoch 43, batch 0, loss[loss=0.1543, simple_loss=0.2333, pruned_loss=0.03768, over 24648.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2333, pruned_loss=0.03768, over 24648.00 frames. ], batch size: 65, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:07:06,278 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 02:07:13,432 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.4.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.2071, 4.3638, 5.0648, 4.6794], device='cuda:3') 2023-10-04 02:07:17,997 INFO [train.py:1078] (3/4) Epoch 43, validation: loss=0.318, simple_loss=0.2688, pruned_loss=0.1836, over 1125622.00 frames. 2023-10-04 02:07:17,997 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-04 02:07:18,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 02:07:18,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:07:18,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1487400.0, ans=0.0 2023-10-04 02:07:19,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:07:25,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:07:26,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:07:26,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:26,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 02:07:27,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 02:07:30,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:31,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:34,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:07:34,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:07:36,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:07:36,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:07:37,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 02:07:38,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:07:46,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:07:48,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:07:50,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 02:07:54,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:07:54,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:07:55,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:01,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:08:05,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:09,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 02:08:12,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1487600.0, ans=0.015 2023-10-04 02:08:13,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 02:08:14,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:08:14,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:08:14,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:08:16,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:08:18,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 02:08:21,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:08:23,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:08:25,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:08:27,461 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.21 vs. limit=15.0 2023-10-04 02:08:28,331 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 02:08:29,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:08:31,054 INFO [train.py:1046] (3/4) Epoch 43, batch 50, loss[loss=0.2015, simple_loss=0.2763, pruned_loss=0.06339, over 19380.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2358, pruned_loss=0.03762, over 1060146.95 frames. ], batch size: 388, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:08:32,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:08:35,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:08:35,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 02:08:36,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:08:37,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:08:39,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:08:39,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:08:42,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:08:43,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 02:08:43,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:51,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:08:52,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 02:08:54,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 02:08:55,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:08:57,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:08:57,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:08:59,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:09:00,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:09:00,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 02:09:00,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:09:01,260 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.61 vs. limit=6.0 2023-10-04 02:09:06,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:09:07,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:09:07,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:09:08,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 02:09:09,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1487866.6666666667, ans=0.125 2023-10-04 02:09:11,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:09:13,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:09:13,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 02:09:13,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:09:14,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 02:09:15,780 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.079e+02 2.255e+02 2.464e+02 4.467e+02, threshold=4.509e+02, percent-clipped=1.0 2023-10-04 02:09:17,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1487933.3333333333, ans=0.1 2023-10-04 02:09:22,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:09:23,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:09:24,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:09:26,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:09:26,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:09:29,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 02:09:29,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 02:09:32,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:09:32,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:09:33,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:09:33,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:09:33,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 02:09:35,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 02:09:35,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 02:09:36,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:09:36,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:09:38,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 02:09:38,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 02:09:40,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:09:40,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:09:41,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1488000.0, ans=0.125 2023-10-04 02:09:42,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:09:42,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:09:44,788 INFO [train.py:1046] (3/4) Epoch 43, batch 100, loss[loss=0.1502, simple_loss=0.2324, pruned_loss=0.034, over 24337.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2377, pruned_loss=0.03746, over 1878244.16 frames. ], batch size: 61, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:09:44,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:09:48,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:09:50,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:09:52,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 02:09:52,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:09:55,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:09:55,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:09:55,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:09:55,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:09:57,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:09:57,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 02:10:00,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:10:00,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:10:01,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:01,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:10:04,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 02:10:05,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:10:07,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:07,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:10:10,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:10:12,907 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 02:10:12,921 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 02:10:14,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:10:14,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:10:15,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1488200.0, ans=0.125 2023-10-04 02:10:18,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:10:19,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:10:21,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:28,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:28,137 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 02:10:30,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 02:10:32,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1488266.6666666667, ans=0.125 2023-10-04 02:10:35,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:10:36,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:10:39,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:40,284 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.50 vs. limit=12.0 2023-10-04 02:10:42,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:10:46,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:10:48,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:10:49,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:50,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:52,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:10:52,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:10:52,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:10:52,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 02:10:52,435 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 02:10:52,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:10:53,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:10:55,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:10:55,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:10:55,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 02:10:57,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:10:57,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:10:57,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:10:58,915 INFO [train.py:1046] (3/4) Epoch 43, batch 150, loss[loss=0.2105, simple_loss=0.2763, pruned_loss=0.07233, over 19177.00 frames. ], tot_loss[loss=0.1576, simple_loss=0.2379, pruned_loss=0.03865, over 2495661.66 frames. ], batch size: 388, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:10:59,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:10:59,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1488400.0, ans=0.0 2023-10-04 02:11:00,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:00,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:11:00,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:11:03,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:06,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:11:06,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:11:07,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:12,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:11:13,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:14,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:11:16,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:17,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=1488466.6666666667, ans=0.95 2023-10-04 02:11:19,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 02:11:19,800 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.38 vs. limit=15.0 2023-10-04 02:11:20,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 02:11:20,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 02:11:21,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:11:21,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:11:23,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:11:23,742 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.69 vs. limit=6.0 2023-10-04 02:11:24,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:11:25,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:11:25,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:27,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:11:28,681 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 02:11:32,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:11:35,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:11:39,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:11:40,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 02:11:43,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:11:43,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:11:43,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:11:43,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1488600.0, ans=0.1 2023-10-04 02:11:44,687 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 1.926e+02 2.079e+02 2.360e+02 3.858e+02, threshold=4.157e+02, percent-clipped=0.0 2023-10-04 02:11:44,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:11:45,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1488600.0, ans=10.0 2023-10-04 02:11:46,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:11:46,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:11:47,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:49,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 02:11:53,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:54,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:11:54,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:11:54,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:11:57,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:11:58,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 02:12:02,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:12:03,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:12:06,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:12:09,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:12:09,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 02:12:09,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:12:10,554 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 02:12:12,463 INFO [train.py:1046] (3/4) Epoch 43, batch 200, loss[loss=0.1638, simple_loss=0.2393, pruned_loss=0.04419, over 23804.00 frames. ], tot_loss[loss=0.1583, simple_loss=0.2391, pruned_loss=0.03879, over 3002577.70 frames. ], batch size: 179, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:12:12,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1488733.3333333333, ans=0.2 2023-10-04 02:12:14,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:12:15,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1488733.3333333333, ans=0.125 2023-10-04 02:12:15,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1488733.3333333333, ans=0.0 2023-10-04 02:12:18,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:12:18,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:12:20,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 02:12:22,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:12:22,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:23,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 02:12:23,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1488733.3333333333, ans=0.2 2023-10-04 02:12:23,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1488733.3333333333, ans=0.2 2023-10-04 02:12:26,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:12:27,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:27,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:12:27,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1488800.0, ans=0.0 2023-10-04 02:12:30,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:12:31,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:12:31,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:32,906 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.91 vs. limit=10.0 2023-10-04 02:12:47,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:12:48,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:12:48,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:12:49,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:12:51,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 02:12:51,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:12:53,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:12:55,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:12:56,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:12:56,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:12:57,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1488933.3333333333, ans=0.5 2023-10-04 02:12:58,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 02:12:58,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:12:58,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:12:58,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1488933.3333333333, ans=0.0 2023-10-04 02:13:01,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:13:07,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:13:12,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1489000.0, ans=0.05 2023-10-04 02:13:15,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:15,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:13:22,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:24,817 INFO [train.py:1046] (3/4) Epoch 43, batch 250, loss[loss=0.163, simple_loss=0.2562, pruned_loss=0.03485, over 24465.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2383, pruned_loss=0.0382, over 3376666.23 frames. ], batch size: 69, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:13:24,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 02:13:24,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:13:24,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:13:24,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:13:25,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:13:26,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 02:13:27,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:13:27,787 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 02:13:27,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1489066.6666666667, ans=0.125 2023-10-04 02:13:30,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:30,999 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.22 vs. limit=15.0 2023-10-04 02:13:31,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:13:33,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:35,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:13:38,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:13:38,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:13:39,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:13:42,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:13:43,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1489133.3333333333, ans=0.0 2023-10-04 02:13:51,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:13:51,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1489133.3333333333, ans=0.09899494936611666 2023-10-04 02:13:54,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:13:54,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:14:01,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:14:01,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:14:02,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:14:04,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:14:04,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:14:04,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:14:06,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:14:07,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:14:10,525 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.128e+02 2.316e+02 2.574e+02 3.711e+02, threshold=4.632e+02, percent-clipped=0.0 2023-10-04 02:14:11,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 02:14:11,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:14:13,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:14:13,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:14:15,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:14:15,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:14:16,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:14:16,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:14:18,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:14:19,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:14:20,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:14:22,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1489333.3333333333, ans=0.0 2023-10-04 02:14:23,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:14:25,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1489333.3333333333, ans=0.1 2023-10-04 02:14:26,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:14:30,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:14:34,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:14:35,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:14:35,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1489333.3333333333, ans=0.125 2023-10-04 02:14:38,804 INFO [train.py:1046] (3/4) Epoch 43, batch 300, loss[loss=0.1525, simple_loss=0.2452, pruned_loss=0.02995, over 24680.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2361, pruned_loss=0.03797, over 3664095.42 frames. ], batch size: 68, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:14:38,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 02:14:39,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1489400.0, ans=0.125 2023-10-04 02:14:40,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:14:40,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:14:41,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 02:14:41,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:14:43,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:14:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 02:14:47,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:14:47,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1489400.0, ans=0.2 2023-10-04 02:14:49,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:14:53,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:14:53,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 02:14:54,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:14:56,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:14:56,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 02:14:57,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:15:00,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:15:04,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:15:04,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 02:15:05,288 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.99 vs. limit=22.5 2023-10-04 02:15:08,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 02:15:08,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:10,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:15:11,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:11,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 02:15:11,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:15:13,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1489533.3333333333, ans=0.0 2023-10-04 02:15:14,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:15:15,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1489533.3333333333, ans=0.125 2023-10-04 02:15:16,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:15:16,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:15:21,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 02:15:21,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 02:15:22,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:15:24,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:27,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 02:15:29,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:15:31,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:15:33,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1489600.0, ans=0.0 2023-10-04 02:15:34,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:15:34,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 02:15:37,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:37,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:15:40,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:42,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:15:43,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 02:15:43,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:15:43,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:15:44,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 02:15:46,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:15:46,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:15:49,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:15:49,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:15:50,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:15:52,148 INFO [train.py:1046] (3/4) Epoch 43, batch 350, loss[loss=0.1538, simple_loss=0.2429, pruned_loss=0.03238, over 24559.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2337, pruned_loss=0.03764, over 3880324.63 frames. ], batch size: 71, lr: 2.38e-03, grad_scale: 16.0 2023-10-04 02:15:53,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:15:53,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 02:15:56,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:02,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:16:02,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:04,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:05,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 02:16:06,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:16:06,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 02:16:10,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:11,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 02:16:11,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:16:14,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 02:16:16,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:16:17,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:16:18,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:16:21,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:16:21,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:16:21,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:16:21,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:22,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:16:25,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:16:25,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:31,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:16:31,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:16:33,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:16:33,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:33,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1489866.6666666667, ans=0.0 2023-10-04 02:16:37,241 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.923e+02 2.044e+02 2.262e+02 2.758e+02, threshold=4.089e+02, percent-clipped=0.0 2023-10-04 02:16:39,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 02:16:39,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:16:44,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:16:44,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:16:44,259 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:16:45,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:16:47,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 02:16:49,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:16:51,139 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 02:16:51,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 02:16:52,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:16:52,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1490000.0, ans=0.125 2023-10-04 02:16:53,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:16:53,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 02:16:55,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:16:56,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:16:59,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:17:01,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:17:01,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:17:02,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:17:05,353 INFO [train.py:1046] (3/4) Epoch 43, batch 400, loss[loss=0.151, simple_loss=0.227, pruned_loss=0.03746, over 24563.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2326, pruned_loss=0.03718, over 4071269.91 frames. ], batch size: 60, lr: 2.37e-03, grad_scale: 32.0 2023-10-04 02:17:07,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:17:08,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:17:09,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 02:17:09,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:17:11,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:17:12,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:17:12,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:16,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:17:17,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:18,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1490066.6666666667, ans=0.2 2023-10-04 02:17:19,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 02:17:22,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 02:17:22,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:17:23,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 02:17:23,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:26,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:17:26,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:17:26,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 02:17:26,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:17:27,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:17:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:17:27,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:17:31,026 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 02:17:31,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 02:17:36,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:17:37,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:17:37,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 02:17:39,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 02:17:43,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:17:47,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:17:53,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 02:17:56,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:17:57,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 02:17:57,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1490266.6666666667, ans=0.0 2023-10-04 02:17:57,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1490266.6666666667, ans=0.1 2023-10-04 02:17:58,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:18:00,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:18:02,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 02:18:05,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:18:07,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:18:09,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:18:11,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:18:12,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 02:18:16,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:18:17,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 02:18:19,707 INFO [train.py:1046] (3/4) Epoch 43, batch 450, loss[loss=0.1658, simple_loss=0.2557, pruned_loss=0.03795, over 24341.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2342, pruned_loss=0.03774, over 4214492.23 frames. ], batch size: 77, lr: 2.37e-03, grad_scale: 32.0 2023-10-04 02:18:19,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:18:19,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:18:21,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 02:18:22,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:18:22,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:18:24,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:18:26,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 02:18:26,719 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.37 vs. limit=15.0 2023-10-04 02:18:27,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:18:27,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:18:28,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:18:28,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 02:18:28,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:18:30,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:18:33,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:18:36,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1490466.6666666667, ans=0.125 2023-10-04 02:18:43,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:18:43,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:18:44,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 02:18:46,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 02:18:49,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:18:51,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:18:53,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:18:56,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:18:56,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:18:58,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 02:18:59,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.42 vs. limit=15.0 2023-10-04 02:18:59,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 02:18:59,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 02:19:01,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:19:01,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:19:02,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:19:03,565 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 02:19:03,573 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 02:19:04,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:19:06,077 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.944e+02 2.181e+02 2.558e+02 3.848e+02, threshold=4.361e+02, percent-clipped=0.0 2023-10-04 02:19:06,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:19:07,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 02:19:10,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:19:10,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:19:11,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 02:19:11,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 02:19:14,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:19:18,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:19:18,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:19:19,006 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.11 vs. limit=10.0 2023-10-04 02:19:20,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 02:19:20,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1490666.6666666667, ans=0.125 2023-10-04 02:19:23,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:19:24,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 02:19:25,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 02:19:25,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:19:30,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:19:31,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:19:32,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1490666.6666666667, ans=0.125 2023-10-04 02:19:33,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:19:34,858 INFO [train.py:1046] (3/4) Epoch 43, batch 500, loss[loss=0.1858, simple_loss=0.2555, pruned_loss=0.05811, over 23769.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2344, pruned_loss=0.03788, over 4312099.57 frames. ], batch size: 179, lr: 2.37e-03, grad_scale: 32.0 2023-10-04 02:19:34,913 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 02:19:37,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:19:40,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:19:40,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:19:41,459 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 02:19:42,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 02:19:42,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:19:45,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 02:19:52,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 02:19:52,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:19:54,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:19:55,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:19:55,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:04,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:04,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:20:04,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:20:05,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:05,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 02:20:05,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:20:07,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1490866.6666666667, ans=0.2 2023-10-04 02:20:08,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:20:10,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:20:11,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:20:11,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:11,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 02:20:12,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1490866.6666666667, ans=0.125 2023-10-04 02:20:15,283 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 02:20:16,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:20:20,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:20,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:21,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:21,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:20:24,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 02:20:27,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:20:28,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:20:29,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1490933.3333333333, ans=0.125 2023-10-04 02:20:33,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:20:36,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:20:39,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1491000.0, ans=0.125 2023-10-04 02:20:41,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:20:41,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1491000.0, ans=0.125 2023-10-04 02:20:43,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1491000.0, ans=0.0 2023-10-04 02:20:45,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 02:20:45,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:20:45,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:20:47,659 INFO [train.py:1046] (3/4) Epoch 43, batch 550, loss[loss=0.1516, simple_loss=0.2426, pruned_loss=0.03026, over 24535.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2357, pruned_loss=0.03827, over 4402895.21 frames. ], batch size: 71, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:20:47,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 02:20:47,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:20:49,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:20:52,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 02:20:55,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 02:20:55,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:20:57,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 02:20:57,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:20:57,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:20:58,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:58,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:20:58,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:20:58,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1491066.6666666667, ans=0.0 2023-10-04 02:21:00,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:21:01,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:21:02,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 02:21:02,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:21:06,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1491133.3333333333, ans=0.125 2023-10-04 02:21:08,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:08,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:21:10,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:21:11,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:21:15,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 02:21:17,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 02:21:18,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:21:18,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1491200.0, ans=0.125 2023-10-04 02:21:23,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:21:23,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:21:25,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:21:26,915 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:21:27,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:28,003 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 02:21:28,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:21:30,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 02:21:32,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:21:32,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:21:32,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:21:34,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:35,575 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.973e+02 2.172e+02 2.445e+02 3.955e+02, threshold=4.345e+02, percent-clipped=0.0 2023-10-04 02:21:35,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 02:21:38,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 02:21:38,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:21:38,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:21:38,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:21:38,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:21:42,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:21:42,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:21:45,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:21:45,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:46,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 02:21:48,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:21:49,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:21:51,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:21:52,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:21:54,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:21:54,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 02:22:00,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 02:22:01,653 INFO [train.py:1046] (3/4) Epoch 43, batch 600, loss[loss=0.1401, simple_loss=0.2252, pruned_loss=0.02756, over 24460.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2372, pruned_loss=0.03885, over 4463136.95 frames. ], batch size: 58, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:22:03,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 02:22:03,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:22:03,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:22:05,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:22:05,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1491400.0, ans=0.0 2023-10-04 02:22:10,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:22:13,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:22:15,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 02:22:17,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:22:19,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:22:21,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:22:23,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 02:22:23,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:22:29,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 02:22:32,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:22:32,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:22:32,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:22:37,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:22:38,276 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-10-04 02:22:38,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:22:38,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:22:44,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:22:48,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:22:48,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:22:48,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:22:52,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1491600.0, ans=0.0 2023-10-04 02:22:54,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1491600.0, ans=0.125 2023-10-04 02:22:56,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 02:23:01,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:23:01,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:23:03,292 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-10-04 02:23:05,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 02:23:06,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:23:06,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1491666.6666666667, ans=0.125 2023-10-04 02:23:08,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 02:23:08,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:23:09,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:23:14,848 INFO [train.py:1046] (3/4) Epoch 43, batch 650, loss[loss=0.1658, simple_loss=0.2556, pruned_loss=0.03796, over 24653.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.235, pruned_loss=0.03829, over 4507882.12 frames. ], batch size: 73, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:23:14,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 02:23:16,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:23:19,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:23:20,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:23:23,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:25,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 02:23:26,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:23:31,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:23:31,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:23:34,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:23:36,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1491800.0, ans=0.125 2023-10-04 02:23:37,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 02:23:38,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:23:40,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:23:41,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:23:41,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 02:23:44,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:23:44,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:45,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:23:45,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:48,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:23:51,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:23:51,322 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 02:23:51,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:23:51,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:23:51,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1491866.6666666667, ans=0.125 2023-10-04 02:23:54,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:23:54,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1491866.6666666667, ans=0.015 2023-10-04 02:23:55,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:23:55,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:23:55,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1491866.6666666667, ans=0.1 2023-10-04 02:23:57,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:23:58,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 02:23:59,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:23:59,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:24:01,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:24:01,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:24:02,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:24:03,910 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.962e+02 2.234e+02 2.555e+02 3.806e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-04 02:24:04,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 02:24:04,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 02:24:05,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:05,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:24:05,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:24:05,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:24:08,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:24:15,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:15,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:24:16,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:24:19,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:24:19,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:24:21,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:24:23,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1492000.0, ans=0.125 2023-10-04 02:24:27,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:24:27,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:24:28,411 INFO [train.py:1046] (3/4) Epoch 43, batch 700, loss[loss=0.1575, simple_loss=0.2301, pruned_loss=0.04245, over 23371.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2338, pruned_loss=0.03806, over 4550067.81 frames. ], batch size: 285, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:24:29,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:24:29,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:24:29,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1492066.6666666667, ans=0.125 2023-10-04 02:24:33,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 02:24:33,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 02:24:36,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 02:24:37,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:38,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:24:38,986 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:24:40,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 02:24:41,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1492133.3333333333, ans=0.1 2023-10-04 02:24:45,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:24:48,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:24:49,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:51,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:24:51,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:24:54,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:24:55,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 02:24:55,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:24:59,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 02:25:04,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 02:25:06,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:25:06,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:25:08,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:25:11,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:25:11,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 02:25:12,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1492266.6666666667, ans=0.125 2023-10-04 02:25:15,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:25:15,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:25:15,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 02:25:18,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:25:20,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:25:21,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:25:25,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1492333.3333333333, ans=0.0 2023-10-04 02:25:28,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:25:28,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 02:25:32,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.38 vs. limit=15.0 2023-10-04 02:25:32,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 02:25:34,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 02:25:35,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:25:37,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:25:37,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:25:39,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:25:39,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 02:25:42,706 INFO [train.py:1046] (3/4) Epoch 43, batch 750, loss[loss=0.1529, simple_loss=0.2366, pruned_loss=0.03456, over 23266.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2341, pruned_loss=0.03756, over 4591669.20 frames. ], batch size: 93, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:25:42,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 02:25:44,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 02:25:44,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 02:25:44,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 02:25:45,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 02:25:45,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:25:48,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 02:25:48,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:25:50,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:25:51,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:25:54,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:25:54,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:25:54,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:25:54,926 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.27 vs. limit=15.0 2023-10-04 02:25:56,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:25:58,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:26:00,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:26:03,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:26:03,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:26:03,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 02:26:04,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1492466.6666666667, ans=0.0 2023-10-04 02:26:05,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:26:05,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:26:07,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:26:09,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:26:09,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 02:26:09,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:26:12,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 02:26:12,589 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 02:26:12,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 02:26:12,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:26:13,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 02:26:15,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:26:22,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:26:23,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:26:23,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:26:25,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:26:26,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:26:28,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 02:26:28,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:26:31,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 02:26:31,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:26:32,460 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.914e+02 2.119e+02 2.380e+02 3.754e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-04 02:26:35,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:26:35,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 02:26:35,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:26:41,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:26:41,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:26:43,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:26:44,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:26:48,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 02:26:48,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:26:50,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:26:52,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:26:52,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:26:56,303 INFO [train.py:1046] (3/4) Epoch 43, batch 800, loss[loss=0.1515, simple_loss=0.2262, pruned_loss=0.03841, over 23663.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2352, pruned_loss=0.03743, over 4636257.50 frames. ], batch size: 135, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:26:56,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:26:56,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:26:59,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1492733.3333333333, ans=0.125 2023-10-04 02:27:05,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:27:05,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:06,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:27:07,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:27:08,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:08,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:08,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:08,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1492733.3333333333, ans=0.125 2023-10-04 02:27:13,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:27:14,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:27:17,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 02:27:18,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:18,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:27:19,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:27:19,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:27:19,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 02:27:19,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:27:21,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 02:27:24,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:27,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:27:28,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:27:28,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:27:33,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:33,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:33,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1492866.6666666667, ans=10.0 2023-10-04 02:27:37,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:27:39,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:27:39,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 02:27:39,193 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 02:27:40,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 02:27:40,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:27:40,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:27:43,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:27:43,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:27:43,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1492933.3333333333, ans=0.125 2023-10-04 02:27:46,751 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 02:27:48,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 02:27:49,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:27:50,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:27:51,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1492933.3333333333, ans=0.125 2023-10-04 02:27:52,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1492933.3333333333, ans=0.1 2023-10-04 02:27:54,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:27:58,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:27:59,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 02:27:59,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:28:02,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 02:28:02,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1493000.0, ans=0.125 2023-10-04 02:28:08,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:28:10,104 INFO [train.py:1046] (3/4) Epoch 43, batch 850, loss[loss=0.1412, simple_loss=0.2198, pruned_loss=0.03128, over 24329.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2354, pruned_loss=0.03759, over 4654848.34 frames. ], batch size: 56, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:28:12,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:28:13,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 02:28:13,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:28:16,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:28:16,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 02:28:16,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:28:18,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:28:20,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:21,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:28:23,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:28:23,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 02:28:23,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1493133.3333333333, ans=0.125 2023-10-04 02:28:24,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 02:28:24,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 02:28:26,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:28:26,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:28:27,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:27,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:28:29,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:28:32,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:28:33,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:28:33,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 02:28:36,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 02:28:38,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:28:39,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 02:28:43,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 02:28:44,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 02:28:46,297 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 02:28:47,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:28:47,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:28:47,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 02:28:50,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:50,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:28:51,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 02:28:53,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1493266.6666666667, ans=0.05 2023-10-04 02:28:54,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:28:56,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:28:56,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:28:56,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:28:57,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:28:59,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 02:29:00,302 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.958e+02 2.154e+02 2.504e+02 4.006e+02, threshold=4.308e+02, percent-clipped=0.0 2023-10-04 02:29:00,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 02:29:05,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:29:05,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:29:06,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:29:06,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:29:07,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:29:11,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1493333.3333333333, ans=0.125 2023-10-04 02:29:13,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:29:15,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:29:16,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:29:16,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:29:16,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:29:24,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:29:25,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:29:26,257 INFO [train.py:1046] (3/4) Epoch 43, batch 900, loss[loss=0.1698, simple_loss=0.2467, pruned_loss=0.04648, over 23545.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2365, pruned_loss=0.03816, over 4664021.76 frames. ], batch size: 256, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:29:26,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 02:29:27,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:29:27,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:29:27,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 02:29:32,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:29:37,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:29:37,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 02:29:37,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1493400.0, ans=0.125 2023-10-04 02:29:39,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:29:40,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 02:29:41,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 02:29:43,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:29:43,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:29:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:29:43,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:29:47,051 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.95 vs. limit=10.0 2023-10-04 02:29:50,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1493466.6666666667, ans=0.0 2023-10-04 02:29:52,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:29:52,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:29:53,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:29:56,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:29:59,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1493533.3333333333, ans=0.125 2023-10-04 02:30:02,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 02:30:03,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.56 vs. limit=22.5 2023-10-04 02:30:03,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:30:06,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:30:06,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:30:08,344 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 02:30:09,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 02:30:14,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:30:14,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:30:15,933 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.07 vs. limit=15.0 2023-10-04 02:30:16,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:30:20,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1493600.0, ans=0.0 2023-10-04 02:30:23,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:30:23,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:30:24,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 02:30:24,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:30:25,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1493666.6666666667, ans=0.1 2023-10-04 02:30:27,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 02:30:30,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:30:30,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:30:33,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:30:33,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:30:36,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 02:30:36,296 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 02:30:37,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 02:30:37,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 02:30:40,918 INFO [train.py:1046] (3/4) Epoch 43, batch 950, loss[loss=0.1464, simple_loss=0.2144, pruned_loss=0.03923, over 22629.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2367, pruned_loss=0.03842, over 4670974.90 frames. ], batch size: 322, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:30:41,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1493733.3333333333, ans=0.125 2023-10-04 02:30:42,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:30:43,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 02:30:50,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:30:51,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:30:51,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:30:53,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 02:30:54,790 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 02:30:57,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:30:57,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:30:58,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:30:58,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:30:58,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 02:31:00,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:31:02,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:03,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 02:31:03,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1493800.0, ans=0.0 2023-10-04 02:31:04,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:31:09,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:09,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:31:09,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:31:11,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 02:31:13,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 02:31:15,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:31:17,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:31:21,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:31:21,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:31:23,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.52 vs. limit=12.0 2023-10-04 02:31:26,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 02:31:27,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 02:31:27,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:31:28,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:31:30,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:30,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:31:32,264 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 1.990e+02 2.144e+02 2.470e+02 4.825e+02, threshold=4.288e+02, percent-clipped=1.0 2023-10-04 02:31:33,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 02:31:33,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:31:34,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=1493933.3333333333, ans=0.5 2023-10-04 02:31:36,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:31:36,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:36,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 02:31:36,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:31:36,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:31:36,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 02:31:42,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:31:46,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:31:47,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1494000.0, ans=0.125 2023-10-04 02:31:48,484 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.49 vs. limit=15.0 2023-10-04 02:31:50,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:31:52,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 02:31:52,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 02:31:54,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1494000.0, ans=0.1 2023-10-04 02:31:55,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:31:56,475 INFO [train.py:1046] (3/4) Epoch 43, batch 1000, loss[loss=0.1467, simple_loss=0.2269, pruned_loss=0.0332, over 24641.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2353, pruned_loss=0.0381, over 4679543.65 frames. ], batch size: 60, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:31:59,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 02:32:00,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:01,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1494066.6666666667, ans=0.04949747468305833 2023-10-04 02:32:04,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:32:05,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 02:32:05,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 02:32:06,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.70 vs. limit=22.5 2023-10-04 02:32:08,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1494066.6666666667, ans=0.125 2023-10-04 02:32:11,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:32:11,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:32:12,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:32:12,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1494133.3333333333, ans=0.125 2023-10-04 02:32:15,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 02:32:19,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 02:32:20,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 02:32:20,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:32:21,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1494133.3333333333, ans=0.0 2023-10-04 02:32:22,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 02:32:24,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 02:32:24,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 02:32:26,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:32:27,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:33,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:32:35,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:32:35,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:35,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:32:36,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 02:32:36,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:32:38,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:32:38,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:32:39,456 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 02:32:42,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 02:32:44,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 02:32:46,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 02:32:48,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:32:54,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:54,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:32:56,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:32:57,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:32:58,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1494333.3333333333, ans=0.0 2023-10-04 02:32:59,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 02:32:59,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:32:59,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 02:33:00,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 02:33:02,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:33:02,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:33:04,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:33:08,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:33:09,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:33:10,796 INFO [train.py:1046] (3/4) Epoch 43, batch 1050, loss[loss=0.1481, simple_loss=0.2379, pruned_loss=0.02911, over 24672.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2345, pruned_loss=0.03756, over 4696034.46 frames. ], batch size: 65, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:33:13,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:33:15,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:33:17,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:33:18,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:33:20,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:33:22,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:33:22,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:33:23,377 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.16 vs. limit=12.0 2023-10-04 02:33:25,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:33:26,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:33:26,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:33:27,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1494466.6666666667, ans=0.1 2023-10-04 02:33:28,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:33:28,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 02:33:29,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:33:29,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 02:33:31,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1494466.6666666667, ans=0.125 2023-10-04 02:33:32,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:33:32,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 02:33:32,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:33:39,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:33:39,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:33:40,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:33:42,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 02:33:42,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 02:33:42,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:33:43,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1494533.3333333333, ans=0.1 2023-10-04 02:33:45,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 02:33:50,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 02:33:50,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:33:53,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 02:33:55,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 02:33:55,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:33:56,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:33:59,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:34:02,039 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.937e+02 2.146e+02 2.350e+02 6.827e+02, threshold=4.291e+02, percent-clipped=1.0 2023-10-04 02:34:03,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 02:34:05,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 02:34:06,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 02:34:06,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:34:06,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:34:08,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 02:34:12,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:34:12,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:34:14,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:34:14,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:34:14,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:34:21,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:34:21,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 02:34:22,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:34:22,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 02:34:22,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 02:34:22,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:34:25,355 INFO [train.py:1046] (3/4) Epoch 43, batch 1100, loss[loss=0.1598, simple_loss=0.241, pruned_loss=0.03931, over 24033.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2344, pruned_loss=0.0374, over 4692374.60 frames. ], batch size: 80, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:34:26,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:34:29,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1494733.3333333333, ans=0.125 2023-10-04 02:34:32,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:34:36,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:34:37,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:34:37,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:34:37,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 02:34:39,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:34:40,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 02:34:43,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:34:45,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:34:47,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 02:34:47,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 02:34:48,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:34:48,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:34:49,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1494800.0, ans=0.1 2023-10-04 02:34:50,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:34:50,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1494800.0, ans=0.1 2023-10-04 02:34:53,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 02:34:59,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:34:59,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1494866.6666666667, ans=0.0 2023-10-04 02:35:01,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 02:35:02,465 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 02:35:02,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:05,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:05,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:35:05,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:35:06,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 02:35:06,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:35:06,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:35:06,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:35:08,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:08,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 02:35:08,637 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.72 vs. limit=22.5 2023-10-04 02:35:14,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:35:15,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 02:35:15,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:35:17,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1494933.3333333333, ans=0.0 2023-10-04 02:35:19,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1494933.3333333333, ans=0.0 2023-10-04 02:35:22,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:35:25,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 02:35:25,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 02:35:25,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:35:27,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:35:28,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:35:29,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 02:35:29,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:35:31,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:35:31,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 02:35:32,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:35:32,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 02:35:33,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:35:33,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:35:35,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:35:36,014 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.11 vs. limit=15.0 2023-10-04 02:35:39,339 INFO [train.py:1046] (3/4) Epoch 43, batch 1150, loss[loss=0.1651, simple_loss=0.2524, pruned_loss=0.03887, over 24270.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2349, pruned_loss=0.03744, over 4700843.74 frames. ], batch size: 77, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:35:41,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:35:42,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:35:44,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:35:44,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1495066.6666666667, ans=0.0 2023-10-04 02:35:45,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:35:45,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 02:35:47,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:35:47,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1495066.6666666667, ans=0.125 2023-10-04 02:35:49,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 02:35:51,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:35:51,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:35:59,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 02:36:01,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:36:03,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:36:04,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:36:04,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 02:36:04,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:36:06,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:36:10,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 02:36:11,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:36:11,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1495200.0, ans=0.125 2023-10-04 02:36:12,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:36:22,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:36:28,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:36:28,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 02:36:29,280 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:36:30,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:36:30,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:36:31,604 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.642e+02 1.978e+02 2.214e+02 2.534e+02 4.016e+02, threshold=4.429e+02, percent-clipped=0.0 2023-10-04 02:36:34,041 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.56 vs. limit=6.0 2023-10-04 02:36:36,099 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 02:36:38,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:36:44,557 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 02:36:44,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1495333.3333333333, ans=0.125 2023-10-04 02:36:49,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:36:52,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:36:52,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:36:52,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:36:53,398 INFO [train.py:1046] (3/4) Epoch 43, batch 1200, loss[loss=0.1471, simple_loss=0.223, pruned_loss=0.03557, over 23515.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2351, pruned_loss=0.03724, over 4711690.32 frames. ], batch size: 134, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:36:55,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:37:02,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:37:02,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:37:03,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:37:03,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:37:03,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:37:04,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:37:06,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:37:06,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1495400.0, ans=0.125 2023-10-04 02:37:07,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:37:07,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:37:11,474 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 02:37:11,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1495466.6666666667, ans=0.0 2023-10-04 02:37:12,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 02:37:17,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:37:18,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:37:20,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:37:23,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:37:23,393 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 02:37:25,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:37:28,524 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.57 vs. limit=6.0 2023-10-04 02:37:32,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 02:37:32,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:37:32,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 02:37:33,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.86 vs. limit=15.0 2023-10-04 02:37:33,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:37:36,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 02:37:40,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 02:37:40,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:37:42,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:37:43,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:37:45,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:37:45,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:37:45,877 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.30 vs. limit=15.0 2023-10-04 02:37:46,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:37:46,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:37:46,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 02:37:47,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:37:47,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:37:48,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:37:51,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:37:51,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:37:52,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:37:55,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:37:59,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 02:38:01,780 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 02:38:03,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:38:06,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:38:07,764 INFO [train.py:1046] (3/4) Epoch 43, batch 1250, loss[loss=0.1791, simple_loss=0.2515, pruned_loss=0.05338, over 22699.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2361, pruned_loss=0.0378, over 4719113.48 frames. ], batch size: 322, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:38:07,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:38:08,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1495733.3333333333, ans=0.125 2023-10-04 02:38:09,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1495733.3333333333, ans=0.125 2023-10-04 02:38:10,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:38:10,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 02:38:14,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:38:16,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:38:17,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 02:38:18,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:38:20,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:38:24,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 02:38:26,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:38:28,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:38:28,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:38:32,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:38:34,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 02:38:34,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:38:34,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:38:38,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:38:38,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:38:41,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:38:42,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:38:46,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 02:38:46,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:38:49,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:38:50,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 02:38:52,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:38:52,694 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 02:38:52,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:38:52,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:38:55,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:38:58,746 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.956e+02 2.157e+02 2.335e+02 3.543e+02, threshold=4.314e+02, percent-clipped=0.0 2023-10-04 02:38:58,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:38:58,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:39:00,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 02:39:00,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 02:39:01,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 02:39:03,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:39:04,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 02:39:04,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:39:08,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 02:39:08,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:39:11,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 02:39:11,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 02:39:11,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1496000.0, ans=0.1 2023-10-04 02:39:12,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:39:12,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 02:39:12,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:39:15,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 02:39:16,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:39:18,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:39:18,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:39:20,698 INFO [train.py:1046] (3/4) Epoch 43, batch 1300, loss[loss=0.1631, simple_loss=0.2459, pruned_loss=0.04009, over 23330.00 frames. ], tot_loss[loss=0.1567, simple_loss=0.2369, pruned_loss=0.03828, over 4705403.57 frames. ], batch size: 105, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:39:20,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:39:24,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1496066.6666666667, ans=0.125 2023-10-04 02:39:25,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:39:26,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 02:39:30,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:39:32,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 02:39:32,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:39:33,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:39:35,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:39:35,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1496133.3333333333, ans=0.2 2023-10-04 02:39:36,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 02:39:41,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:39:42,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:39:43,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 02:39:47,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:39:50,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:39:50,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:39:52,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:39:52,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1496200.0, ans=0.125 2023-10-04 02:39:54,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:39:55,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:39:55,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 02:39:56,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 02:40:02,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:40:02,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:40:04,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 02:40:06,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 02:40:07,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:40:09,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:40:09,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 02:40:11,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:40:11,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 02:40:12,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:40:17,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:40:17,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:40:20,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 02:40:22,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 02:40:23,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 02:40:24,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1496333.3333333333, ans=0.2 2023-10-04 02:40:26,043 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.35 vs. limit=15.0 2023-10-04 02:40:28,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:40:30,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 02:40:30,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:40:30,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1496333.3333333333, ans=0.2 2023-10-04 02:40:36,454 INFO [train.py:1046] (3/4) Epoch 43, batch 1350, loss[loss=0.1623, simple_loss=0.2171, pruned_loss=0.05376, over 19349.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2357, pruned_loss=0.0383, over 4693599.18 frames. ], batch size: 388, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:40:37,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 02:40:40,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:40:42,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:40:45,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:40:45,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:40:45,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1496400.0, ans=0.125 2023-10-04 02:40:46,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:40:46,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:40:51,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:40:52,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 02:40:54,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:40:55,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:40:56,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1496466.6666666667, ans=0.0 2023-10-04 02:40:57,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 02:40:58,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:41:01,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:41:01,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 02:41:03,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 02:41:06,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 02:41:06,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:41:07,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 02:41:10,585 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:41:12,506 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.60 vs. limit=15.0 2023-10-04 02:41:15,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1496533.3333333333, ans=0.0 2023-10-04 02:41:18,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:41:28,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:41:28,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:41:29,142 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.909e+02 2.129e+02 2.419e+02 3.786e+02, threshold=4.258e+02, percent-clipped=0.0 2023-10-04 02:41:29,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 02:41:32,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:41:32,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 02:41:32,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 02:41:32,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:41:35,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:41:37,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 02:41:38,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:41:43,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-10-04 02:41:44,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 02:41:44,996 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.37 vs. limit=6.0 2023-10-04 02:41:46,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1496666.6666666667, ans=0.0 2023-10-04 02:41:47,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 02:41:49,495 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.20 vs. limit=12.0 2023-10-04 02:41:50,249 INFO [train.py:1046] (3/4) Epoch 43, batch 1400, loss[loss=0.1641, simple_loss=0.2391, pruned_loss=0.04457, over 23872.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2341, pruned_loss=0.03769, over 4688443.60 frames. ], batch size: 195, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:41:50,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1496733.3333333333, ans=0.1 2023-10-04 02:41:53,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 02:41:54,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:41:57,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:41:58,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:42:02,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 02:42:03,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 02:42:08,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.58 vs. limit=12.0 2023-10-04 02:42:14,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:42:15,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:42:18,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:42:18,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 02:42:24,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:42:24,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 02:42:32,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:42:32,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:42:36,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 02:42:37,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:42:37,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:42:37,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:42:39,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:42:40,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:42:40,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:42:41,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1496933.3333333333, ans=0.1 2023-10-04 02:42:42,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:42:43,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 02:42:43,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:42:49,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:42:52,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:42:58,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 02:42:58,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 02:43:00,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:43:03,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 02:43:03,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:43:04,349 INFO [train.py:1046] (3/4) Epoch 43, batch 1450, loss[loss=0.1592, simple_loss=0.2511, pruned_loss=0.0337, over 24625.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2339, pruned_loss=0.03732, over 4697702.19 frames. ], batch size: 68, lr: 2.37e-03, grad_scale: 4.0 2023-10-04 02:43:05,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:43:08,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:43:10,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:43:10,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:10,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 02:43:12,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1497066.6666666667, ans=0.125 2023-10-04 02:43:14,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:43:14,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:43:16,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:43:16,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 02:43:18,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:43:18,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 02:43:18,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:19,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:19,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 02:43:21,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:43:22,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:43:22,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 02:43:22,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:24,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:43:25,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:28,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:29,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1497133.3333333333, ans=0.0 2023-10-04 02:43:30,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1497133.3333333333, ans=0.05 2023-10-04 02:43:31,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:43:31,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:43:34,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:43:34,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:36,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:43:36,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:43:36,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:43:36,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:43:41,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 02:43:43,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:43:46,024 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 02:43:47,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:43:48,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1497266.6666666667, ans=0.0 2023-10-04 02:43:49,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:43:51,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:43:52,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 02:43:56,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:43:57,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 02:43:59,139 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 2.025e+02 2.199e+02 2.536e+02 7.667e+02, threshold=4.399e+02, percent-clipped=1.0 2023-10-04 02:44:00,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 02:44:02,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:44:02,991 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.67 vs. limit=15.0 2023-10-04 02:44:05,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:44:06,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:44:08,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 02:44:09,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 02:44:10,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 02:44:12,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:44:12,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:44:18,506 INFO [train.py:1046] (3/4) Epoch 43, batch 1500, loss[loss=0.1645, simple_loss=0.2517, pruned_loss=0.0386, over 23258.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2347, pruned_loss=0.03743, over 4695923.00 frames. ], batch size: 93, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:44:23,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 02:44:23,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:44:23,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:44:24,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:44:25,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1497400.0, ans=0.125 2023-10-04 02:44:26,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:44:26,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:44:27,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 02:44:29,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:44:29,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:44:29,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:44:30,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:44:32,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:44:33,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:44:40,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:44:40,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 02:44:41,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:44:41,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:44:43,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:44:47,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 02:44:50,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 02:44:51,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:44:51,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 02:44:53,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 02:44:56,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:44:57,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:44:57,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:44:59,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 02:44:59,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:44:59,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:45:00,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 02:45:02,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:45:06,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:45:06,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 02:45:11,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 02:45:12,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:45:12,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1497600.0, ans=0.2 2023-10-04 02:45:16,930 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 02:45:18,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:18,271 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 02:45:19,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:45:21,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:45:21,165 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 02:45:22,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:45:27,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 02:45:28,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:29,441 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.26 vs. limit=15.0 2023-10-04 02:45:30,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1497733.3333333333, ans=0.0 2023-10-04 02:45:31,672 INFO [train.py:1046] (3/4) Epoch 43, batch 1550, loss[loss=0.1375, simple_loss=0.2122, pruned_loss=0.03138, over 24318.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2353, pruned_loss=0.0375, over 4710718.94 frames. ], batch size: 56, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:45:31,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:45:31,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:33,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:45:33,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:45:34,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:45:36,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 02:45:36,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 02:45:36,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:45:38,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 02:45:39,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 02:45:40,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:45:42,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:45:42,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:45:42,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:45:43,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:45:43,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:45:46,378 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 02:45:48,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:45:48,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:45:49,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 02:45:51,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:45:52,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 02:45:53,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:45:53,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 02:45:53,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 02:45:53,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 02:45:55,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:45:55,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:46:00,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:46:02,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 02:46:02,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 02:46:10,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:46:12,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1497866.6666666667, ans=0.125 2023-10-04 02:46:13,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:46:13,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 02:46:13,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:46:13,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1497866.6666666667, ans=0.0 2023-10-04 02:46:14,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 02:46:19,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 02:46:22,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:46:23,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:46:24,408 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.25 vs. limit=15.0 2023-10-04 02:46:26,650 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.915e+02 2.072e+02 2.347e+02 3.023e+02, threshold=4.143e+02, percent-clipped=0.0 2023-10-04 02:46:26,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:46:26,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:46:28,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 02:46:28,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:46:29,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:46:29,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:46:31,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 02:46:31,604 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 02:46:34,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:46:39,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 02:46:43,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:46:44,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:46:44,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 02:46:45,976 INFO [train.py:1046] (3/4) Epoch 43, batch 1600, loss[loss=0.1559, simple_loss=0.2409, pruned_loss=0.03545, over 24499.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2367, pruned_loss=0.03796, over 4710964.17 frames. ], batch size: 63, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:46:47,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:46:49,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:46:49,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:46:49,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:46:50,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.79 vs. limit=12.0 2023-10-04 02:46:50,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:46:54,116 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.73 vs. limit=22.5 2023-10-04 02:46:54,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:46:55,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 02:46:56,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 02:46:57,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 02:46:59,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:47:01,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 02:47:02,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:47:04,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:47:09,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:47:11,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 02:47:14,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:47:16,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 02:47:16,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:16,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 02:47:17,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1498200.0, ans=0.1 2023-10-04 02:47:22,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 02:47:23,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1498200.0, ans=0.0 2023-10-04 02:47:29,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:47:29,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 02:47:31,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:47:31,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:47:31,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:47:31,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1498266.6666666667, ans=0.125 2023-10-04 02:47:32,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 02:47:37,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 02:47:39,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:47:39,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:40,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:40,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:47:42,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:47:43,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:47:44,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:47:52,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:47:52,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:47:55,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 02:47:55,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:47:56,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 02:48:00,927 INFO [train.py:1046] (3/4) Epoch 43, batch 1650, loss[loss=0.1612, simple_loss=0.2459, pruned_loss=0.03823, over 23251.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03808, over 4713245.36 frames. ], batch size: 93, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:48:03,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:48:05,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:48:05,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:48:05,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 02:48:05,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 02:48:05,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 02:48:05,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 02:48:07,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1498400.0, ans=0.125 2023-10-04 02:48:07,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1498400.0, ans=0.125 2023-10-04 02:48:09,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:48:09,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:48:09,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:48:10,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:48:14,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:48:16,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 02:48:17,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:48:19,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:48:19,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:48:19,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:48:19,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 02:48:20,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 02:48:26,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:48:27,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 02:48:33,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1498533.3333333333, ans=0.1 2023-10-04 02:48:35,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 02:48:35,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:48:38,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 02:48:41,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:48:43,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:48:43,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:48:43,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:48:45,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:48:45,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:48:49,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:48:49,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:48:50,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:48:50,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:48:50,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:48:51,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:48:54,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:48:54,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1498600.0, ans=0.2 2023-10-04 02:48:55,938 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.993e+02 2.180e+02 2.514e+02 3.925e+02, threshold=4.360e+02, percent-clipped=0.0 2023-10-04 02:48:56,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 02:48:57,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:48:57,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 02:49:00,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 02:49:00,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 02:49:00,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:49:02,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:49:02,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:49:03,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:49:03,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 02:49:06,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:49:08,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:49:08,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:49:11,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 02:49:13,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1498666.6666666667, ans=0.0 2023-10-04 02:49:14,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:49:14,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:49:14,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 02:49:15,718 INFO [train.py:1046] (3/4) Epoch 43, batch 1700, loss[loss=0.1309, simple_loss=0.2011, pruned_loss=0.03038, over 23421.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2364, pruned_loss=0.03774, over 4714569.13 frames. ], batch size: 285, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:49:15,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:49:15,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:49:15,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:49:20,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:49:20,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:49:21,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 02:49:23,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 02:49:31,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:49:34,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:49:41,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:49:41,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:49:41,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:49:41,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:49:44,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 02:49:46,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:49:46,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:49:48,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 02:49:50,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 02:49:52,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 02:49:52,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 02:49:54,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:49:55,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 02:49:56,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:49:58,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1498933.3333333333, ans=0.125 2023-10-04 02:50:04,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:50:04,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:05,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:50:06,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 02:50:06,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 02:50:07,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:50:10,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:50:10,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 02:50:11,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:50:11,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:50:11,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:50:11,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:50:13,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:50:13,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:50:15,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:16,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:50:16,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:50:20,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:50:21,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 02:50:24,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:50:24,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:50:26,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 02:50:29,690 INFO [train.py:1046] (3/4) Epoch 43, batch 1750, loss[loss=0.1529, simple_loss=0.2265, pruned_loss=0.03964, over 23562.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2347, pruned_loss=0.03731, over 4710325.15 frames. ], batch size: 256, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:50:31,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:34,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:50:34,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 02:50:35,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 02:50:36,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:50:39,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:50:39,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:50:43,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 02:50:45,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:50:48,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 02:50:48,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:50:50,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:50:53,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 02:50:53,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 02:50:56,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:50:56,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 02:50:57,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1499133.3333333333, ans=0.5 2023-10-04 02:51:03,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1499200.0, ans=0.125 2023-10-04 02:51:04,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:51:07,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:51:07,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:51:12,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:12,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:51:12,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1499200.0, ans=0.0 2023-10-04 02:51:14,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:51:16,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:17,488 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.75 vs. limit=12.0 2023-10-04 02:51:19,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:51:20,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:51:20,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 02:51:22,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:51:24,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 02:51:24,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:51:26,924 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 2.005e+02 2.231e+02 2.661e+02 3.753e+02, threshold=4.462e+02, percent-clipped=0.0 2023-10-04 02:51:27,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:51:27,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:51:29,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:51:29,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 02:51:31,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:33,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:51:33,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1499333.3333333333, ans=0.125 2023-10-04 02:51:36,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:51:38,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:51:40,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:51:40,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1499333.3333333333, ans=0.1 2023-10-04 02:51:41,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 02:51:41,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:51:43,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 02:51:43,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:51:43,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 02:51:43,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:51:44,688 INFO [train.py:1046] (3/4) Epoch 43, batch 1800, loss[loss=0.1435, simple_loss=0.225, pruned_loss=0.03103, over 23491.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2344, pruned_loss=0.03733, over 4711358.56 frames. ], batch size: 105, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:51:44,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:51:49,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 02:51:49,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:51:51,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 02:51:54,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:51:55,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 02:51:56,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:51:59,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:02,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:52:03,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:52:04,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:52:07,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 02:52:07,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 02:52:08,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:11,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:15,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 02:52:18,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 02:52:18,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 02:52:18,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:19,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:52:19,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:52:19,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:52:24,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1499533.3333333333, ans=0.0 2023-10-04 02:52:25,540 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 02:52:26,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:52:28,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:29,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 02:52:29,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 02:52:31,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 02:52:32,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:52:33,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:52:35,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1499600.0, ans=0.2 2023-10-04 02:52:38,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 02:52:44,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:52:45,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 02:52:46,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:52:46,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:46,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:52:46,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 02:52:50,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:52:50,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:52:53,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 02:52:53,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:52:56,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:52:56,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:52:56,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:57,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:52:57,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:52:58,936 INFO [train.py:1046] (3/4) Epoch 43, batch 1850, loss[loss=0.1457, simple_loss=0.2237, pruned_loss=0.03388, over 23092.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2348, pruned_loss=0.03759, over 4712590.90 frames. ], batch size: 105, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:53:00,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:53:00,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:53:03,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:53:04,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:53:10,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:53:10,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 02:53:13,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 02:53:17,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 02:53:21,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:53:21,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 02:53:21,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 02:53:32,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:53:33,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 02:53:36,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:53:37,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:53:41,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 02:53:41,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:53:41,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:53:42,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:53:45,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 02:53:46,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:53:47,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1499933.3333333333, ans=0.1 2023-10-04 02:53:50,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:53:51,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:53:51,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 02:53:51,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:53:53,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:53:55,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:53:56,298 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.697e+02 1.939e+02 2.109e+02 2.438e+02 4.084e+02, threshold=4.217e+02, percent-clipped=0.0 2023-10-04 02:53:58,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 02:53:59,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:54:03,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:54:03,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 02:54:03,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 02:54:03,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 02:54:04,956 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 02:54:06,376 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 02:54:09,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:54:09,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:54:09,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:54:09,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:10,376 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 02:54:10,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:54:10,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:11,669 INFO [train.py:1046] (3/4) Epoch 43, batch 1900, loss[loss=0.148, simple_loss=0.2401, pruned_loss=0.02789, over 24325.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2352, pruned_loss=0.03732, over 4724391.00 frames. ], batch size: 74, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:54:11,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:54:13,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:54:15,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:54:15,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 02:54:17,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:17,892 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 02:54:17,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 02:54:19,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:54:24,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:54:27,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 02:54:27,784 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 02:54:29,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 02:54:30,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 02:54:31,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:54:31,787 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 02:54:31,822 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 02:54:33,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1500133.3333333333, ans=0.0 2023-10-04 02:54:36,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 02:54:37,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:54:40,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 02:54:41,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 02:54:51,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 02:54:53,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 02:54:53,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:54:55,525 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 02:54:55,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 02:54:55,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 02:54:55,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 02:54:55,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:01,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 02:55:02,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 02:55:05,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:55:05,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 02:55:06,530 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.81 vs. limit=12.0 2023-10-04 02:55:07,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:55:11,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 02:55:11,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:55:18,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 02:55:18,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:55:20,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:55:20,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:55:21,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 02:55:21,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 02:55:23,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:55:25,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:55:25,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:55:26,727 INFO [train.py:1046] (3/4) Epoch 43, batch 1950, loss[loss=0.1388, simple_loss=0.2242, pruned_loss=0.02675, over 24488.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2364, pruned_loss=0.03794, over 4718377.59 frames. ], batch size: 66, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:55:28,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:55:29,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:55:29,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 02:55:31,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:55:31,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1500400.0, ans=0.125 2023-10-04 02:55:32,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:55:35,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 02:55:35,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:55:35,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:55:38,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 02:55:38,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 02:55:39,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:55:40,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:55:42,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:55:42,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:55:42,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:45,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:55:48,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:55:48,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:55:48,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 02:55:48,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:53,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:55:56,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 02:55:56,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:55:56,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 02:55:56,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 02:55:57,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:55:57,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:55:58,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:56:02,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:56:04,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1500533.3333333333, ans=0.2 2023-10-04 02:56:05,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:56:08,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1500533.3333333333, ans=0.125 2023-10-04 02:56:09,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 02:56:12,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:56:12,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:56:12,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 02:56:12,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:56:16,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:56:17,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 02:56:17,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:56:25,116 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.031e+02 2.275e+02 2.589e+02 3.753e+02, threshold=4.549e+02, percent-clipped=0.0 2023-10-04 02:56:25,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:25,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:27,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:30,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:56:30,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1500666.6666666667, ans=0.125 2023-10-04 02:56:32,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:56:33,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:56:35,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 02:56:35,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 02:56:35,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:56:36,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 02:56:39,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:56:40,761 INFO [train.py:1046] (3/4) Epoch 43, batch 2000, loss[loss=0.1397, simple_loss=0.216, pruned_loss=0.03169, over 23655.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.2377, pruned_loss=0.0387, over 4706167.74 frames. ], batch size: 149, lr: 2.37e-03, grad_scale: 16.0 2023-10-04 02:56:42,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:56:43,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 02:56:43,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:56:46,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 02:56:47,036 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.38 vs. limit=15.0 2023-10-04 02:56:47,902 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 02:56:49,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:56:52,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 02:56:52,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 02:56:52,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1500733.3333333333, ans=0.0 2023-10-04 02:56:57,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:56:58,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 02:57:00,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 02:57:00,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 02:57:04,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:57:05,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 02:57:06,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:07,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:07,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:09,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 02:57:09,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 02:57:10,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 02:57:10,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:57:13,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:57:13,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 02:57:13,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:14,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:57:16,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:57:17,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 02:57:20,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 02:57:20,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:57:20,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:25,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:57:26,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:57:26,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:57:28,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:57:30,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:57:30,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:57:30,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:57:30,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:57:32,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:35,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:57:35,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 02:57:39,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 02:57:40,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:41,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1501000.0, ans=0.125 2023-10-04 02:57:43,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:43,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 02:57:46,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:48,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:57:48,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:49,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 02:57:50,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 02:57:53,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:57:54,720 INFO [train.py:1046] (3/4) Epoch 43, batch 2050, loss[loss=0.1534, simple_loss=0.2113, pruned_loss=0.0478, over 19436.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2368, pruned_loss=0.03843, over 4701779.09 frames. ], batch size: 389, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:57:54,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:57:58,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:57:58,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:58:03,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:58:04,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 02:58:05,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:58:07,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:58:10,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 02:58:10,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:58:11,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:58:13,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 02:58:16,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1501133.3333333333, ans=0.125 2023-10-04 02:58:22,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:58:22,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:58:25,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 02:58:26,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:58:26,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 02:58:26,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 02:58:26,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1501200.0, ans=0.1 2023-10-04 02:58:31,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:58:33,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:58:33,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 02:58:34,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 02:58:35,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 02:58:35,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 02:58:35,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 02:58:38,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:58:41,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 02:58:43,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 02:58:44,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:58:48,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:58:55,253 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.035e+02 2.186e+02 2.516e+02 3.792e+02, threshold=4.373e+02, percent-clipped=0.0 2023-10-04 02:58:55,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 02:58:56,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 02:59:01,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:59:02,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 02:59:05,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 02:59:06,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 02:59:08,168 INFO [train.py:1046] (3/4) Epoch 43, batch 2100, loss[loss=0.152, simple_loss=0.2351, pruned_loss=0.03444, over 24659.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2345, pruned_loss=0.03794, over 4703429.14 frames. ], batch size: 65, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 02:59:09,612 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 02:59:09,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:09,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:59:10,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:59:12,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 02:59:12,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 02:59:12,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 02:59:14,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 02:59:18,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 02:59:18,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 02:59:21,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:23,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 02:59:23,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 02:59:23,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 02:59:24,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 02:59:25,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 02:59:26,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:59:26,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 02:59:26,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 02:59:26,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 02:59:32,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 02:59:32,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 02:59:35,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 02:59:35,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 02:59:39,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 02:59:39,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 02:59:39,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:59:39,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 02:59:42,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 02:59:42,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:42,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 02:59:42,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 02:59:44,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 02:59:45,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 02:59:46,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 02:59:50,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:59:50,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1501533.3333333333, ans=0.125 2023-10-04 02:59:51,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 02:59:52,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:59:56,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:59:56,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 02:59:56,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 02:59:56,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 02:59:56,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 02:59:56,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1501600.0, ans=0.1 2023-10-04 02:59:57,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 02:59:57,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 02:59:59,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 03:00:03,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:00:06,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:00:06,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 03:00:10,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:00:13,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:00:13,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:00:13,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:00:15,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 03:00:16,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:00:16,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:00:17,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:00:17,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:00:17,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:21,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 03:00:22,573 INFO [train.py:1046] (3/4) Epoch 43, batch 2150, loss[loss=0.1552, simple_loss=0.2315, pruned_loss=0.03943, over 23662.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.233, pruned_loss=0.03721, over 4696219.54 frames. ], batch size: 232, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 03:00:23,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 03:00:23,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:00:27,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:00:27,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:00:27,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:00:27,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:00:32,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 03:00:34,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:00:34,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:36,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:00:36,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:37,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:00:40,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:40,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1501800.0, ans=0.125 2023-10-04 03:00:41,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:00:41,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:00:44,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:44,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 03:00:48,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:00:49,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:00:51,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:52,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:00:52,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:00:52,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:00:53,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:00:53,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:00:53,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:00:54,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1501866.6666666667, ans=0.125 2023-10-04 03:00:55,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 03:00:57,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:00:58,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:00:58,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:00:59,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:01:01,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:01:04,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:01:04,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:01:07,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:01:07,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 03:01:07,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:01:10,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:01:10,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:11,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:01:12,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:01:14,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:15,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:15,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 03:01:16,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 03:01:18,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:01:18,586 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 03:01:18,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:18,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:01:19,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 03:01:19,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:01:19,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 03:01:20,009 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 03:01:20,009 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 03:01:20,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 03:01:22,556 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.924e+02 2.146e+02 2.514e+02 4.521e+02, threshold=4.293e+02, percent-clipped=1.0 2023-10-04 03:01:22,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:23,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:01:23,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:01:24,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:25,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:01:26,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:27,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:34,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:01:34,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 03:01:35,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1502066.6666666667, ans=0.0 2023-10-04 03:01:36,194 INFO [train.py:1046] (3/4) Epoch 43, batch 2200, loss[loss=0.1696, simple_loss=0.2555, pruned_loss=0.04181, over 24387.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2339, pruned_loss=0.03705, over 4714556.69 frames. ], batch size: 77, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 03:01:38,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1502066.6666666667, ans=0.0 2023-10-04 03:01:39,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:01:42,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:01:43,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:01:43,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:01:43,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:01:45,534 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.58 vs. limit=22.5 2023-10-04 03:01:46,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:01:47,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:01:47,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 03:01:50,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 03:01:54,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:02:01,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 03:02:03,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:05,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:02:05,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:02:08,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:02:09,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 03:02:12,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:02:13,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:15,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 03:02:15,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1502200.0, ans=0.1 2023-10-04 03:02:16,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1502200.0, ans=0.125 2023-10-04 03:02:19,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:02:21,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:02:21,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:02:24,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:02:25,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 03:02:26,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1502266.6666666667, ans=0.125 2023-10-04 03:02:27,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:02:27,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 03:02:30,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:02:30,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:02:30,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:02:32,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:02:32,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:02:32,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:02:32,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:02:35,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 03:02:35,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:02:36,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1502333.3333333333, ans=0.125 2023-10-04 03:02:38,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:02:40,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 03:02:40,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:02:43,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:02:43,629 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 03:02:46,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:02:47,753 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 03:02:49,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:02:49,112 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 03:02:50,435 INFO [train.py:1046] (3/4) Epoch 43, batch 2250, loss[loss=0.1655, simple_loss=0.2563, pruned_loss=0.03734, over 24315.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2348, pruned_loss=0.03712, over 4706694.10 frames. ], batch size: 77, lr: 2.37e-03, grad_scale: 8.0 2023-10-04 03:02:50,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:51,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:02:53,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:02:54,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1502400.0, ans=0.0 2023-10-04 03:02:55,228 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 03:02:55,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1502400.0, ans=0.0 2023-10-04 03:02:55,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1502400.0, ans=0.125 2023-10-04 03:02:58,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:02:59,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:03:04,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:03:06,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:03:08,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:03:08,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:03:10,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:03:12,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 03:03:12,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:03:13,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1502466.6666666667, ans=0.125 2023-10-04 03:03:14,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:03:14,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1502466.6666666667, ans=0.125 2023-10-04 03:03:15,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 03:03:17,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:03:17,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:03:18,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:03:22,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:03:23,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:03:23,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 03:03:25,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 03:03:27,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:03:30,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:03:31,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1502533.3333333333, ans=0.0 2023-10-04 03:03:35,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:03:36,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:03:37,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:03:37,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:03:39,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:03:39,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1502600.0, ans=0.07 2023-10-04 03:03:41,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:03:46,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:03:47,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:03:49,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1502666.6666666667, ans=0.0 2023-10-04 03:03:50,212 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 2.039e+02 2.231e+02 2.498e+02 4.606e+02, threshold=4.463e+02, percent-clipped=1.0 2023-10-04 03:03:53,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:03:53,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:03:54,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:03:57,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:04:01,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:04:01,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 03:04:01,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:01,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:04:03,914 INFO [train.py:1046] (3/4) Epoch 43, batch 2300, loss[loss=0.1743, simple_loss=0.2538, pruned_loss=0.04734, over 23705.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2359, pruned_loss=0.03793, over 4707264.61 frames. ], batch size: 85, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:04:04,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 03:04:06,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.67 vs. limit=22.5 2023-10-04 03:04:08,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:04:08,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:14,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:14,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:04:15,761 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 03:04:17,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:04:22,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:04:22,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:04:24,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:04:24,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:04:24,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 03:04:25,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:04:27,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:04:28,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:04:32,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:04:34,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:04:38,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:04:43,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:04:43,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:04:45,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:04:48,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:04:52,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:04:52,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:04:52,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:04:52,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 03:04:52,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1502933.3333333333, ans=0.1 2023-10-04 03:04:57,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:04:57,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:04:57,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:04:57,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:04:59,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:05:00,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 03:05:00,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:05:00,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 03:05:00,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:05:00,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:05:01,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 03:05:09,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:05:09,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1503000.0, ans=0.125 2023-10-04 03:05:12,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:05:15,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:05:15,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:05:15,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:05:17,891 INFO [train.py:1046] (3/4) Epoch 43, batch 2350, loss[loss=0.1553, simple_loss=0.249, pruned_loss=0.03081, over 24424.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03808, over 4716156.12 frames. ], batch size: 69, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:05:17,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:05:18,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:05:18,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:05:18,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 03:05:19,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1503066.6666666667, ans=0.125 2023-10-04 03:05:24,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:05:25,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 03:05:30,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 03:05:33,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:05:36,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:05:36,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:05:36,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:05:36,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:05:38,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 03:05:40,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:05:45,134 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:05:46,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 03:05:46,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:05:49,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:05:49,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:05:51,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:05:53,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 03:05:53,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:05:55,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:05:55,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:05:56,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:05:59,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:06:01,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 03:06:02,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:06:04,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:06:04,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:06:06,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 03:06:07,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:06:08,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1503266.6666666667, ans=0.125 2023-10-04 03:06:08,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.49 vs. limit=15.0 2023-10-04 03:06:10,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 03:06:10,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:06:14,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 03:06:19,065 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 1.962e+02 2.138e+02 2.370e+02 3.005e+02, threshold=4.276e+02, percent-clipped=0.0 2023-10-04 03:06:19,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 03:06:19,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:06:19,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:06:19,220 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 03:06:19,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 03:06:23,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 03:06:25,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:06:31,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:06:32,276 INFO [train.py:1046] (3/4) Epoch 43, batch 2400, loss[loss=0.1531, simple_loss=0.2349, pruned_loss=0.03561, over 23344.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2363, pruned_loss=0.03788, over 4719525.46 frames. ], batch size: 106, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:06:33,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:06:37,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:06:37,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 03:06:38,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 03:06:45,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:06:45,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:06:48,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 03:06:48,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:06:48,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:06:48,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 03:06:54,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:06:55,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 03:07:00,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:07:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 03:07:08,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:07:09,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:07:13,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:07:15,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 03:07:15,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:07:24,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:07:26,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:07:29,056 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.31 vs. limit=15.0 2023-10-04 03:07:29,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:07:29,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:07:29,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:07:31,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:07:31,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:07:31,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:07:31,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:07:36,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:07:38,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:07:38,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 03:07:38,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 03:07:41,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:07:41,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:07:41,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 03:07:41,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 03:07:43,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 03:07:43,209 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 03:07:44,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 03:07:44,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:07:44,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1503733.3333333333, ans=0.2 2023-10-04 03:07:45,913 INFO [train.py:1046] (3/4) Epoch 43, batch 2450, loss[loss=0.1389, simple_loss=0.206, pruned_loss=0.03584, over 23437.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2347, pruned_loss=0.03742, over 4728944.43 frames. ], batch size: 285, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:07:46,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:07:46,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:07:46,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1503733.3333333333, ans=0.1 2023-10-04 03:07:47,272 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 03:07:47,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:07:47,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.31 vs. limit=15.0 2023-10-04 03:07:48,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:07:52,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:07:52,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:07:55,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:07:55,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:07:56,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.87 vs. limit=15.0 2023-10-04 03:07:57,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 03:08:03,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:08:03,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:08:06,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:08:06,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:08:06,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:08:07,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 03:08:11,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:08:14,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:08:15,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:08:18,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:08:18,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:08:19,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:08:19,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:08:21,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 03:08:21,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:08:28,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:08:29,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:08:29,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:08:30,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:08:31,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:08:33,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:08:33,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 03:08:37,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:08:37,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:08:37,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.37 vs. limit=15.0 2023-10-04 03:08:40,110 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.28 vs. limit=15.0 2023-10-04 03:08:42,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:08:42,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:08:47,283 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.943e+02 2.149e+02 2.494e+02 4.938e+02, threshold=4.298e+02, percent-clipped=1.0 2023-10-04 03:08:48,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:08:48,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 03:08:48,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:08:50,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:08:50,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 03:08:51,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:08:51,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:08:54,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:08:56,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1504000.0, ans=0.1 2023-10-04 03:08:57,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:08:57,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:09:00,293 INFO [train.py:1046] (3/4) Epoch 43, batch 2500, loss[loss=0.1351, simple_loss=0.2178, pruned_loss=0.02619, over 24431.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2343, pruned_loss=0.03692, over 4732643.19 frames. ], batch size: 58, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:09:01,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 03:09:01,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:09:06,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:09:08,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1504066.6666666667, ans=0.125 2023-10-04 03:09:08,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1504066.6666666667, ans=0.125 2023-10-04 03:09:15,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:09:16,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:09:17,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:09:17,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 03:09:22,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:09:23,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:09:23,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 03:09:23,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:09:25,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 03:09:26,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:09:26,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:09:27,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 03:09:27,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:09:27,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 03:09:28,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:09:30,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1504200.0, ans=0.0 2023-10-04 03:09:31,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:09:32,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:09:34,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:09:36,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 03:09:37,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:09:38,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:09:42,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:09:46,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:09:48,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:09:55,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:09:57,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 03:09:58,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:09:58,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:09:59,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:09:59,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:10:00,839 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 03:10:00,840 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 03:10:00,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 03:10:02,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1504333.3333333333, ans=0.2 2023-10-04 03:10:03,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:10:05,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 03:10:05,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 03:10:06,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:10:06,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 03:10:08,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1504333.3333333333, ans=0.1 2023-10-04 03:10:11,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 03:10:14,241 INFO [train.py:1046] (3/4) Epoch 43, batch 2550, loss[loss=0.1547, simple_loss=0.2463, pruned_loss=0.03154, over 24653.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2349, pruned_loss=0.03694, over 4737225.15 frames. ], batch size: 68, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:10:14,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:10:15,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:10:15,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:10:17,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:10:18,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 03:10:18,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:10:21,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 03:10:24,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:10:26,466 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.50 vs. limit=10.0 2023-10-04 03:10:27,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:10:29,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:10:29,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 03:10:29,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:10:31,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:10:32,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:10:33,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:10:33,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 03:10:35,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:10:35,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:10:35,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 03:10:49,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:10:49,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=1504533.3333333333, ans=15.0 2023-10-04 03:10:54,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:10:54,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:10:54,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:10:56,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:10:59,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1504600.0, ans=0.0 2023-10-04 03:11:01,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:11:03,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:11:03,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:11:03,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:11:05,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 03:11:06,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:11:06,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1504600.0, ans=0.0 2023-10-04 03:11:09,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:11:09,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:11:13,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:11:13,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 03:11:14,420 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.895e+02 2.103e+02 2.314e+02 4.132e+02, threshold=4.207e+02, percent-clipped=0.0 2023-10-04 03:11:14,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:11:14,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:11:15,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:11:15,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:11:19,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:11:24,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:11:26,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:11:27,929 INFO [train.py:1046] (3/4) Epoch 43, batch 2600, loss[loss=0.1664, simple_loss=0.2535, pruned_loss=0.03965, over 23720.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2361, pruned_loss=0.03703, over 4743294.50 frames. ], batch size: 85, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:11:29,523 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 03:11:30,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.09 vs. limit=15.0 2023-10-04 03:11:30,929 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 03:11:30,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:11:30,985 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 03:11:31,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 03:11:32,280 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 03:11:33,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:11:35,159 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 03:11:36,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 03:11:37,854 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 03:11:41,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:11:42,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 03:11:42,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 03:11:44,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:11:45,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 03:11:46,977 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 03:11:48,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 03:11:49,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1504800.0, ans=0.125 2023-10-04 03:11:56,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:11:56,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:11:56,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:11:56,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 03:11:59,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:12:03,483 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 03:12:07,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:12:09,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:10,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 03:12:11,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:12:11,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:12:12,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 03:12:15,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:12:16,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:12:19,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:12:22,676 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 03:12:22,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:12:22,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:12:24,972 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.68 vs. limit=10.0 2023-10-04 03:12:30,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:12:30,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:12:30,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 03:12:31,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:12:32,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:12:34,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:12:39,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 03:12:40,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1505066.6666666667, ans=0.125 2023-10-04 03:12:41,239 INFO [train.py:1046] (3/4) Epoch 43, batch 2650, loss[loss=0.1602, simple_loss=0.2376, pruned_loss=0.04138, over 23873.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2365, pruned_loss=0.03746, over 4738854.59 frames. ], batch size: 195, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:12:41,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:42,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:12:43,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1505066.6666666667, ans=0.125 2023-10-04 03:12:46,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 03:12:47,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:49,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:12:49,181 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 03:12:49,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:12:50,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:12:55,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:12:56,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:12:57,638 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.43 vs. limit=5.0 2023-10-04 03:12:58,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:12:58,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 03:12:58,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1505133.3333333333, ans=0.1 2023-10-04 03:12:59,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:12:59,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:12:59,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1505133.3333333333, ans=0.0 2023-10-04 03:13:01,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 03:13:04,249 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 03:13:05,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:13:08,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 03:13:08,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:09,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 03:13:13,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:13:13,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:13:13,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:13:14,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:18,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 03:13:18,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 03:13:22,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:13:24,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 03:13:24,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:13:26,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:26,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:13:26,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:13:28,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:13:28,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:13:31,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:13:32,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:13:32,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:13:34,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:13:37,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:37,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:13:38,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:38,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:13:38,760 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:13:39,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:13:43,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:43,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1505333.3333333333, ans=0.0 2023-10-04 03:13:44,751 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.943e+02 2.072e+02 2.278e+02 3.072e+02, threshold=4.144e+02, percent-clipped=0.0 2023-10-04 03:13:44,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:13:44,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:13:46,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 03:13:47,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1505333.3333333333, ans=0.1 2023-10-04 03:13:50,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:13:52,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:53,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:13:54,568 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.95 vs. limit=15.0 2023-10-04 03:13:54,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:13:56,286 INFO [train.py:1046] (3/4) Epoch 43, batch 2700, loss[loss=0.1458, simple_loss=0.2229, pruned_loss=0.0343, over 24439.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.237, pruned_loss=0.03793, over 4740552.42 frames. ], batch size: 58, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:13:56,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:13:56,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:13:58,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:13:58,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 03:14:01,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:14:03,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 03:14:05,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:14:05,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:05,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:07,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:14:07,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:14:08,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:14:08,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 03:14:08,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 03:14:09,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:14:11,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:14:12,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:14:12,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:14:16,309 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.44 vs. limit=22.5 2023-10-04 03:14:16,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:14:18,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 03:14:18,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:14:24,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:14:24,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:14:29,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:14:29,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:14:29,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:14:29,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:14:31,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1505533.3333333333, ans=0.0 2023-10-04 03:14:33,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:14:35,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:14:35,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:14:35,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:14:39,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:39,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:14:48,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:14:48,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:14:51,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:14:51,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:14:55,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:14:56,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:14:57,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:14:57,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=1505666.6666666667, ans=0.5 2023-10-04 03:14:58,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:00,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:15:00,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:15:01,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:15:03,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:15:03,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:15:06,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 03:15:06,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:15:09,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:15:09,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 03:15:10,688 INFO [train.py:1046] (3/4) Epoch 43, batch 2750, loss[loss=0.1401, simple_loss=0.2286, pruned_loss=0.02579, over 24464.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2371, pruned_loss=0.038, over 4735789.67 frames. ], batch size: 63, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:15:10,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 03:15:10,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:15:14,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:14,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:15:18,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:18,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:15:18,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:21,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:15:23,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:15:23,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:15:23,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:23,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 03:15:24,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:15:24,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:15:29,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 03:15:30,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:15:30,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:31,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:15:31,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:15:33,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:15:33,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:15:34,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:35,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:40,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:15:40,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:15:40,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:15:41,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:43,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:15:50,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:15:52,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:15:52,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:15:53,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1505933.3333333333, ans=0.125 2023-10-04 03:15:56,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1505933.3333333333, ans=0.125 2023-10-04 03:15:57,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:15:57,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:15:57,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:16:02,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:16:02,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:16:02,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 03:16:07,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:16:08,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 03:16:12,650 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.953e+02 2.174e+02 2.380e+02 3.470e+02, threshold=4.348e+02, percent-clipped=0.0 2023-10-04 03:16:14,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 03:16:17,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:16:17,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 03:16:17,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:16:18,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:16:20,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 03:16:21,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:16:22,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1506000.0, ans=0.125 2023-10-04 03:16:24,596 INFO [train.py:1046] (3/4) Epoch 43, batch 2800, loss[loss=0.144, simple_loss=0.2266, pruned_loss=0.03069, over 24629.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2354, pruned_loss=0.03766, over 4731748.93 frames. ], batch size: 60, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:16:24,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 03:16:24,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:16:25,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:16:26,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 03:16:26,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:16:26,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:16:30,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:16:30,117 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 03:16:30,118 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 03:16:31,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1506066.6666666667, ans=0.125 2023-10-04 03:16:32,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:16:34,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:16:34,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:16:37,581 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.13 vs. limit=15.0 2023-10-04 03:16:39,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:16:40,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1506133.3333333333, ans=0.125 2023-10-04 03:16:42,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 03:16:42,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 03:16:43,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 03:16:44,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1506133.3333333333, ans=0.125 2023-10-04 03:16:45,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:16:46,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:16:46,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:16:49,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:16:49,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:16:49,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:16:51,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:16:52,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1506133.3333333333, ans=0.0 2023-10-04 03:16:56,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1506200.0, ans=0.0 2023-10-04 03:16:58,085 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.53 vs. limit=6.0 2023-10-04 03:16:58,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:17:00,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:17:01,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:03,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:17:03,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:17:03,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1506200.0, ans=0.125 2023-10-04 03:17:10,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:17:10,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 03:17:10,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:17:12,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:17:12,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:17:15,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:17:15,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:18,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:17:21,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:17:21,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:21,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:17:21,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:17:22,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1506266.6666666667, ans=0.125 2023-10-04 03:17:23,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:17:23,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:17:23,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 03:17:24,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:17:24,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:17:24,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:17:25,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 03:17:25,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:17:27,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:17:27,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:17:27,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 03:17:28,333 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.38 vs. limit=6.0 2023-10-04 03:17:32,311 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.42 vs. limit=15.0 2023-10-04 03:17:34,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:17:34,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:17:36,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:17:36,465 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:17:37,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:17:38,859 INFO [train.py:1046] (3/4) Epoch 43, batch 2850, loss[loss=0.1583, simple_loss=0.2323, pruned_loss=0.0421, over 23358.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2352, pruned_loss=0.03785, over 4736894.57 frames. ], batch size: 119, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:17:40,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:17:40,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:17:40,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:17:40,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1506400.0, ans=0.2 2023-10-04 03:17:43,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:17:44,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:17:44,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:17:46,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 03:17:52,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 03:17:52,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:17:54,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 03:17:55,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:17:56,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 03:17:58,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 03:17:59,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:07,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.94 vs. limit=15.0 2023-10-04 03:18:07,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1506533.3333333333, ans=0.125 2023-10-04 03:18:14,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:18:14,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:18:14,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:18:15,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.35 vs. limit=15.0 2023-10-04 03:18:16,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:18:16,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:18:17,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:18:17,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1506533.3333333333, ans=0.0 2023-10-04 03:18:19,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:18:19,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 03:18:20,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:18:22,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:18:22,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:18:22,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:25,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:18:25,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:18:26,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:18:28,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:18:28,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:18:29,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:30,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:18:32,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1506600.0, ans=0.0 2023-10-04 03:18:33,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:18:37,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:18:38,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 03:18:38,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 03:18:41,190 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.979e+02 2.245e+02 2.523e+02 4.092e+02, threshold=4.490e+02, percent-clipped=0.0 2023-10-04 03:18:41,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:18:42,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:18:42,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 03:18:42,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:18:44,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:18:44,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:18:44,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:18:44,156 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 03:18:44,204 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 03:18:44,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:18:46,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:18:52,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:18:52,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:18:52,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:18:53,436 INFO [train.py:1046] (3/4) Epoch 43, batch 2900, loss[loss=0.1618, simple_loss=0.247, pruned_loss=0.03828, over 23296.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2349, pruned_loss=0.03767, over 4724904.50 frames. ], batch size: 93, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:18:53,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 03:18:56,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:18:57,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 03:18:57,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 03:18:59,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:18:59,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:19:00,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:19:02,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:19:02,239 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:19:04,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:19:06,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:19:10,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:19:10,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 03:19:11,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:19:12,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:19:14,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 03:19:15,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 03:19:18,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:19:18,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 03:19:18,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:19:22,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:19:22,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 03:19:26,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:19:27,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:19:29,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1506866.6666666667, ans=0.05 2023-10-04 03:19:30,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:19:33,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:19:34,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 03:19:34,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 03:19:34,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:19:39,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:19:42,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 03:19:43,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:19:48,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:19:57,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:19:57,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:19:59,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 03:20:02,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:02,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 03:20:04,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:20:04,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:20:05,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1507066.6666666667, ans=0.0 2023-10-04 03:20:06,827 INFO [train.py:1046] (3/4) Epoch 43, batch 2950, loss[loss=0.1584, simple_loss=0.2462, pruned_loss=0.03534, over 23722.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2356, pruned_loss=0.03751, over 4723544.01 frames. ], batch size: 85, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:20:08,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:20:10,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 03:20:11,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:20:11,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:13,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:20:14,287 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.63 vs. limit=15.0 2023-10-04 03:20:15,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:20:16,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 03:20:17,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 03:20:19,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:20:19,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:20:23,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:20:24,738 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.47 vs. limit=15.0 2023-10-04 03:20:25,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:20:28,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:20:28,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:20:28,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1507133.3333333333, ans=0.1 2023-10-04 03:20:31,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:20:31,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:20:32,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:33,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:20:33,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:20:36,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 03:20:38,811 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.24 vs. limit=15.0 2023-10-04 03:20:42,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 03:20:42,667 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 03:20:45,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:20:45,481 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 03:20:47,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 03:20:47,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:20:47,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:20:47,321 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 03:20:47,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:20:50,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 03:20:51,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:20:51,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:20:54,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:20:56,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:20:56,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:20:56,214 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 03:20:57,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:20:57,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 03:21:02,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:21:03,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:21:03,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 03:21:05,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:21:06,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 03:21:09,080 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.953e+02 2.204e+02 2.575e+02 5.460e+02, threshold=4.408e+02, percent-clipped=1.0 2023-10-04 03:21:09,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:21:12,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:21:12,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:21:13,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:21:13,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 03:21:15,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:21:16,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:21:16,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:21:16,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:21:16,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:21:18,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:21:18,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:21:20,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 03:21:20,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:21:20,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1507400.0, ans=0.125 2023-10-04 03:21:21,617 INFO [train.py:1046] (3/4) Epoch 43, batch 3000, loss[loss=0.1475, simple_loss=0.2427, pruned_loss=0.02613, over 24313.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2364, pruned_loss=0.03747, over 4735439.43 frames. ], batch size: 74, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:21:21,618 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 03:21:29,104 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([3.5759, 3.2716, 2.8956, 2.9027], device='cuda:3') 2023-10-04 03:21:30,914 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.8220, 1.7945, 3.2669, 3.1342], device='cuda:3') 2023-10-04 03:21:33,113 INFO [train.py:1078] (3/4) Epoch 43, validation: loss=0.3299, simple_loss=0.2679, pruned_loss=0.196, over 1125622.00 frames. 2023-10-04 03:21:33,114 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-04 03:21:34,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:21:34,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:21:34,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1507400.0, ans=0.125 2023-10-04 03:21:38,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.95 vs. limit=10.0 2023-10-04 03:21:39,067 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 03:21:39,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 03:21:40,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:21:41,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:21:42,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 03:21:42,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:21:48,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1507466.6666666667, ans=0.125 2023-10-04 03:21:49,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:21:58,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:22:03,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 03:22:03,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:22:06,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:22:08,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:22:08,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:22:09,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:22:09,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 03:22:11,397 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.96 vs. limit=15.0 2023-10-04 03:22:13,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 03:22:14,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:22:14,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:22:19,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:22:19,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:22:20,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:20,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:22:21,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1507600.0, ans=0.125 2023-10-04 03:22:24,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:22:24,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:22:24,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:22:26,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:22:30,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 03:22:30,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:22:30,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:22:31,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:22:34,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:35,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:35,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 03:22:35,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 03:22:35,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:22:35,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1507666.6666666667, ans=0.125 2023-10-04 03:22:35,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1507666.6666666667, ans=0.2 2023-10-04 03:22:37,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 03:22:38,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:22:39,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 03:22:40,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1507666.6666666667, ans=0.125 2023-10-04 03:22:43,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:22:44,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:22:45,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 03:22:45,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 03:22:45,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:22:46,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1507733.3333333333, ans=0.125 2023-10-04 03:22:47,141 INFO [train.py:1046] (3/4) Epoch 43, batch 3050, loss[loss=0.1479, simple_loss=0.2337, pruned_loss=0.03109, over 24636.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2372, pruned_loss=0.03787, over 4726378.87 frames. ], batch size: 65, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:22:47,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:22:47,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:22:47,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:22:48,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:22:48,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:22:51,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 03:22:53,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:22:55,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:22:55,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:22:56,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1507733.3333333333, ans=0.125 2023-10-04 03:22:58,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:01,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 03:23:05,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 03:23:06,257 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.77 vs. limit=12.0 2023-10-04 03:23:06,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 03:23:06,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:09,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:23:11,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:11,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:23:12,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:23:13,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1507800.0, ans=0.0 2023-10-04 03:23:13,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.88 vs. limit=12.0 2023-10-04 03:23:14,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1507800.0, ans=0.125 2023-10-04 03:23:14,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1507800.0, ans=0.125 2023-10-04 03:23:15,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:23:17,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:23:17,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:23:17,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:23:17,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:23:20,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:22,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:23,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1507866.6666666667, ans=0.125 2023-10-04 03:23:25,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:23:25,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 03:23:27,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:23:27,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:23:28,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:23:29,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:23:31,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:23:31,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:23:31,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1507933.3333333333, ans=0.0 2023-10-04 03:23:36,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:23:38,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:23:43,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:45,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:23:45,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:23:46,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1508000.0, ans=0.1 2023-10-04 03:23:47,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:23:47,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:23:47,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:23:48,521 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.989e+02 2.180e+02 2.446e+02 3.954e+02, threshold=4.360e+02, percent-clipped=0.0 2023-10-04 03:23:48,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 03:23:50,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:23:50,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:23:53,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 03:23:55,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:23:59,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:24:00,712 INFO [train.py:1046] (3/4) Epoch 43, batch 3100, loss[loss=0.1417, simple_loss=0.2279, pruned_loss=0.02776, over 24603.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2363, pruned_loss=0.03753, over 4733067.33 frames. ], batch size: 60, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:24:00,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:24:03,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:24:04,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 03:24:06,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 03:24:06,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 03:24:09,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:24:12,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1508066.6666666667, ans=0.2 2023-10-04 03:24:13,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:24:13,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:15,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 03:24:15,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1508133.3333333333, ans=0.0 2023-10-04 03:24:19,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:20,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1508133.3333333333, ans=0.125 2023-10-04 03:24:25,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 03:24:27,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:24:29,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:29,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:24:29,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:24:30,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 03:24:32,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:24:32,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 03:24:32,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:24:34,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:35,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 03:24:38,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:24:42,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:24:43,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 03:24:43,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 03:24:45,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:45,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:24:48,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:24:48,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:48,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:24:48,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=11.27 vs. limit=12.0 2023-10-04 03:24:49,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:24:49,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:24:52,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:24:52,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:24:53,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:24:53,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 03:24:57,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:25:00,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 03:25:01,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:25:01,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 03:25:03,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:03,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:25:03,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 03:25:13,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 03:25:14,897 INFO [train.py:1046] (3/4) Epoch 43, batch 3150, loss[loss=0.1378, simple_loss=0.2185, pruned_loss=0.0285, over 24609.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2346, pruned_loss=0.0368, over 4736172.89 frames. ], batch size: 60, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:25:14,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:16,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:25:17,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:25:17,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:25:17,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 03:25:19,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:21,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 03:25:22,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 03:25:26,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:27,580 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 03:25:30,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 03:25:30,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:25:31,707 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 03:25:33,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 03:25:35,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 03:25:37,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 03:25:37,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 03:25:37,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:37,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:25:37,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:25:39,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 03:25:40,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:40,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:25:42,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:25:43,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:25:46,830 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.06 vs. limit=15.0 2023-10-04 03:25:47,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 03:25:47,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:25:49,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:25:50,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:25:50,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 03:25:53,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 03:25:53,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:25:53,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:25:55,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:25:55,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:25:55,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:25:58,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:25:58,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:25:58,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 03:26:00,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:26:00,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:00,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:26:00,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:26:01,110 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.90 vs. limit=10.0 2023-10-04 03:26:01,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 03:26:03,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:26:04,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 03:26:04,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:05,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 03:26:06,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 03:26:09,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:26:09,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:26:09,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 03:26:10,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 03:26:11,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:26:15,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:26:17,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:17,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:26:19,128 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.065e+02 2.333e+02 2.653e+02 4.086e+02, threshold=4.666e+02, percent-clipped=0.0 2023-10-04 03:26:20,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:26:22,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:22,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1508666.6666666667, ans=0.0 2023-10-04 03:26:23,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 03:26:29,805 INFO [train.py:1046] (3/4) Epoch 43, batch 3200, loss[loss=0.1554, simple_loss=0.2412, pruned_loss=0.03478, over 23232.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2337, pruned_loss=0.03678, over 4726542.51 frames. ], batch size: 105, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:26:29,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:26:29,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 03:26:35,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:37,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:26:37,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 03:26:39,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:26:41,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:26:45,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:26:53,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1508800.0, ans=0.125 2023-10-04 03:26:54,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:27:02,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 03:27:04,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:27:06,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 03:27:08,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:27:10,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:27:10,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:27:12,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:27:15,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1508933.3333333333, ans=0.0 2023-10-04 03:27:15,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1508933.3333333333, ans=0.1 2023-10-04 03:27:16,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 03:27:17,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 03:27:19,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 03:27:22,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 03:27:23,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:27:23,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1508933.3333333333, ans=0.0 2023-10-04 03:27:29,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:27:29,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:27:31,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:27:31,212 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 03:27:31,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:27:34,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:27:35,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 03:27:37,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 03:27:37,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 03:27:38,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 03:27:40,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:27:42,747 INFO [train.py:1046] (3/4) Epoch 43, batch 3250, loss[loss=0.1807, simple_loss=0.2402, pruned_loss=0.06054, over 18783.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2343, pruned_loss=0.03728, over 4716673.61 frames. ], batch size: 388, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:27:44,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:27:44,175 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 03:27:44,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:27:44,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:27:45,652 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 03:27:48,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:27:50,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:27:53,780 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.17 vs. limit=22.5 2023-10-04 03:28:00,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:28:00,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 03:28:02,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:28:02,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:28:02,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:28:03,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1509133.3333333333, ans=0.0 2023-10-04 03:28:04,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:28:05,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:28:07,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:07,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:28:08,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:28:08,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:08,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:09,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:28:11,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:28:12,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:28:14,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:28:14,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:28:15,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:28:15,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:28:16,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:28:21,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 03:28:21,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:28:21,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:28:22,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:28:24,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:28:29,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1509266.6666666667, ans=0.0 2023-10-04 03:28:31,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:28:37,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:28:37,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:28:37,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 03:28:37,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:28:38,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 03:28:39,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:28:40,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 03:28:40,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 03:28:42,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:28:42,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1509333.3333333333, ans=0.1 2023-10-04 03:28:43,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:28:43,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:28:45,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 03:28:45,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:28:46,415 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 1.986e+02 2.142e+02 2.380e+02 3.154e+02, threshold=4.284e+02, percent-clipped=0.0 2023-10-04 03:28:49,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:28:49,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:28:50,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 03:28:50,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:28:52,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:28:52,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 03:28:56,936 INFO [train.py:1046] (3/4) Epoch 43, batch 3300, loss[loss=0.163, simple_loss=0.2365, pruned_loss=0.04481, over 23824.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2353, pruned_loss=0.03779, over 4706419.77 frames. ], batch size: 212, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:28:57,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:28:57,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 03:28:58,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 03:29:00,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 03:29:00,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:29:04,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:29:06,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:29:06,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:07,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:29:09,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:29:10,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:29:12,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:29:12,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1509466.6666666667, ans=0.1 2023-10-04 03:29:14,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 03:29:16,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:29:16,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:29:17,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:17,713 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 03:29:20,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:29:21,095 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.33 vs. limit=22.5 2023-10-04 03:29:21,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:29:21,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:29:21,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:29:23,475 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 03:29:23,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1509466.6666666667, ans=0.125 2023-10-04 03:29:27,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:29:27,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:29:30,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:30,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 03:29:32,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 03:29:33,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:33,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:29:36,491 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 03:29:37,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 03:29:39,524 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.69 vs. limit=10.0 2023-10-04 03:29:39,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:29:43,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 03:29:44,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:29:47,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:29:47,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:29:48,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:29:48,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:29:48,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:29:48,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:29:52,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:29:52,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:29:52,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:29:54,367 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 03:29:56,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 03:29:58,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:29:58,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:29:58,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:30:01,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:30:01,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:30:01,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:30:02,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:02,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:30:04,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:30:06,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:30:06,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1509666.6666666667, ans=0.1 2023-10-04 03:30:08,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 03:30:08,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:10,112 INFO [train.py:1046] (3/4) Epoch 43, batch 3350, loss[loss=0.1631, simple_loss=0.2505, pruned_loss=0.03785, over 24313.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.236, pruned_loss=0.03784, over 4723175.59 frames. ], batch size: 74, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:30:10,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:12,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:30:13,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:30:13,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:30:16,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:30:16,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:17,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:30:19,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:20,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:30:23,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:24,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:30:25,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:30:26,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:30:27,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 03:30:29,320 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 03:30:30,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:30:33,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 03:30:33,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 03:30:34,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:30:34,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:30:34,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:30:36,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 03:30:36,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:36,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:30:40,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:41,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:42,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:42,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:30:44,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1509866.6666666667, ans=0.125 2023-10-04 03:30:47,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:30:47,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:48,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:30:49,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1509866.6666666667, ans=0.2 2023-10-04 03:30:52,238 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.85 vs. limit=6.0 2023-10-04 03:30:53,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:30:53,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:30:54,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:30:54,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:30:56,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1509933.3333333333, ans=0.125 2023-10-04 03:30:57,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:31:00,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 03:31:00,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:31:00,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 03:31:02,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:31:03,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 03:31:03,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1509933.3333333333, ans=0.125 2023-10-04 03:31:04,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:31:07,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:31:12,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:31:14,108 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 2.008e+02 2.223e+02 2.622e+02 3.286e+02, threshold=4.446e+02, percent-clipped=0.0 2023-10-04 03:31:14,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 03:31:15,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:31:16,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:31:18,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:31:23,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:31:24,376 INFO [train.py:1046] (3/4) Epoch 43, batch 3400, loss[loss=0.154, simple_loss=0.2275, pruned_loss=0.0403, over 23511.00 frames. ], tot_loss[loss=0.1571, simple_loss=0.2371, pruned_loss=0.03855, over 4710614.98 frames. ], batch size: 134, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:31:25,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 03:31:25,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:31:25,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:31:27,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:31:27,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 03:31:29,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:31:29,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 03:31:30,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:31:31,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:31:31,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:31:31,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:31:32,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1510066.6666666667, ans=0.0 2023-10-04 03:31:33,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 03:31:36,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 03:31:36,233 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 03:31:37,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:31:42,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:31:42,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:31:42,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:31:42,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:31:47,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:31:47,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.53 vs. limit=10.0 2023-10-04 03:31:48,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 03:31:53,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:31:54,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:31:54,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:31:55,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 03:32:03,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:32:06,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 03:32:14,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:32:14,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:32:14,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 03:32:14,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1510266.6666666667, ans=0.125 2023-10-04 03:32:15,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:32:15,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:32:16,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:32:16,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:32:20,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:32:21,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:32:21,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:32:27,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:32:30,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 03:32:36,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:32:37,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 03:32:39,355 INFO [train.py:1046] (3/4) Epoch 43, batch 3450, loss[loss=0.1543, simple_loss=0.2259, pruned_loss=0.04133, over 23673.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2373, pruned_loss=0.03866, over 4695591.93 frames. ], batch size: 149, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:32:42,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 03:32:42,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:32:45,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:32:45,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 03:32:46,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:32:49,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:32:51,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1510400.0, ans=0.0 2023-10-04 03:32:53,577 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.78 vs. limit=15.0 2023-10-04 03:32:53,914 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.71 vs. limit=6.0 2023-10-04 03:32:55,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:32:55,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:32:57,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:32:57,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:33:00,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:33:02,593 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=15.0 2023-10-04 03:33:04,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 03:33:09,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 03:33:09,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:33:09,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:33:12,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:33:18,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 03:33:19,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:33:22,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:33:23,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:33:25,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:33:25,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:33:27,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 03:33:27,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:33:30,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:33:32,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:33:35,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 03:33:38,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:33:43,837 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.002e+02 2.243e+02 2.534e+02 3.921e+02, threshold=4.486e+02, percent-clipped=0.0 2023-10-04 03:33:44,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:33:45,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:33:48,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:33:50,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:33:50,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:33:52,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:33:52,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1510733.3333333333, ans=0.0 2023-10-04 03:33:53,702 INFO [train.py:1046] (3/4) Epoch 43, batch 3500, loss[loss=0.1371, simple_loss=0.1917, pruned_loss=0.04131, over 19419.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2358, pruned_loss=0.03804, over 4691556.81 frames. ], batch size: 388, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:33:53,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:33:54,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1510733.3333333333, ans=0.125 2023-10-04 03:33:57,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:33:59,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:34:00,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 03:34:01,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 03:34:04,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 03:34:08,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:34:08,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 03:34:12,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:34:12,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:34:13,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:34:13,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:34:14,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:34:14,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:14,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:34:16,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 03:34:19,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:19,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:34:20,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:34:23,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:23,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 03:34:25,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:34:26,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:34:28,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:34:29,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:31,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:34:32,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:34:35,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 03:34:35,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 03:34:37,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 03:34:37,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:34:40,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:40,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:34:40,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:34:44,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:34:44,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:34:45,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1510933.3333333333, ans=0.125 2023-10-04 03:34:48,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:34:49,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 03:34:49,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 03:34:49,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:34:54,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:34:54,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:34:55,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.63 vs. limit=15.0 2023-10-04 03:34:57,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:34:58,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 03:34:59,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:35:01,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:35:02,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 03:35:04,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 03:35:06,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:35:07,259 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.55 vs. limit=15.0 2023-10-04 03:35:07,902 INFO [train.py:1046] (3/4) Epoch 43, batch 3550, loss[loss=0.1472, simple_loss=0.2335, pruned_loss=0.03049, over 24681.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2347, pruned_loss=0.03754, over 4700370.98 frames. ], batch size: 65, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:35:08,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:35:08,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:08,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:12,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:35:19,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:20,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 03:35:23,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:35:23,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:35:25,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:26,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:35:27,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:35:30,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:35:32,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:35:32,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:32,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:35:33,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:35:35,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1511200.0, ans=0.125 2023-10-04 03:35:39,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:35:39,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:35:41,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:35:41,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:35:41,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:35:41,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 03:35:41,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:43,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:35:44,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 03:35:48,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:35:50,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:35:50,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:35:51,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1511266.6666666667, ans=0.2 2023-10-04 03:35:52,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 03:35:52,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:35:54,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 03:35:54,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:35:57,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:35:58,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:36:00,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1511266.6666666667, ans=0.0 2023-10-04 03:36:01,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 03:36:01,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:36:08,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:36:08,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1511333.3333333333, ans=0.125 2023-10-04 03:36:09,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 03:36:09,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:36:14,574 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.748e+02 1.996e+02 2.248e+02 2.673e+02 4.535e+02, threshold=4.496e+02, percent-clipped=1.0 2023-10-04 03:36:14,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:36:14,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 03:36:20,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 03:36:21,463 INFO [train.py:1046] (3/4) Epoch 43, batch 3600, loss[loss=0.1625, simple_loss=0.238, pruned_loss=0.04353, over 22815.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2341, pruned_loss=0.03731, over 4697819.68 frames. ], batch size: 322, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:36:21,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:36:22,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:36:23,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1511400.0, ans=0.2 2023-10-04 03:36:24,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:36:24,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:36:24,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:36:24,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1511400.0, ans=0.1 2023-10-04 03:36:29,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:36:30,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:32,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:36:33,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:36:33,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:33,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 03:36:37,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1511466.6666666667, ans=0.09899494936611666 2023-10-04 03:36:38,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:36:39,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:42,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:36:44,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:36:46,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:36:46,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:36:47,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 03:36:47,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:36:51,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:36:51,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:36:53,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1511533.3333333333, ans=0.125 2023-10-04 03:36:54,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:36:55,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:36:57,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:36:57,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1511533.3333333333, ans=0.125 2023-10-04 03:36:58,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 03:37:05,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:37:05,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:37:06,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1511600.0, ans=0.0 2023-10-04 03:37:07,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 03:37:10,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:37:15,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1511600.0, ans=0.0 2023-10-04 03:37:17,825 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.26 vs. limit=22.5 2023-10-04 03:37:18,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:37:21,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:37:27,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:37:27,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:37:27,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 03:37:28,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 03:37:28,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 03:37:30,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:37:31,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:37:32,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 03:37:32,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:37:32,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:37:32,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:37:34,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 03:37:36,001 INFO [train.py:1046] (3/4) Epoch 43, batch 3650, loss[loss=0.1407, simple_loss=0.2245, pruned_loss=0.02848, over 24454.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2349, pruned_loss=0.03719, over 4700593.93 frames. ], batch size: 63, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:37:36,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 03:37:39,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:37:40,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 03:37:44,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 03:37:47,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:37:48,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 03:37:51,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 03:37:54,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:37:54,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:37:54,869 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.40 vs. limit=22.5 2023-10-04 03:37:55,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:37:58,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 03:37:58,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:37:59,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 03:37:59,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:37:59,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:38:01,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 03:38:02,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:38:02,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:38:02,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:38:04,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:38:08,248 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.61 vs. limit=15.0 2023-10-04 03:38:08,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 03:38:10,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 03:38:10,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:38:13,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 03:38:14,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:38:14,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:38:19,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:38:21,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:38:21,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:38:23,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:38:25,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:38:27,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:38:28,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1511933.3333333333, ans=0.1 2023-10-04 03:38:30,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:38:30,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:38:30,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:38:32,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:38:33,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:38:35,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:38:41,209 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 03:38:42,407 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.039e+02 2.236e+02 2.475e+02 4.169e+02, threshold=4.472e+02, percent-clipped=0.0 2023-10-04 03:38:44,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:38:44,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:38:45,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:38:47,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:38:47,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:38:49,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:38:49,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1512066.6666666667, ans=0.125 2023-10-04 03:38:50,650 INFO [train.py:1046] (3/4) Epoch 43, batch 3700, loss[loss=0.1406, simple_loss=0.2213, pruned_loss=0.02997, over 24406.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2359, pruned_loss=0.03751, over 4692938.44 frames. ], batch size: 58, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:38:52,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 03:38:52,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:38:53,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:38:54,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:38:55,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:38:56,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1512066.6666666667, ans=0.0 2023-10-04 03:38:57,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:38:57,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 03:38:57,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:38:59,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:38:59,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:39:01,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:39:04,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:39:05,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:39:07,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:39:08,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:39:08,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:39:11,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:39:13,335 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 03:39:21,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:39:21,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:39:23,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:39:23,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 03:39:23,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:39:26,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1512200.0, ans=0.125 2023-10-04 03:39:27,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:39:27,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 03:39:28,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:39:31,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:39:32,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:39:32,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:39:35,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 03:39:36,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1512266.6666666667, ans=0.0 2023-10-04 03:39:41,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:39:41,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 03:39:41,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:39:42,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 03:39:42,720 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:39:46,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:39:46,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:39:50,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:39:50,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 03:39:54,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:39:54,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:39:54,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:39:54,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:39:56,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:39:56,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 03:39:59,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 03:39:59,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:39:59,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:40:01,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:40:02,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:40:03,720 INFO [train.py:1046] (3/4) Epoch 43, batch 3750, loss[loss=0.1395, simple_loss=0.225, pruned_loss=0.02701, over 24306.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2366, pruned_loss=0.0376, over 4710992.92 frames. ], batch size: 61, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:40:03,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:40:05,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:40:05,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1512400.0, ans=0.0 2023-10-04 03:40:06,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:40:08,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 03:40:09,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 03:40:12,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:40:12,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 03:40:14,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:40:16,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:40:16,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1512400.0, ans=0.2 2023-10-04 03:40:17,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:40:17,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:40:22,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:40:25,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1512466.6666666667, ans=0.125 2023-10-04 03:40:26,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 03:40:26,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:40:26,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1512466.6666666667, ans=0.2 2023-10-04 03:40:27,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:40:30,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:40:30,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 03:40:32,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:40:33,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:40:33,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:40:35,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1512533.3333333333, ans=0.125 2023-10-04 03:40:36,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 03:40:37,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1512533.3333333333, ans=0.0 2023-10-04 03:40:39,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1512533.3333333333, ans=0.125 2023-10-04 03:40:40,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 03:40:41,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1512533.3333333333, ans=0.125 2023-10-04 03:40:42,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:40:43,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:40:45,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:40:45,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.04 vs. limit=15.0 2023-10-04 03:40:48,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:40:50,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 03:40:54,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 03:40:57,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:40:59,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:40:59,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:41:03,329 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.93 vs. limit=15.0 2023-10-04 03:41:04,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:41:08,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 03:41:08,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1512666.6666666667, ans=0.2 2023-10-04 03:41:10,053 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.075e+02 2.386e+02 2.913e+02 4.377e+02, threshold=4.772e+02, percent-clipped=0.0 2023-10-04 03:41:10,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:41:11,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:41:12,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:41:14,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 03:41:16,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1512733.3333333333, ans=0.0 2023-10-04 03:41:17,732 INFO [train.py:1046] (3/4) Epoch 43, batch 3800, loss[loss=0.1642, simple_loss=0.2461, pruned_loss=0.04114, over 24071.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2362, pruned_loss=0.03788, over 4705200.45 frames. ], batch size: 80, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:41:24,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:41:25,722 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:41:26,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:41:28,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 03:41:28,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 03:41:29,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:41:32,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:41:32,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:41:33,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 03:41:33,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:41:35,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:41:36,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:41:36,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:41:36,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:41:39,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 03:41:43,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 03:41:43,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:41:44,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1512800.0, ans=0.125 2023-10-04 03:41:47,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:41:47,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:41:48,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:41:48,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:41:48,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:41:52,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:41:54,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:41:58,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:41:58,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 03:41:59,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:42:02,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1512933.3333333333, ans=0.05 2023-10-04 03:42:06,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:42:06,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1512933.3333333333, ans=0.1 2023-10-04 03:42:14,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:42:16,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 03:42:17,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 03:42:17,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:42:18,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:42:18,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:42:22,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 03:42:26,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 03:42:26,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 03:42:26,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:42:28,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:42:32,361 INFO [train.py:1046] (3/4) Epoch 43, batch 3850, loss[loss=0.1723, simple_loss=0.2577, pruned_loss=0.04351, over 24046.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2353, pruned_loss=0.03772, over 4701297.03 frames. ], batch size: 80, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:42:33,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:42:33,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:42:38,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:42:38,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 03:42:40,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:42:40,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:42:44,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:42:48,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:42:48,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.02 vs. limit=12.0 2023-10-04 03:42:49,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 03:42:51,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 03:42:53,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1513133.3333333333, ans=0.1 2023-10-04 03:42:57,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:42:58,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:43:00,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:43:00,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:43:01,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1513200.0, ans=0.0 2023-10-04 03:43:03,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:04,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:43:04,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:04,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:43:05,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:08,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:08,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1513200.0, ans=0.125 2023-10-04 03:43:09,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:09,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:43:10,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 03:43:10,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 03:43:10,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1513200.0, ans=0.1 2023-10-04 03:43:11,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:43:11,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:12,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.17 vs. limit=15.0 2023-10-04 03:43:14,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:14,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:16,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 03:43:18,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 03:43:19,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1513266.6666666667, ans=0.0 2023-10-04 03:43:19,437 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.83 vs. limit=10.0 2023-10-04 03:43:20,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:22,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 03:43:25,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 03:43:29,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:31,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:43:35,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:35,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 03:43:36,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 03:43:37,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1513333.3333333333, ans=0.1 2023-10-04 03:43:38,220 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.999e+02 2.160e+02 2.464e+02 3.825e+02, threshold=4.321e+02, percent-clipped=0.0 2023-10-04 03:43:39,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:39,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:42,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:43:42,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 03:43:43,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:43,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:43,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:43:43,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 03:43:44,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1513400.0, ans=0.1 2023-10-04 03:43:45,921 INFO [train.py:1046] (3/4) Epoch 43, batch 3900, loss[loss=0.1537, simple_loss=0.2467, pruned_loss=0.03035, over 24628.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2341, pruned_loss=0.03714, over 4700975.90 frames. ], batch size: 73, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:43:46,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:43:46,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1513400.0, ans=0.125 2023-10-04 03:43:47,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 03:43:47,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1513400.0, ans=0.1 2023-10-04 03:43:48,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:48,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:50,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:43:50,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:51,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:43:53,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:43:53,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:43:53,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:43:53,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 03:43:53,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:43:57,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:43:58,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:43:58,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:43:59,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:44:02,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:44:02,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:44:03,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:44:06,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 03:44:06,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:44:07,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 03:44:08,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:44:09,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 03:44:10,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 03:44:15,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:44:15,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:44:15,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:44:17,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:44:21,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:44:24,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:44:26,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:44:26,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:44:28,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:44:32,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:44:32,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:44:39,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 03:44:40,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:44:51,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:44:55,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:44:56,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 03:44:56,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 03:44:56,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 03:44:59,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 03:44:59,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:45:00,821 INFO [train.py:1046] (3/4) Epoch 43, batch 3950, loss[loss=0.1399, simple_loss=0.2217, pruned_loss=0.02901, over 24452.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2337, pruned_loss=0.03677, over 4709779.21 frames. ], batch size: 58, lr: 2.36e-03, grad_scale: 8.0 2023-10-04 03:45:00,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 03:45:05,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:45:05,617 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.10 vs. limit=15.0 2023-10-04 03:45:06,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 03:45:07,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:45:10,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:45:12,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:45:16,173 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 03:45:16,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:45:16,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 03:45:18,113 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 03:45:18,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:45:21,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:45:22,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:45:22,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:45:24,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1513800.0, ans=0.1 2023-10-04 03:45:25,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 03:45:27,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:45:29,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:45:29,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:45:30,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:45:30,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 03:45:36,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1513866.6666666667, ans=0.125 2023-10-04 03:45:37,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:45:37,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:45:43,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 03:45:44,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1513933.3333333333, ans=0.2 2023-10-04 03:45:46,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1513933.3333333333, ans=0.0 2023-10-04 03:45:49,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 03:45:49,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 03:45:49,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:45:51,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:45:54,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.95 vs. limit=15.0 2023-10-04 03:45:58,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:46:00,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:46:00,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:46:00,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:46:00,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 03:46:07,398 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 2.005e+02 2.254e+02 2.527e+02 3.756e+02, threshold=4.508e+02, percent-clipped=0.0 2023-10-04 03:46:07,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:46:08,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:46:11,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 03:46:14,420 INFO [train.py:1046] (3/4) Epoch 43, batch 4000, loss[loss=0.1853, simple_loss=0.2532, pruned_loss=0.05871, over 19534.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2341, pruned_loss=0.03665, over 4710748.69 frames. ], batch size: 388, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:46:20,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:46:26,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1514066.6666666667, ans=0.125 2023-10-04 03:46:28,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:46:33,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:46:33,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:46:33,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:46:34,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 03:46:34,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:46:34,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1514133.3333333333, ans=0.125 2023-10-04 03:46:35,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 03:46:35,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:46:35,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 03:46:37,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1514133.3333333333, ans=10.0 2023-10-04 03:46:38,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:46:41,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:46:41,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:46:41,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:46:41,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:46:41,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 03:46:42,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:46:44,276 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 03:46:45,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:46:45,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:46:50,380 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 03:46:50,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:46:51,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:46:56,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 03:46:57,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:47:00,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:47:01,327 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 03:47:03,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:47:03,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 03:47:03,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:47:04,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:47:05,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:47:07,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:47:07,912 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.80 vs. limit=22.5 2023-10-04 03:47:08,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:47:08,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:47:11,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 03:47:11,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:47:12,842 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 03:47:16,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:47:20,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 03:47:21,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:47:21,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:47:23,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:47:23,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:47:27,135 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.91 vs. limit=15.0 2023-10-04 03:47:29,564 INFO [train.py:1046] (3/4) Epoch 43, batch 4050, loss[loss=0.1827, simple_loss=0.2535, pruned_loss=0.05597, over 22719.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.235, pruned_loss=0.03726, over 4707180.21 frames. ], batch size: 322, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:47:29,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:47:32,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 03:47:33,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 03:47:35,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:47:35,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:47:36,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:47:37,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:47:39,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:47:42,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:47:46,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:47:47,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 03:47:49,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:47:49,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:47:54,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:47:57,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:48:00,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 03:48:01,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 03:48:01,967 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 03:48:02,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1514533.3333333333, ans=0.1 2023-10-04 03:48:05,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:48:08,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1514533.3333333333, ans=0.125 2023-10-04 03:48:12,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 03:48:12,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:48:14,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:48:15,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:48:17,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:48:17,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:48:17,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1514600.0, ans=0.125 2023-10-04 03:48:21,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:48:24,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 03:48:24,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:48:25,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:48:28,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 03:48:31,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:48:33,494 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.94 vs. limit=12.0 2023-10-04 03:48:36,608 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.949e+02 2.233e+02 2.507e+02 3.530e+02, threshold=4.466e+02, percent-clipped=0.0 2023-10-04 03:48:39,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 03:48:39,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:48:39,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:48:42,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 03:48:42,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 03:48:42,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:48:43,722 INFO [train.py:1046] (3/4) Epoch 43, batch 4100, loss[loss=0.1363, simple_loss=0.2191, pruned_loss=0.02672, over 24625.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2359, pruned_loss=0.03774, over 4716148.52 frames. ], batch size: 60, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:48:45,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:48:47,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:48:47,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:48:51,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1514733.3333333333, ans=0.07 2023-10-04 03:48:53,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 03:48:55,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 03:48:57,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 03:48:58,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 03:48:58,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:48:58,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:48:58,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:48:58,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:48:58,888 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 03:49:01,333 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.97 vs. limit=15.0 2023-10-04 03:49:02,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:49:04,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:49:05,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:49:05,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:49:09,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:49:09,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:49:11,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:49:11,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 03:49:12,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:49:12,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:49:12,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:49:12,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:49:12,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 03:49:15,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:49:15,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 03:49:18,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:49:19,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:49:19,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 03:49:21,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:49:21,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:49:22,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:49:24,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 03:49:24,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1514866.6666666667, ans=0.125 2023-10-04 03:49:25,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:49:25,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:49:28,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 03:49:28,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:49:30,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:49:33,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:49:38,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:49:42,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:49:42,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:49:51,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:49:52,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:49:55,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 03:49:57,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:49:58,534 INFO [train.py:1046] (3/4) Epoch 43, batch 4150, loss[loss=0.1558, simple_loss=0.2421, pruned_loss=0.03474, over 24338.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.236, pruned_loss=0.03783, over 4724761.26 frames. ], batch size: 74, lr: 2.36e-03, grad_scale: 16.0 2023-10-04 03:50:01,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:50:02,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:50:04,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:50:04,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:50:06,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 03:50:07,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:50:09,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 03:50:09,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 03:50:09,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 03:50:10,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:50:14,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:50:14,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:50:15,560 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=22.5 2023-10-04 03:50:18,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:50:18,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:50:19,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:50:21,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 03:50:21,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:50:23,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 03:50:28,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:50:30,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:50:31,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1515200.0, ans=0.1 2023-10-04 03:50:32,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 03:50:35,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 03:50:35,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:50:35,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1515200.0, ans=0.125 2023-10-04 03:50:37,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 03:50:37,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:50:37,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:50:40,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:50:41,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:50:45,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 03:50:47,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:50:48,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:50:50,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 03:50:50,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:50:50,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1515266.6666666667, ans=0.2 2023-10-04 03:50:52,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 03:50:55,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:50:56,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:50:56,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1515333.3333333333, ans=0.125 2023-10-04 03:50:58,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:50:58,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 03:50:58,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:50:58,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 03:51:00,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 03:51:01,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1515333.3333333333, ans=0.0 2023-10-04 03:51:02,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 03:51:02,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:51:04,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 03:51:04,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 03:51:04,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 03:51:05,315 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.071e+02 2.291e+02 2.754e+02 5.163e+02, threshold=4.583e+02, percent-clipped=2.0 2023-10-04 03:51:05,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:51:05,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 03:51:05,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:51:08,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:51:08,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 03:51:08,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 03:51:12,985 INFO [train.py:1046] (3/4) Epoch 43, batch 4200, loss[loss=0.1451, simple_loss=0.2275, pruned_loss=0.0313, over 24515.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2356, pruned_loss=0.03766, over 4730291.62 frames. ], batch size: 66, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:51:13,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:51:14,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 03:51:17,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:51:18,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:51:18,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=1515400.0, ans=0.05 2023-10-04 03:51:19,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:51:20,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:51:20,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:51:22,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 03:51:26,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 03:51:26,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:28,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:51:32,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:51:34,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 03:51:36,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:51:36,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:37,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 03:51:37,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 03:51:38,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:38,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:51:38,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 03:51:40,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1515466.6666666667, ans=0.125 2023-10-04 03:51:41,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 03:51:43,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 03:51:44,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:51:47,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 03:51:48,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:51:50,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:51:52,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:51:54,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:51:54,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 03:51:54,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:51:56,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:52:00,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 03:52:02,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:52:08,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:52:11,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 03:52:12,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:52:17,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 03:52:18,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:52:18,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 03:52:26,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 03:52:27,724 INFO [train.py:1046] (3/4) Epoch 43, batch 4250, loss[loss=0.1608, simple_loss=0.2493, pruned_loss=0.03618, over 24566.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.234, pruned_loss=0.03753, over 4729612.83 frames. ], batch size: 71, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:52:29,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1515733.3333333333, ans=0.1 2023-10-04 03:52:30,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 03:52:30,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 03:52:31,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:52:35,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 03:52:35,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 03:52:35,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:52:38,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:52:42,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:52:45,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:52:45,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:52:45,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1515800.0, ans=0.125 2023-10-04 03:52:46,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:52:46,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:52:47,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:52:48,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:52:49,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:52:52,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:52:54,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:52:55,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 03:52:58,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 03:52:58,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:53:00,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:53:00,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:53:00,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 03:53:00,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:53:00,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:53:05,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 03:53:06,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 03:53:09,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:53:10,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:53:12,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 03:53:12,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 03:53:13,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 03:53:13,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1515933.3333333333, ans=0.125 2023-10-04 03:53:16,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 03:53:17,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:53:20,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:53:20,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:53:24,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 03:53:25,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 03:53:27,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 03:53:30,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:53:33,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:53:35,074 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.993e+02 2.241e+02 2.615e+02 3.958e+02, threshold=4.482e+02, percent-clipped=0.0 2023-10-04 03:53:35,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:53:35,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:53:35,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1516000.0, ans=0.125 2023-10-04 03:53:36,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:53:36,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1516000.0, ans=0.2 2023-10-04 03:53:37,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:53:39,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:53:39,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 03:53:41,886 INFO [train.py:1046] (3/4) Epoch 43, batch 4300, loss[loss=0.1479, simple_loss=0.2343, pruned_loss=0.03071, over 24628.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2333, pruned_loss=0.03723, over 4708720.08 frames. ], batch size: 65, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:53:41,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:53:44,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:53:46,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:53:50,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:53:53,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:53:53,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 03:53:55,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:53:58,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:53:58,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 03:53:58,422 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 03:54:02,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 03:54:04,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:54:07,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 03:54:09,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 03:54:09,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 03:54:09,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1516133.3333333333, ans=0.125 2023-10-04 03:54:11,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 03:54:13,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 03:54:14,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 03:54:14,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:54:16,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:54:17,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:54:20,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:54:20,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 03:54:20,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 03:54:22,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:54:26,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:26,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 03:54:26,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:26,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:54:26,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 03:54:26,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 03:54:27,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 03:54:27,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:54:29,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 03:54:29,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 03:54:30,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1516266.6666666667, ans=0.1 2023-10-04 03:54:32,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1516266.6666666667, ans=0.125 2023-10-04 03:54:33,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:54:35,068 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 03:54:35,798 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.93 vs. limit=15.0 2023-10-04 03:54:36,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 03:54:37,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:54:37,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:54:40,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 03:54:41,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 03:54:41,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:41,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:54:42,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:54:42,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:54:42,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1516333.3333333333, ans=0.0 2023-10-04 03:54:45,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:54:46,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:54:48,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:54:48,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:54:53,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 03:54:55,435 INFO [train.py:1046] (3/4) Epoch 43, batch 4350, loss[loss=0.1469, simple_loss=0.2327, pruned_loss=0.03054, over 24465.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2347, pruned_loss=0.03706, over 4716917.77 frames. ], batch size: 66, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 03:54:55,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 03:54:59,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:55:01,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:55:04,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 03:55:04,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:55:09,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 03:55:13,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:55:14,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:55:14,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:55:17,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 03:55:18,446 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.60 vs. limit=15.0 2023-10-04 03:55:18,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:55:19,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1516466.6666666667, ans=0.125 2023-10-04 03:55:21,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:55:26,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 03:55:27,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:55:28,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:34,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:34,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1516533.3333333333, ans=0.2 2023-10-04 03:55:35,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 03:55:36,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1516533.3333333333, ans=0.125 2023-10-04 03:55:39,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:55:41,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:55:46,212 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 03:55:46,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:55:47,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 03:55:48,944 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 03:55:50,420 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 03:55:50,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:55:50,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:55:51,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:55:53,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:55:53,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:55:54,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:55:57,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 03:55:57,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:57,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:55:57,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:55:59,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 03:56:00,989 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 03:56:00,993 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 03:56:01,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 03:56:02,235 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 1.917e+02 2.050e+02 2.278e+02 3.303e+02, threshold=4.100e+02, percent-clipped=0.0 2023-10-04 03:56:05,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:56:05,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:56:05,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:06,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:56:08,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 03:56:09,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.39 vs. limit=6.0 2023-10-04 03:56:09,580 INFO [train.py:1046] (3/4) Epoch 43, batch 4400, loss[loss=0.1577, simple_loss=0.2448, pruned_loss=0.03533, over 24623.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2351, pruned_loss=0.03685, over 4734730.52 frames. ], batch size: 68, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:56:10,981 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 03:56:10,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:14,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:56:15,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:16,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:56:16,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1516733.3333333333, ans=0.125 2023-10-04 03:56:19,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 03:56:19,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 03:56:21,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 03:56:21,113 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 03:56:21,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 03:56:21,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:56:23,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 03:56:24,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1516800.0, ans=0.2 2023-10-04 03:56:25,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:25,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:25,418 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 03:56:26,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1516800.0, ans=0.125 2023-10-04 03:56:28,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:29,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 03:56:29,521 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 03:56:29,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1516800.0, ans=0.1 2023-10-04 03:56:31,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 03:56:31,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 03:56:32,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 03:56:32,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:35,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:56:35,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:56:36,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:56:38,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 03:56:38,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 03:56:40,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:40,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 03:56:40,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:56:40,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1516866.6666666667, ans=0.125 2023-10-04 03:56:41,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:42,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:56:42,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 03:56:44,338 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 03:56:48,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:56:54,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:56:57,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 03:57:01,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:57:05,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:57:06,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 03:57:07,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 03:57:07,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 03:57:07,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:57:07,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 03:57:07,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:57:12,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 03:57:16,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 03:57:17,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 03:57:17,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:57:17,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 03:57:17,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 03:57:18,749 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.84 vs. limit=15.0 2023-10-04 03:57:20,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 03:57:22,691 INFO [train.py:1046] (3/4) Epoch 43, batch 4450, loss[loss=0.1368, simple_loss=0.2105, pruned_loss=0.03161, over 20771.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2364, pruned_loss=0.03742, over 4736739.06 frames. ], batch size: 45, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:57:22,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 03:57:26,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:57:28,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:57:29,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 03:57:34,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:57:34,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:57:38,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:57:39,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 03:57:41,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:57:42,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1517133.3333333333, ans=0.125 2023-10-04 03:57:43,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:57:43,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 03:57:43,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:57:44,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.55 vs. limit=10.0 2023-10-04 03:57:44,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:57:44,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:57:44,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 03:57:48,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 03:57:52,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:57:53,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:57:53,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:57:55,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:57:57,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:58:01,531 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.42 vs. limit=15.0 2023-10-04 03:58:02,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 03:58:04,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 03:58:04,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 03:58:04,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 03:58:06,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:58:07,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 03:58:10,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 03:58:14,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:58:14,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 03:58:14,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:58:14,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:58:16,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 03:58:16,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 03:58:17,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 03:58:19,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1517266.6666666667, ans=0.2 2023-10-04 03:58:20,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 03:58:20,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 03:58:23,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 03:58:25,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1517333.3333333333, ans=0.0 2023-10-04 03:58:26,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:58:26,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:58:27,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1517333.3333333333, ans=0.125 2023-10-04 03:58:28,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:58:28,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 03:58:29,420 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.027e+02 2.201e+02 2.513e+02 3.494e+02, threshold=4.403e+02, percent-clipped=0.0 2023-10-04 03:58:29,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 03:58:33,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 03:58:34,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:58:37,765 INFO [train.py:1046] (3/4) Epoch 43, batch 4500, loss[loss=0.1461, simple_loss=0.2236, pruned_loss=0.03433, over 24483.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2366, pruned_loss=0.03763, over 4735511.74 frames. ], batch size: 58, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:58:39,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:58:40,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 03:58:40,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 03:58:43,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:58:46,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 03:58:48,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:58:49,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 03:58:50,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 03:58:50,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:58:50,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:58:58,437 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 03:59:04,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:59:05,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 03:59:07,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:59:08,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 03:59:09,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 03:59:13,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1517533.3333333333, ans=0.2 2023-10-04 03:59:15,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 03:59:19,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 03:59:23,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 03:59:26,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 03:59:27,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 03:59:29,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:59:29,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:59:29,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 03:59:30,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 03:59:30,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 03:59:32,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 03:59:32,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 03:59:32,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:59:34,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1517600.0, ans=0.0 2023-10-04 03:59:35,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 03:59:37,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 03:59:39,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 03:59:41,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 03:59:41,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 03:59:41,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 03:59:44,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 03:59:44,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 03:59:48,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 03:59:51,413 INFO [train.py:1046] (3/4) Epoch 43, batch 4550, loss[loss=0.1456, simple_loss=0.2096, pruned_loss=0.0408, over 23538.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.236, pruned_loss=0.03759, over 4735398.17 frames. ], batch size: 256, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 03:59:52,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 03:59:54,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 03:59:58,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 03:59:58,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:00:00,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:00:03,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:00:06,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:00:09,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:00:09,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:00:09,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:12,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:00:12,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:00:13,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:00:18,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 04:00:18,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 04:00:21,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:00:21,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 04:00:24,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 04:00:24,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:00:25,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1517866.6666666667, ans=0.1 2023-10-04 04:00:28,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 04:00:28,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:00:31,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:33,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:33,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:00:34,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 04:00:37,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:00:40,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:42,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:00:42,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:00:44,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 04:00:44,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 04:00:44,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:00:45,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 04:00:48,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 04:00:48,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:00:48,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1517933.3333333333, ans=0.125 2023-10-04 04:00:50,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:00:50,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:00:52,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:00:52,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:00:53,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:00:53,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 04:00:55,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:00:55,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 04:00:56,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 04:00:56,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:00:56,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 04:00:58,352 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.020e+02 2.212e+02 2.607e+02 3.843e+02, threshold=4.425e+02, percent-clipped=0.0 2023-10-04 04:00:58,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:00:58,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:00:59,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:01:01,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:01:01,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:01:02,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:01:04,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:01:06,030 INFO [train.py:1046] (3/4) Epoch 43, batch 4600, loss[loss=0.1739, simple_loss=0.2606, pruned_loss=0.04361, over 24568.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2353, pruned_loss=0.03717, over 4729553.18 frames. ], batch size: 71, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:01:06,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1518066.6666666667, ans=0.0 2023-10-04 04:01:07,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:07,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1518066.6666666667, ans=0.125 2023-10-04 04:01:08,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:01:12,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:01:12,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:01:12,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:01:13,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 04:01:15,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:01:19,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:01:20,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:01:22,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:27,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1518133.3333333333, ans=0.125 2023-10-04 04:01:28,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 04:01:28,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:28,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1518133.3333333333, ans=0.125 2023-10-04 04:01:28,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1518133.3333333333, ans=0.2 2023-10-04 04:01:30,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:34,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:01:34,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:01:38,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1518200.0, ans=0.035 2023-10-04 04:01:41,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 04:01:41,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:01:42,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:01:46,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:01:46,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:01:47,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:01:51,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 04:01:52,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:01:54,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1518266.6666666667, ans=0.125 2023-10-04 04:01:57,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:01:58,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:02:00,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:00,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 04:02:01,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:02:01,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 04:02:03,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:03,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:02:04,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:05,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:02:06,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:02:06,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 04:02:07,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 04:02:07,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 04:02:07,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:02:09,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:02:09,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:02:10,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:02:10,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1518333.3333333333, ans=0.125 2023-10-04 04:02:19,795 INFO [train.py:1046] (3/4) Epoch 43, batch 4650, loss[loss=0.158, simple_loss=0.2382, pruned_loss=0.03887, over 23744.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2354, pruned_loss=0.03698, over 4733380.35 frames. ], batch size: 85, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:02:21,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:02:24,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:02:24,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:02:24,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:02:24,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:02:24,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:02:27,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:02:30,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 04:02:32,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:02:33,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1518466.6666666667, ans=0.125 2023-10-04 04:02:34,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 04:02:36,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:02:37,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 04:02:37,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:02:37,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 04:02:37,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 04:02:37,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:37,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1518466.6666666667, ans=0.125 2023-10-04 04:02:39,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:02:43,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:02:43,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.78 vs. limit=15.0 2023-10-04 04:02:44,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:02:44,968 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 04:02:47,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1518466.6666666667, ans=0.1 2023-10-04 04:02:48,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:02:48,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1518533.3333333333, ans=0.125 2023-10-04 04:02:49,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 04:02:51,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:02:51,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:02:52,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 04:02:52,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:02:55,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:02:59,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:05,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:03:07,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:03:08,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:03:09,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:03:11,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 04:03:11,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 04:03:13,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 04:03:13,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 04:03:13,852 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.96 vs. limit=22.5 2023-10-04 04:03:16,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:03:23,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:03:23,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:03:23,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 04:03:24,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:26,333 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.656e+02 2.022e+02 2.193e+02 2.608e+02 3.826e+02, threshold=4.386e+02, percent-clipped=0.0 2023-10-04 04:03:26,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:03:26,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:03:26,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:03:26,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1518666.6666666667, ans=0.0 2023-10-04 04:03:29,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:03:29,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:03:30,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:03:33,369 INFO [train.py:1046] (3/4) Epoch 43, batch 4700, loss[loss=0.1556, simple_loss=0.2347, pruned_loss=0.03823, over 24568.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2355, pruned_loss=0.03683, over 4740016.13 frames. ], batch size: 60, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:03:33,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:03:33,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:03:33,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:03:35,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 04:03:35,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:03:36,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 04:03:47,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:47,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1518800.0, ans=0.2 2023-10-04 04:03:48,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:03:48,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:03:49,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:03:50,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:03:55,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 04:03:55,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.48 vs. limit=15.0 2023-10-04 04:03:56,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 04:03:56,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:03:57,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:03:57,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:04:02,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:04:03,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1518866.6666666667, ans=0.0 2023-10-04 04:04:08,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:04:08,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 04:04:08,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1518866.6666666667, ans=0.0 2023-10-04 04:04:11,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:04:15,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 04:04:17,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:04:18,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:22,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 04:04:23,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:04:27,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:04:28,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 04:04:29,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:29,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:04:31,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:04:31,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:04:33,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 04:04:33,319 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 04:04:35,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:04:38,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:38,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:38,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 04:04:40,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:04:41,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1519000.0, ans=0.0 2023-10-04 04:04:42,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.11 vs. limit=15.0 2023-10-04 04:04:44,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten.whitening_limit, batch_count=1519000.0, ans=22.5 2023-10-04 04:04:45,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 04:04:48,895 INFO [train.py:1046] (3/4) Epoch 43, batch 4750, loss[loss=0.1515, simple_loss=0.2282, pruned_loss=0.03741, over 23686.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2357, pruned_loss=0.03692, over 4744440.61 frames. ], batch size: 149, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:04:48,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:04:50,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:04:55,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:04:55,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:04:56,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 04:04:56,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:04:59,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 04:05:02,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:05:02,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:05:02,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1519133.3333333333, ans=0.125 2023-10-04 04:05:03,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:05:03,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1519133.3333333333, ans=0.125 2023-10-04 04:05:07,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 04:05:07,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1519133.3333333333, ans=0.05 2023-10-04 04:05:10,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:05:11,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 04:05:11,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1519133.3333333333, ans=0.125 2023-10-04 04:05:12,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:05:15,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:05:15,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:05:15,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:05:17,368 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 04:05:17,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 04:05:23,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 04:05:26,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:05:26,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1519200.0, ans=0.125 2023-10-04 04:05:27,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:05:30,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:05:30,451 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 04:05:30,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:05:31,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:05:34,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:05:38,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 04:05:38,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 04:05:39,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:05:39,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:05:40,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:05:42,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 04:05:42,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 04:05:45,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 04:05:48,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:05:50,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:05:50,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 04:05:50,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:05:51,277 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:05:52,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:05:54,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:05:54,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:05:56,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:05:57,483 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 2.030e+02 2.225e+02 2.486e+02 3.690e+02, threshold=4.449e+02, percent-clipped=0.0 2023-10-04 04:06:00,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:06:00,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 04:06:01,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 04:06:02,914 INFO [train.py:1046] (3/4) Epoch 43, batch 4800, loss[loss=0.1511, simple_loss=0.2385, pruned_loss=0.03184, over 24675.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2365, pruned_loss=0.03756, over 4718256.20 frames. ], batch size: 65, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:06:02,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 04:06:05,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:06:05,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:06:06,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1519400.0, ans=0.125 2023-10-04 04:06:07,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 04:06:11,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:13,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:19,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:06:20,052 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.47 vs. limit=15.0 2023-10-04 04:06:20,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:06:20,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:20,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 04:06:21,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:06:23,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:06:23,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:06:26,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:06:29,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:06:29,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:06:31,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:06:31,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 04:06:31,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:32,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:06:35,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:06:38,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:39,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:06:39,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:06:40,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 04:06:40,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:44,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 04:06:44,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 04:06:44,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:06:45,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:06:45,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:06:45,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:06:45,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:06:47,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:06:49,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:06:52,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:06:52,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:06:55,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:00,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 04:07:00,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:07:00,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:01,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:07:02,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:07:04,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:07:05,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:07:05,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:07,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:07:08,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:07:09,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:07:10,422 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.51 vs. limit=15.0 2023-10-04 04:07:11,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:12,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:12,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:07:14,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 04:07:17,532 INFO [train.py:1046] (3/4) Epoch 43, batch 4850, loss[loss=0.136, simple_loss=0.2251, pruned_loss=0.0235, over 24661.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2361, pruned_loss=0.03703, over 4730297.36 frames. ], batch size: 65, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:07:17,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 04:07:17,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:07:17,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:07:17,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:07:17,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:21,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:07:27,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 04:07:28,408 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.81 vs. limit=15.0 2023-10-04 04:07:29,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:32,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:07:34,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:07:34,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:07:38,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:07:39,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:07:40,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:07:40,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 04:07:45,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:07:46,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:07:47,084 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:07:48,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:07:49,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:07:49,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 04:07:50,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:07:52,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:07:55,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:07:55,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 04:07:55,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 04:07:56,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:08:05,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:08:07,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 04:08:07,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:08:08,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:08:10,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:08:11,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 04:08:11,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:08:11,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 04:08:11,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:08:12,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:08:14,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 04:08:24,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:08:26,940 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.023e+02 2.216e+02 2.527e+02 3.488e+02, threshold=4.432e+02, percent-clipped=0.0 2023-10-04 04:08:30,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:08:31,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:08:33,728 INFO [train.py:1046] (3/4) Epoch 43, batch 4900, loss[loss=0.1684, simple_loss=0.2496, pruned_loss=0.04355, over 24372.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2354, pruned_loss=0.037, over 4723075.86 frames. ], batch size: 77, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:08:35,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 04:08:35,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:08:40,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:08:42,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:08:42,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:08:44,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 04:08:51,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 04:08:54,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1520133.3333333333, ans=0.125 2023-10-04 04:08:55,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 04:08:55,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 04:08:55,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:08:55,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:08:57,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:08:57,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:08:57,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:08:57,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 04:09:00,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 04:09:00,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:09:01,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:09:03,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:09:03,825 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:09:05,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:09:05,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:09:06,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:06,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 04:09:07,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:09:09,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:09:09,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 04:09:09,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 04:09:14,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 04:09:16,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:09:17,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:09:17,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:09:19,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:09:19,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 04:09:19,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:09:19,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 04:09:22,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:23,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:09:24,327 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.62 vs. limit=22.5 2023-10-04 04:09:24,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:09:27,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 04:09:29,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:09:29,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 04:09:29,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 04:09:33,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1520333.3333333333, ans=0.0 2023-10-04 04:09:37,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:09:38,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:09:38,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 04:09:38,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 04:09:39,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:09:41,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:45,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:09:45,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:09:45,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:09:45,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 04:09:46,728 INFO [train.py:1046] (3/4) Epoch 43, batch 4950, loss[loss=0.1623, simple_loss=0.2499, pruned_loss=0.0373, over 24046.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2338, pruned_loss=0.03684, over 4718812.70 frames. ], batch size: 80, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:09:48,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:09:51,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:09:51,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 04:09:51,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1520400.0, ans=0.05 2023-10-04 04:09:51,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1520400.0, ans=0.07 2023-10-04 04:09:53,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 04:09:54,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 04:09:54,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:09:55,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 04:09:55,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:09:55,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:09:57,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:09:57,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:09:57,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1520400.0, ans=0.2 2023-10-04 04:09:58,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:09:58,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:10:00,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:10:03,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:10:06,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:06,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:10:09,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:10:13,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:15,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:10:16,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:16,604 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:10:16,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1520533.3333333333, ans=0.05 2023-10-04 04:10:17,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:19,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:10:19,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 04:10:21,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 04:10:22,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:22,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1520533.3333333333, ans=0.125 2023-10-04 04:10:24,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:10:25,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:10:26,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:10:26,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:10:26,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:10:29,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:10:31,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:10:33,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:10:34,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:10:35,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:37,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 04:10:37,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:10:38,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:10:44,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:10:45,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:10:45,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:10:45,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:10:45,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:10:46,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:10:50,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:10:50,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:10:51,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:10:53,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 04:10:54,733 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.929e+02 2.246e+02 2.665e+02 5.151e+02, threshold=4.493e+02, percent-clipped=1.0 2023-10-04 04:10:57,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:11:00,476 INFO [train.py:1046] (3/4) Epoch 43, batch 5000, loss[loss=0.1473, simple_loss=0.2264, pruned_loss=0.03412, over 24356.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2332, pruned_loss=0.03669, over 4715163.43 frames. ], batch size: 56, lr: 2.35e-03, grad_scale: 32.0 2023-10-04 04:11:00,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 04:11:00,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 04:11:04,554 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.42 vs. limit=6.0 2023-10-04 04:11:07,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:11:07,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:11:09,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 04:11:11,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 04:11:12,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:11:13,150 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.50 vs. limit=22.5 2023-10-04 04:11:15,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 04:11:15,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:11:15,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:11:15,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 04:11:15,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:11:16,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:11:18,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 04:11:18,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:11:19,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:11:21,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 04:11:22,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 04:11:22,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:11:22,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 04:11:24,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:11:24,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:24,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:11:24,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 04:11:24,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 04:11:27,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 04:11:27,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:11:27,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:30,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 04:11:30,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:11:30,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:32,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:11:33,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 04:11:34,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 04:11:36,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:11:38,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:11:41,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1520866.6666666667, ans=0.0 2023-10-04 04:11:42,417 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 04:11:45,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:11:46,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:11:46,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:11:48,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 04:11:49,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:11:49,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:11:49,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:11:50,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 04:11:52,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:11:54,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1520933.3333333333, ans=0.125 2023-10-04 04:11:55,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:11:55,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:12:03,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 04:12:07,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:15,137 INFO [train.py:1046] (3/4) Epoch 43, batch 5050, loss[loss=0.1592, simple_loss=0.2347, pruned_loss=0.04185, over 23812.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2342, pruned_loss=0.03679, over 4733638.84 frames. ], batch size: 164, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:12:15,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:12:15,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1521066.6666666667, ans=0.1 2023-10-04 04:12:16,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:16,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:12:16,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:12:16,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:12:16,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:12:16,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:19,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1521066.6666666667, ans=0.125 2023-10-04 04:12:20,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:12:20,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 04:12:21,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:12:24,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:12:25,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:12:27,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 04:12:27,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1521066.6666666667, ans=0.1 2023-10-04 04:12:28,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:12:29,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:12:31,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:12:33,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:12:34,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:12:42,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 04:12:42,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:12:43,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:12:43,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 04:12:43,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:12:45,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:12:45,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:12:45,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:12:45,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 04:12:46,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 04:12:48,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:12:50,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:12:54,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:12:54,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 04:12:56,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:12:59,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 04:13:01,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:13:02,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:13:02,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:13:03,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:13:05,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:13:05,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1521266.6666666667, ans=0.0 2023-10-04 04:13:08,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:13:08,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:08,643 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:13:08,961 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.08 vs. limit=15.0 2023-10-04 04:13:10,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:13:10,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:13:10,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 04:13:11,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:13:13,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:13:16,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1521333.3333333333, ans=0.0 2023-10-04 04:13:17,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:13:17,245 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 04:13:17,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:13:17,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1521333.3333333333, ans=0.125 2023-10-04 04:13:17,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1521333.3333333333, ans=0.0 2023-10-04 04:13:18,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=1521333.3333333333, ans=0.5 2023-10-04 04:13:19,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:13:19,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:21,183 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 04:13:22,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:13:22,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 04:13:22,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:25,265 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 1.920e+02 2.111e+02 2.330e+02 3.749e+02, threshold=4.222e+02, percent-clipped=0.0 2023-10-04 04:13:28,069 INFO [train.py:1046] (3/4) Epoch 43, batch 5100, loss[loss=0.1611, simple_loss=0.2492, pruned_loss=0.03656, over 24341.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2352, pruned_loss=0.03668, over 4739037.53 frames. ], batch size: 77, lr: 2.35e-03, grad_scale: 8.0 2023-10-04 04:13:28,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:13:28,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:13:28,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 04:13:30,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 04:13:30,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1521400.0, ans=0.0 2023-10-04 04:13:31,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:13:31,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:13:31,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:13:31,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1521400.0, ans=0.0 2023-10-04 04:13:33,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1521400.0, ans=0.0 2023-10-04 04:13:36,287 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 04:13:37,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:13:39,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 04:13:41,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 04:13:41,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:13:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:13:47,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:13:47,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 04:13:47,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 04:13:47,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1521466.6666666667, ans=0.0 2023-10-04 04:13:51,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:13:51,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:13:54,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:13:54,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1521466.6666666667, ans=0.025 2023-10-04 04:13:58,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 04:13:58,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:13:59,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:14:00,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 04:14:00,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1521533.3333333333, ans=0.0 2023-10-04 04:14:02,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:03,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:03,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 04:14:04,783 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 04:14:04,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:06,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 04:14:06,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 04:14:11,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:14:13,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1521600.0, ans=0.0 2023-10-04 04:14:15,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1521600.0, ans=0.0 2023-10-04 04:14:19,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:14:22,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 04:14:22,243 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 04:14:22,251 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 04:14:23,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 04:14:23,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:14:24,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1521600.0, ans=0.125 2023-10-04 04:14:26,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 04:14:29,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 04:14:33,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 04:14:33,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:14:35,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1521666.6666666667, ans=0.125 2023-10-04 04:14:36,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 04:14:37,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:14:37,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 04:14:37,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1521666.6666666667, ans=0.1 2023-10-04 04:14:43,740 INFO [train.py:1046] (3/4) Epoch 43, batch 5150, loss[loss=0.165, simple_loss=0.241, pruned_loss=0.04449, over 23532.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2366, pruned_loss=0.03715, over 4725207.81 frames. ], batch size: 256, lr: 2.35e-03, grad_scale: 8.0 2023-10-04 04:14:43,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:14:43,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:14:43,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:14:43,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:14:43,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:14:45,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:14:46,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 04:14:46,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 04:14:46,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 04:14:48,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:14:48,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 04:14:49,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:14:49,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 04:14:49,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1521733.3333333333, ans=0.0 2023-10-04 04:14:50,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:14:52,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:14:56,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 04:14:56,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 04:14:57,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:14:57,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:14:59,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:14:59,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:14:59,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:15:01,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:15:01,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:15:02,416 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.13 vs. limit=6.0 2023-10-04 04:15:03,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 04:15:04,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:15:05,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:15:07,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:15:08,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 04:15:10,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:15:14,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:15:16,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 04:15:20,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:15:24,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:15:25,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:15:29,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:15:30,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:15:33,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 04:15:37,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:15:39,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:15:39,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:15:39,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1521933.3333333333, ans=0.125 2023-10-04 04:15:41,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:15:42,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:15:44,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 04:15:48,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:15:48,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:15:51,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:15:51,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:15:51,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:15:52,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:15:52,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:15:52,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:15:53,836 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.002e+02 2.172e+02 2.451e+02 4.119e+02, threshold=4.344e+02, percent-clipped=0.0 2023-10-04 04:15:55,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:15:56,711 INFO [train.py:1046] (3/4) Epoch 43, batch 5200, loss[loss=0.1682, simple_loss=0.248, pruned_loss=0.04417, over 24019.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2372, pruned_loss=0.03766, over 4727578.64 frames. ], batch size: 86, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:15:58,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:15:59,273 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.05 vs. limit=15.0 2023-10-04 04:16:00,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:03,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1522066.6666666667, ans=0.125 2023-10-04 04:16:04,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 04:16:06,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:16:06,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:16:10,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:10,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:16:11,578 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.92 vs. limit=22.5 2023-10-04 04:16:12,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:16:13,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 04:16:16,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:16:16,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1522133.3333333333, ans=0.0 2023-10-04 04:16:18,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:16:20,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 04:16:22,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:16:23,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:16:24,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 04:16:24,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 04:16:26,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 04:16:27,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:16:27,837 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 04:16:27,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:16:27,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:16:29,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:16:29,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 04:16:29,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:16:33,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:34,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 04:16:35,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 04:16:35,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 04:16:41,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 04:16:41,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:16:45,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:16:46,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:16:47,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 04:16:48,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:16:48,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:16:49,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:16:49,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:16:52,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1522266.6666666667, ans=0.1 2023-10-04 04:16:53,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:16:54,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:16:58,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:16:58,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:16:58,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:17:06,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:17:06,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 04:17:08,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:17:08,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:17:10,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:17:11,342 INFO [train.py:1046] (3/4) Epoch 43, batch 5250, loss[loss=0.1607, simple_loss=0.2386, pruned_loss=0.04141, over 23620.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2362, pruned_loss=0.0372, over 4732881.35 frames. ], batch size: 149, lr: 2.35e-03, grad_scale: 16.0 2023-10-04 04:17:11,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:17:12,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:17:15,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:17:17,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:17:17,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:17:18,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:17:24,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:17:24,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1522466.6666666667, ans=0.2 2023-10-04 04:17:27,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:17:28,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:17:29,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:17:30,863 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.90 vs. limit=15.0 2023-10-04 04:17:31,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 04:17:31,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:17:32,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:17:39,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1522533.3333333333, ans=0.025 2023-10-04 04:17:57,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1522600.0, ans=0.1 2023-10-04 04:17:58,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1522600.0, ans=0.07 2023-10-04 04:18:02,045 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.87 vs. limit=15.0 2023-10-04 04:18:18,268 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.988e+02 2.166e+02 2.434e+02 3.421e+02, threshold=4.331e+02, percent-clipped=0.0 2023-10-04 04:18:19,589 INFO [train.py:1046] (3/4) Epoch 43, batch 5300, loss[loss=0.1578, simple_loss=0.2434, pruned_loss=0.03612, over 24437.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2351, pruned_loss=0.03701, over 4727524.16 frames. ], batch size: 66, lr: 2.35e-03, grad_scale: 8.0 2023-10-04 04:18:25,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1522733.3333333333, ans=0.0 2023-10-04 04:18:26,870 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.97 vs. limit=15.0 2023-10-04 04:18:28,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.34 vs. limit=15.0 2023-10-04 04:18:33,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:18:33,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 04:18:33,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 04:18:33,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:18:34,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:34,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:34,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:34,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:18:34,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:18:34,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:18:34,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:18:34,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:18:34,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 04:18:34,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 04:18:34,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 04:18:34,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:18:34,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 04:18:34,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 04:18:35,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:35,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:18:35,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:18:35,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:18:35,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:18:36,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:18:36,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:18:36,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:36,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:18:36,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:18:36,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:18:36,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:36,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:18:36,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 04:18:36,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:18:37,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:18:37,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 04:18:37,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 04:18:37,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:18:37,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:18:37,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 04:18:37,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 04:18:37,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:18:37,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:18:38,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:18:38,666 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 04:18:38,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 04:18:38,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:18:38,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:18:38,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 04:18:38,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 04:18:39,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 04:18:39,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:18:43,404 INFO [train.py:1046] (3/4) Epoch 44, batch 0, loss[loss=0.1399, simple_loss=0.2268, pruned_loss=0.02655, over 24475.00 frames. ], tot_loss[loss=0.1399, simple_loss=0.2268, pruned_loss=0.02655, over 24475.00 frames. ], batch size: 63, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:18:43,404 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 04:18:55,639 INFO [train.py:1078] (3/4) Epoch 44, validation: loss=0.3443, simple_loss=0.2733, pruned_loss=0.2076, over 1125622.00 frames. 2023-10-04 04:18:55,639 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-04 04:18:58,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 04:18:59,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:19:00,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:19:06,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:06,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:19:06,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:07,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1522813.3333333333, ans=0.125 2023-10-04 04:19:08,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 04:19:08,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 04:19:09,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:11,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:15,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:19:15,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:16,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:19:18,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:19:18,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 04:19:19,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:19:21,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1522880.0, ans=0.0 2023-10-04 04:19:28,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:19:28,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:31,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 04:19:34,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:19:34,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:19:36,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:19:41,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:19:45,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:19:47,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1523013.3333333333, ans=0.125 2023-10-04 04:19:51,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 04:19:54,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 04:19:54,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:19:54,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:19:54,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:19:55,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:19:57,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 04:19:59,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:19:59,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:20:04,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:20:06,397 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 04:20:06,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1523080.0, ans=0.0 2023-10-04 04:20:07,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:20:10,413 INFO [train.py:1046] (3/4) Epoch 44, batch 50, loss[loss=0.1526, simple_loss=0.2314, pruned_loss=0.03693, over 23772.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2371, pruned_loss=0.03776, over 1067505.22 frames. ], batch size: 212, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:20:10,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:20:13,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:20:13,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 04:20:13,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1523146.6666666667, ans=0.1 2023-10-04 04:20:14,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:20:14,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:20:15,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:20:17,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:20:21,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:20:26,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 04:20:26,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:20:29,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1523213.3333333333, ans=0.0 2023-10-04 04:20:31,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:20:33,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 04:20:35,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 04:20:36,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:20:38,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:20:38,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:20:41,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:20:43,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:20:43,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:20:43,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:20:51,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:20:52,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:20:52,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:20:54,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 04:20:56,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:20:56,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1523346.6666666667, ans=0.2 2023-10-04 04:20:57,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:20:57,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 04:20:57,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:20:59,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 04:20:59,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1523346.6666666667, ans=0.125 2023-10-04 04:20:59,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.50 vs. limit=15.0 2023-10-04 04:21:03,775 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.11 vs. limit=6.0 2023-10-04 04:21:04,388 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.994e+02 2.209e+02 2.588e+02 5.609e+02, threshold=4.418e+02, percent-clipped=8.0 2023-10-04 04:21:07,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:21:07,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:21:08,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1523413.3333333333, ans=0.1 2023-10-04 04:21:09,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:21:09,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:21:09,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:21:12,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 04:21:12,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 04:21:13,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:21:13,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:21:15,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:21:16,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:21:16,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 04:21:16,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 04:21:17,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 04:21:18,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:19,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:21:19,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 04:21:19,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 04:21:20,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:20,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:21:22,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 04:21:22,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:21:23,645 INFO [train.py:1046] (3/4) Epoch 44, batch 100, loss[loss=0.1621, simple_loss=0.2363, pruned_loss=0.04401, over 23568.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2374, pruned_loss=0.03809, over 1891598.80 frames. ], batch size: 256, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:21:26,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:21:30,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:21:33,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:21:34,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 04:21:34,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:21:37,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1523546.6666666667, ans=0.0 2023-10-04 04:21:37,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1523546.6666666667, ans=0.0 2023-10-04 04:21:38,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:21:38,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:21:38,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:21:38,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:21:38,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:21:41,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 04:21:43,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:21:43,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:43,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:21:43,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:21:46,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 04:21:46,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1523546.6666666667, ans=0.0 2023-10-04 04:21:47,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:21:47,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:21:47,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1523546.6666666667, ans=0.125 2023-10-04 04:21:48,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:21:51,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:21:54,287 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 04:21:55,701 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 04:21:55,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:21:55,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:21:59,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:22:00,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:22:01,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:06,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:06,649 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 04:22:08,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 04:22:08,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1523680.0, ans=0.125 2023-10-04 04:22:12,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:22:12,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1523680.0, ans=0.0 2023-10-04 04:22:14,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:22:16,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:20,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:23,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:22:25,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:22:26,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:26,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1523746.6666666667, ans=0.1 2023-10-04 04:22:27,218 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.70 vs. limit=10.0 2023-10-04 04:22:27,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:22:29,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:29,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:22:29,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1523746.6666666667, ans=0.0 2023-10-04 04:22:30,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:31,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 04:22:31,047 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 04:22:31,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:32,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:22:32,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1523746.6666666667, ans=0.125 2023-10-04 04:22:33,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:33,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:22:33,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 04:22:33,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1523746.6666666667, ans=0.2 2023-10-04 04:22:34,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 04:22:35,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:22:35,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:35,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:22:37,001 INFO [train.py:1046] (3/4) Epoch 44, batch 150, loss[loss=0.1663, simple_loss=0.2544, pruned_loss=0.0391, over 24367.00 frames. ], tot_loss[loss=0.157, simple_loss=0.238, pruned_loss=0.03797, over 2525773.04 frames. ], batch size: 77, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:22:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:22:37,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:22:37,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:22:39,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:22:42,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:22:42,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:22:43,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:44,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:22:44,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:48,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:22:50,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:52,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 04:22:52,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 04:22:52,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 04:22:54,988 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=12.15 vs. limit=15.0 2023-10-04 04:22:55,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:22:55,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:22:58,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:22:58,693 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:22:59,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:22:59,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:22:59,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:22:59,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:23:03,022 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 04:23:05,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:23:10,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:23:14,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:23:15,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 04:23:15,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1523946.6666666667, ans=0.125 2023-10-04 04:23:18,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:23:18,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:23:18,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:23:20,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:23:21,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:23:21,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:23:22,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:23:24,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 04:23:28,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:23:29,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:23:31,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:23:31,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:23:32,518 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.030e+02 2.255e+02 2.508e+02 4.097e+02, threshold=4.511e+02, percent-clipped=0.0 2023-10-04 04:23:32,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:23:32,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1524013.3333333333, ans=0.0 2023-10-04 04:23:34,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 04:23:38,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:23:39,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:23:41,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:23:44,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:23:44,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 04:23:44,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:23:44,583 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 04:23:47,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:23:51,252 INFO [train.py:1046] (3/4) Epoch 44, batch 200, loss[loss=0.1428, simple_loss=0.2229, pruned_loss=0.03132, over 16884.00 frames. ], tot_loss[loss=0.1562, simple_loss=0.2371, pruned_loss=0.03766, over 3005998.38 frames. ], batch size: 37, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:23:52,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:23:52,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:23:56,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 04:23:56,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:23:56,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:23:58,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 04:23:58,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:24:01,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:01,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:05,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:24:05,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:24:05,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:19,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1524280.0, ans=0.2 2023-10-04 04:24:22,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:24:22,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:24:22,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:24:23,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:24:23,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 04:24:25,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:24:26,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:27,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:24:29,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:24:29,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:24:30,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 04:24:30,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 04:24:30,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:36,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:24:36,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1524346.6666666667, ans=0.1 2023-10-04 04:24:41,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:24:47,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:48,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:24:54,014 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-10-04 04:24:54,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:24:57,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 04:24:58,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:24:58,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:24:58,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:25:00,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:25:01,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 04:25:01,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:25:02,012 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 04:25:04,611 INFO [train.py:1046] (3/4) Epoch 44, batch 250, loss[loss=0.1544, simple_loss=0.228, pruned_loss=0.04042, over 23825.00 frames. ], tot_loss[loss=0.1575, simple_loss=0.238, pruned_loss=0.03852, over 3390820.21 frames. ], batch size: 179, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:25:04,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:25:06,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:25:07,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:25:07,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:25:09,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1524480.0, ans=0.0 2023-10-04 04:25:10,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:25:10,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:25:12,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:25:16,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:25:18,150 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-10-04 04:25:22,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:25:26,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:25:26,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:25:32,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:25:32,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:25:32,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:25:33,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:25:34,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:25:34,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:25:35,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:25:37,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:25:39,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 04:25:39,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:25:40,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.24 vs. limit=15.0 2023-10-04 04:25:40,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:25:42,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:25:42,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:25:43,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:25:44,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:25:44,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:25:48,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:25:50,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:25:50,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:25:54,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:25:58,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:25:59,830 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.036e+02 2.263e+02 2.637e+02 3.971e+02, threshold=4.526e+02, percent-clipped=0.0 2023-10-04 04:26:02,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:26:05,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:26:07,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:26:09,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 04:26:10,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:26:10,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:26:12,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 04:26:12,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 04:26:13,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:26:13,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 04:26:17,854 INFO [train.py:1046] (3/4) Epoch 44, batch 300, loss[loss=0.1422, simple_loss=0.2308, pruned_loss=0.02678, over 24671.00 frames. ], tot_loss[loss=0.156, simple_loss=0.236, pruned_loss=0.03805, over 3684178.37 frames. ], batch size: 68, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:26:17,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:26:19,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:26:23,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:26:23,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 04:26:24,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:26:26,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:26:26,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 04:26:27,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:26:31,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:26:35,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:26:35,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 04:26:42,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 04:26:42,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:26:44,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1524880.0, ans=0.05 2023-10-04 04:26:45,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:26:45,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:26:45,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 04:26:45,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:26:47,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:26:50,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:26:50,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:26:55,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 04:26:55,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 04:26:57,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:26:58,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:00,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 04:27:00,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:27:05,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:27:08,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:27:08,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 04:27:12,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:12,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:27:16,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:17,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:27:18,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 04:27:19,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:27:19,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:27:21,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 04:27:23,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:27:23,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:25,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:27:25,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:27:26,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:29,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:27:29,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 04:27:31,118 INFO [train.py:1046] (3/4) Epoch 44, batch 350, loss[loss=0.1486, simple_loss=0.2382, pruned_loss=0.02947, over 24643.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2342, pruned_loss=0.03744, over 3902292.46 frames. ], batch size: 68, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:27:33,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:37,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:27:39,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:27:39,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:42,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 04:27:42,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1525146.6666666667, ans=0.1 2023-10-04 04:27:45,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:27:45,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 04:27:49,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:27:49,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 04:27:50,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:27:53,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 04:27:54,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:27:57,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:27:57,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:27:58,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1525213.3333333333, ans=0.0 2023-10-04 04:27:59,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:27:59,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:00,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:28:00,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:28:00,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:28:00,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1525280.0, ans=0.125 2023-10-04 04:28:03,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:28:03,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:28:06,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1525280.0, ans=0.125 2023-10-04 04:28:09,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1525280.0, ans=0.125 2023-10-04 04:28:10,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:28:10,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:28:11,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:28:11,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:28:12,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1525280.0, ans=0.125 2023-10-04 04:28:15,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1525346.6666666667, ans=0.125 2023-10-04 04:28:16,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 04:28:16,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:28:21,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:28:21,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:28:21,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:28:22,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 04:28:25,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:27,102 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 04:28:28,275 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.956e+02 2.278e+02 2.648e+02 3.683e+02, threshold=4.557e+02, percent-clipped=0.0 2023-10-04 04:28:28,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 04:28:28,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:32,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:28:32,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 04:28:35,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:36,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:28:38,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:39,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:39,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:28:42,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:28:43,770 INFO [train.py:1046] (3/4) Epoch 44, batch 400, loss[loss=0.153, simple_loss=0.2407, pruned_loss=0.03271, over 24476.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2343, pruned_loss=0.03721, over 4091078.20 frames. ], batch size: 63, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:28:45,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:28:48,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:28:50,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 04:28:50,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:28:51,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:28:52,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:28:52,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:28:55,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:28:56,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:28:57,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1525480.0, ans=0.04949747468305833 2023-10-04 04:28:59,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 04:28:59,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1525546.6666666667, ans=0.1 2023-10-04 04:29:00,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 04:29:00,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:29:03,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 04:29:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:29:07,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:29:07,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:29:07,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 04:29:07,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:29:09,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:29:09,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:29:09,992 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.05 vs. limit=15.0 2023-10-04 04:29:10,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:29:10,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1525546.6666666667, ans=0.125 2023-10-04 04:29:11,250 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.08 vs. limit=15.0 2023-10-04 04:29:11,959 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 04:29:13,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 04:29:17,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:29:19,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:29:19,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 04:29:22,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 04:29:24,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:29:27,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:29:32,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 04:29:32,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1525680.0, ans=0.1 2023-10-04 04:29:35,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:29:36,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 04:29:38,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:29:40,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:29:40,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 04:29:43,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:29:44,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:29:47,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:29:50,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:29:50,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 04:29:53,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:29:54,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 04:29:55,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:29:55,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:29:59,136 INFO [train.py:1046] (3/4) Epoch 44, batch 450, loss[loss=0.1562, simple_loss=0.2375, pruned_loss=0.03748, over 24640.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2353, pruned_loss=0.03767, over 4222201.61 frames. ], batch size: 65, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:29:59,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 04:30:00,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:30:02,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:30:02,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:30:03,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 04:30:03,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:30:04,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:30:04,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:30:06,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 04:30:06,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:30:07,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:30:09,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:30:12,582 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.81 vs. limit=15.0 2023-10-04 04:30:14,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1525880.0, ans=0.125 2023-10-04 04:30:16,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:30:17,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:30:17,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 04:30:18,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 04:30:22,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:30:25,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:30:27,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:30:28,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1525946.6666666667, ans=0.0 2023-10-04 04:30:33,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:30:33,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:30:35,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 04:30:37,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 04:30:38,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 04:30:39,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:30:39,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:30:41,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:30:42,742 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 04:30:42,750 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 04:30:42,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:30:45,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:30:45,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 04:30:48,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:30:49,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:30:49,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 04:30:50,691 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.48 vs. limit=15.0 2023-10-04 04:30:51,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 04:30:52,485 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.80 vs. limit=6.0 2023-10-04 04:30:53,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:30:56,705 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.766e+02 2.172e+02 2.484e+02 2.956e+02 4.900e+02, threshold=4.968e+02, percent-clipped=2.0 2023-10-04 04:30:56,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:30:56,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:30:58,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 04:31:01,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:31:01,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 04:31:04,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 04:31:04,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:31:10,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:31:11,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1526146.6666666667, ans=0.125 2023-10-04 04:31:12,732 INFO [train.py:1046] (3/4) Epoch 44, batch 500, loss[loss=0.1629, simple_loss=0.2561, pruned_loss=0.03482, over 24576.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03807, over 4333637.27 frames. ], batch size: 71, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:31:12,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:31:12,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:31:13,479 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.32 vs. limit=15.0 2023-10-04 04:31:14,209 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 04:31:17,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:31:18,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:31:18,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:31:18,529 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 04:31:19,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 04:31:19,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:31:24,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:31:28,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 04:31:30,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:31:32,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:31:32,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:31:33,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:31:41,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:31:42,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:31:42,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:31:42,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:31:42,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 04:31:42,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:31:45,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:31:45,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1526280.0, ans=0.125 2023-10-04 04:31:46,338 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.23 vs. limit=15.0 2023-10-04 04:31:46,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:31:46,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:31:48,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:31:48,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 04:31:48,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1526280.0, ans=0.125 2023-10-04 04:31:51,542 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 04:31:56,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:31:57,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:31:58,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:31:58,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:32:00,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:32:02,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 04:32:05,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:32:06,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:07,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1526346.6666666667, ans=0.125 2023-10-04 04:32:08,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:32:09,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:32:09,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1526346.6666666667, ans=0.0 2023-10-04 04:32:15,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:32:17,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 04:32:18,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:18,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:32:21,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 04:32:21,830 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.44 vs. limit=22.5 2023-10-04 04:32:22,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:32:24,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:27,414 INFO [train.py:1046] (3/4) Epoch 44, batch 550, loss[loss=0.1503, simple_loss=0.2406, pruned_loss=0.02998, over 24648.00 frames. ], tot_loss[loss=0.1577, simple_loss=0.238, pruned_loss=0.0387, over 4410882.01 frames. ], batch size: 65, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:32:27,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1526480.0, ans=0.0 2023-10-04 04:32:30,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 04:32:32,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1526480.0, ans=0.125 2023-10-04 04:32:33,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 04:32:33,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:32:33,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 04:32:35,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:32:35,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:32:35,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:35,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1526480.0, ans=0.0 2023-10-04 04:32:36,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:36,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:32:37,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:32:39,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:32:40,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 04:32:40,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:32:46,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:32:46,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:46,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.71 vs. limit=22.5 2023-10-04 04:32:48,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:32:50,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:32:53,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 04:32:54,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 04:32:56,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:33:02,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:33:02,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:33:04,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:33:07,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:07,697 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 04:33:07,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:33:09,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 04:33:12,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:33:13,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:33:13,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:33:13,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:15,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 04:33:15,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1526680.0, ans=0.125 2023-10-04 04:33:16,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1526680.0, ans=0.2 2023-10-04 04:33:17,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 04:33:17,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:33:17,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:33:18,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:33:18,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:33:20,545 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:33:23,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:33:23,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:33:25,084 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.940e+02 2.267e+02 2.484e+02 4.509e+02, threshold=4.533e+02, percent-clipped=0.0 2023-10-04 04:33:26,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:33:27,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:27,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 04:33:28,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:33:30,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:33:31,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:33:31,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:33:32,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:33:32,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 04:33:35,588 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.00 vs. limit=6.0 2023-10-04 04:33:39,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 04:33:39,519 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:33:42,073 INFO [train.py:1046] (3/4) Epoch 44, batch 600, loss[loss=0.1571, simple_loss=0.2411, pruned_loss=0.03652, over 24491.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2379, pruned_loss=0.03833, over 4480376.56 frames. ], batch size: 63, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:33:43,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 04:33:43,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:33:43,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:33:44,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:33:50,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:33:52,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:33:56,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 04:33:57,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:33:58,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:34:01,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:34:04,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 04:34:04,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:34:10,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 04:34:12,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:34:12,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:34:12,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:34:17,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:34:19,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:34:19,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:34:26,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:34:31,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:34:31,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:34:31,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:34:35,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1527013.3333333333, ans=0.125 2023-10-04 04:34:38,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 04:34:43,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:34:44,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:34:46,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 04:34:47,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:34:50,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 04:34:50,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:34:51,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:34:55,828 INFO [train.py:1046] (3/4) Epoch 44, batch 650, loss[loss=0.1652, simple_loss=0.2315, pruned_loss=0.04948, over 23910.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03806, over 4530094.24 frames. ], batch size: 195, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:34:57,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 04:34:57,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 04:35:00,527 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.85 vs. limit=15.0 2023-10-04 04:35:01,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:35:01,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:35:02,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1527146.6666666667, ans=0.2 2023-10-04 04:35:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:05,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 04:35:07,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:35:13,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:35:13,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:35:16,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:35:20,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 04:35:21,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:35:23,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:35:25,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:35:25,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 04:35:28,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1527280.0, ans=0.05 2023-10-04 04:35:29,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:35:29,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:31,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:35:31,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:32,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:35:34,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:35:34,431 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 04:35:35,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:35:35,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:35:36,561 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.73 vs. limit=15.0 2023-10-04 04:35:37,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1527280.0, ans=0.1 2023-10-04 04:35:38,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:39,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:35:39,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:35:39,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:35:43,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 04:35:43,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:35:44,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:35:44,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:35:44,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:35:45,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:35:47,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 04:35:47,831 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.29 vs. limit=15.0 2023-10-04 04:35:48,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 04:35:48,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:48,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:35:48,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:35:50,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:35:51,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:35:54,091 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.998e+02 2.208e+02 2.441e+02 4.854e+02, threshold=4.416e+02, percent-clipped=2.0 2023-10-04 04:35:54,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1527413.3333333333, ans=0.05 2023-10-04 04:35:55,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1527413.3333333333, ans=0.1 2023-10-04 04:35:57,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:35:57,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:35:59,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:36:01,070 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:36:02,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:36:02,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 04:36:04,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:36:08,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1527413.3333333333, ans=0.0 2023-10-04 04:36:09,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:36:09,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:36:09,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:36:09,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:36:10,826 INFO [train.py:1046] (3/4) Epoch 44, batch 700, loss[loss=0.1471, simple_loss=0.2308, pruned_loss=0.03164, over 24524.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2353, pruned_loss=0.0376, over 4567633.76 frames. ], batch size: 63, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:36:14,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 04:36:15,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 04:36:18,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 04:36:18,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:36:21,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:36:23,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 04:36:28,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:36:31,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:36:31,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:36:33,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:36:33,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:36:33,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1527546.6666666667, ans=0.125 2023-10-04 04:36:36,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:36:37,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 04:36:37,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:36:39,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 04:36:40,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 04:36:46,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:36:46,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:36:49,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:36:52,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:36:52,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 04:36:57,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:36:58,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:36:58,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 04:37:03,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:37:05,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:37:07,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.82 vs. limit=10.0 2023-10-04 04:37:07,230 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.41 vs. limit=15.0 2023-10-04 04:37:08,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:37:13,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:37:13,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 04:37:16,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 04:37:16,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 04:37:19,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:37:21,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:37:21,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:37:23,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:37:23,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 04:37:25,159 INFO [train.py:1046] (3/4) Epoch 44, batch 750, loss[loss=0.1472, simple_loss=0.24, pruned_loss=0.02717, over 24472.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2347, pruned_loss=0.03705, over 4602995.11 frames. ], batch size: 66, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:37:27,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 04:37:27,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 04:37:28,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 04:37:29,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 04:37:30,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 04:37:31,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:37:33,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 04:37:34,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:37:34,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:37:36,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:37:37,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:37:37,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:37:37,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:37:40,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:37:40,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:37:42,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:37:45,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:37:45,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:37:45,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1527880.0, ans=0.125 2023-10-04 04:37:46,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 04:37:48,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:37:49,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:37:49,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1527880.0, ans=0.0 2023-10-04 04:37:52,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:37:55,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:37:55,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 04:37:55,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:37:56,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 04:37:56,543 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 04:37:57,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 04:37:57,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:37:57,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 04:38:01,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:38:04,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1527946.6666666667, ans=0.0 2023-10-04 04:38:06,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:38:08,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:08,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:38:10,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:38:11,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:11,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 04:38:13,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:38:13,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 04:38:14,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:38:17,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:38:17,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 04:38:19,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:23,159 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.912e+02 2.188e+02 2.458e+02 4.051e+02, threshold=4.377e+02, percent-clipped=0.0 2023-10-04 04:38:24,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:38:26,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:38:26,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:38:28,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:38:31,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 04:38:33,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:38:33,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:38:36,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1528080.0, ans=0.0 2023-10-04 04:38:37,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:38:37,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:38:40,337 INFO [train.py:1046] (3/4) Epoch 44, batch 800, loss[loss=0.1574, simple_loss=0.25, pruned_loss=0.03241, over 24350.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2352, pruned_loss=0.03692, over 4637151.79 frames. ], batch size: 74, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:38:40,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:40,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:38:47,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:38:47,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:47,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1528146.6666666667, ans=0.0 2023-10-04 04:38:50,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:38:50,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:38:51,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:51,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:38:53,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:38:56,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:38:57,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:38:58,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 04:39:00,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:01,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:39:01,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:39:01,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:39:03,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 04:39:03,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:39:04,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 04:39:08,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:39:09,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:39:12,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:39:12,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:39:14,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:14,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:17,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1528280.0, ans=0.125 2023-10-04 04:39:18,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:39:18,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:39:18,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 04:39:22,027 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 04:39:22,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 04:39:22,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:39:22,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:39:24,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:39:24,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:39:26,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1528346.6666666667, ans=0.125 2023-10-04 04:39:30,110 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 04:39:30,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 04:39:31,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:39:31,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1528346.6666666667, ans=0.1 2023-10-04 04:39:33,684 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.30 vs. limit=6.0 2023-10-04 04:39:34,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:39:38,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:39:40,249 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.46 vs. limit=15.0 2023-10-04 04:39:41,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:39:42,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 04:39:44,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:39:46,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 04:39:52,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:39:53,701 INFO [train.py:1046] (3/4) Epoch 44, batch 850, loss[loss=0.145, simple_loss=0.2333, pruned_loss=0.02833, over 24393.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2356, pruned_loss=0.03711, over 4657354.76 frames. ], batch size: 61, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:39:55,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:39:55,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 04:39:57,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:39:58,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:39:59,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 04:39:59,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:40:01,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:40:02,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:04,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:40:04,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1528480.0, ans=0.125 2023-10-04 04:40:05,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:40:05,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 04:40:07,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 04:40:07,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 04:40:09,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:40:10,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:40:12,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:12,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:40:12,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:40:16,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:40:16,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:40:16,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 04:40:20,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 04:40:23,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:40:25,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 04:40:27,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 04:40:29,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 04:40:30,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1528613.3333333333, ans=0.125 2023-10-04 04:40:32,622 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 04:40:32,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:40:32,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:40:32,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 04:40:34,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:34,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1528613.3333333333, ans=0.05 2023-10-04 04:40:35,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1528613.3333333333, ans=0.0 2023-10-04 04:40:37,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:37,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 04:40:39,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:40:41,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:40:41,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:40:41,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:40:44,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:40:45,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 04:40:45,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 04:40:49,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:40:49,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:40:50,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:40:50,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:40:51,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1528680.0, ans=0.125 2023-10-04 04:40:52,178 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 1.992e+02 2.228e+02 2.493e+02 3.552e+02, threshold=4.457e+02, percent-clipped=0.0 2023-10-04 04:40:52,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:40:53,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:40:53,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1528746.6666666667, ans=0.1 2023-10-04 04:40:56,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:40:57,377 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.77 vs. limit=22.5 2023-10-04 04:40:57,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:40:59,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:40:59,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:41:05,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 04:41:05,982 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.71 vs. limit=15.0 2023-10-04 04:41:08,426 INFO [train.py:1046] (3/4) Epoch 44, batch 900, loss[loss=0.2185, simple_loss=0.282, pruned_loss=0.07748, over 19407.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2368, pruned_loss=0.03735, over 4676365.07 frames. ], batch size: 388, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:41:08,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:41:08,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 04:41:08,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:41:09,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:41:11,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 04:41:17,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:41:20,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:41:20,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 04:41:23,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:41:23,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1528880.0, ans=0.125 2023-10-04 04:41:24,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 04:41:24,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 04:41:26,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:41:26,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:41:27,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 04:41:27,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:41:35,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:41:35,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:41:36,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:41:39,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:41:43,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 04:41:46,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:41:49,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:41:50,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:41:51,287 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.15 vs. limit=15.0 2023-10-04 04:41:51,691 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 04:41:53,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 04:41:58,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:41:58,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:42:00,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:42:07,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:07,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:42:08,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 04:42:08,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:42:10,884 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.43 vs. limit=15.0 2023-10-04 04:42:12,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 04:42:12,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1529080.0, ans=0.2 2023-10-04 04:42:13,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:42:14,075 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.96 vs. limit=15.0 2023-10-04 04:42:15,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:17,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:42:17,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:42:21,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 04:42:21,781 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 04:42:23,062 INFO [train.py:1046] (3/4) Epoch 44, batch 950, loss[loss=0.1578, simple_loss=0.2256, pruned_loss=0.045, over 23816.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2362, pruned_loss=0.03707, over 4697197.50 frames. ], batch size: 195, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:42:23,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 04:42:23,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 04:42:24,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:26,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 04:42:33,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:42:36,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:42:37,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:42:37,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:42:39,040 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 04:42:42,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:42:44,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:42:45,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:42:45,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:42:45,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 04:42:48,005 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.43 vs. limit=12.0 2023-10-04 04:42:48,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:42:48,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:42:49,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1529213.3333333333, ans=0.0 2023-10-04 04:42:50,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 04:42:50,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:42:54,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:42:54,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:42:54,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:42:55,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 04:42:58,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 04:43:00,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:43:01,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:43:06,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:43:06,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:43:06,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1529346.6666666667, ans=0.125 2023-10-04 04:43:09,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 04:43:12,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 04:43:12,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 04:43:12,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:43:14,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:43:14,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:43:16,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1529346.6666666667, ans=0.1 2023-10-04 04:43:18,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 04:43:19,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:43:20,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:43:22,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:43:22,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 04:43:23,473 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.090e+02 2.361e+02 2.661e+02 3.886e+02, threshold=4.722e+02, percent-clipped=0.0 2023-10-04 04:43:23,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:43:23,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:43:23,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1529413.3333333333, ans=0.1 2023-10-04 04:43:24,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 04:43:27,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:43:30,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:43:33,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:43:34,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 04:43:34,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 04:43:37,336 INFO [train.py:1046] (3/4) Epoch 44, batch 1000, loss[loss=0.1767, simple_loss=0.2551, pruned_loss=0.04919, over 23610.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2359, pruned_loss=0.03745, over 4693587.37 frames. ], batch size: 85, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:43:39,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:43:42,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 04:43:44,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:43:47,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:43:49,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 04:43:49,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 04:43:50,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.62 vs. limit=15.0 2023-10-04 04:43:54,430 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.51 vs. limit=12.0 2023-10-04 04:43:54,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:43:54,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:43:55,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:43:55,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1529546.6666666667, ans=0.1 2023-10-04 04:43:55,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1529546.6666666667, ans=0.125 2023-10-04 04:43:57,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 04:44:01,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 04:44:03,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 04:44:04,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:44:05,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 04:44:07,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 04:44:07,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 04:44:08,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:44:08,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:17,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:44:17,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:44:18,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:20,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:44:20,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 04:44:20,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:44:21,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:44:21,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:44:21,785 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 04:44:24,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 04:44:25,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 04:44:28,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 04:44:29,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:44:36,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:36,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:44:36,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:38,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:44:40,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 04:44:40,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:44:42,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 04:44:42,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 04:44:44,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:44:44,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:44:45,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:44:48,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:44:50,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:44:52,201 INFO [train.py:1046] (3/4) Epoch 44, batch 1050, loss[loss=0.1613, simple_loss=0.2321, pruned_loss=0.04521, over 23820.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.234, pruned_loss=0.03729, over 4697749.79 frames. ], batch size: 212, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:44:52,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:44:53,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:44:53,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1529813.3333333333, ans=0.125 2023-10-04 04:44:54,472 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.78 vs. limit=15.0 2023-10-04 04:44:55,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:44:56,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:44:59,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:45:01,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 04:45:02,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1529813.3333333333, ans=0.2 2023-10-04 04:45:03,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 04:45:06,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:45:07,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:45:07,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:45:08,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:45:08,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 04:45:10,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:45:11,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 04:45:12,187 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.68 vs. limit=12.0 2023-10-04 04:45:13,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:45:13,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 04:45:13,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 04:45:21,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:45:21,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:45:21,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:45:25,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 04:45:25,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 04:45:25,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:45:29,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 04:45:29,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1529946.6666666667, ans=0.125 2023-10-04 04:45:31,668 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.05 vs. limit=6.0 2023-10-04 04:45:32,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 04:45:32,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:45:36,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 04:45:37,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 04:45:37,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:45:39,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:45:40,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1530013.3333333333, ans=0.125 2023-10-04 04:45:42,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:45:45,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 04:45:47,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 04:45:49,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 04:45:49,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:45:49,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:45:50,481 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 2.023e+02 2.342e+02 2.796e+02 4.637e+02, threshold=4.684e+02, percent-clipped=0.0 2023-10-04 04:45:50,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 04:45:53,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1530080.0, ans=0.125 2023-10-04 04:45:53,762 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:45:54,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:45:56,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:45:56,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:45:56,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:45:56,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:00,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1530080.0, ans=0.0 2023-10-04 04:46:01,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:01,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 04:46:01,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 04:46:02,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1530080.0, ans=0.0 2023-10-04 04:46:03,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 04:46:03,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 04:46:04,598 INFO [train.py:1046] (3/4) Epoch 44, batch 1100, loss[loss=0.1551, simple_loss=0.2493, pruned_loss=0.03044, over 24672.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2335, pruned_loss=0.03737, over 4691313.73 frames. ], batch size: 73, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:46:04,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:46:07,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:46:11,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:46:18,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:46:19,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 04:46:19,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:46:19,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 04:46:21,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:46:23,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 04:46:27,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:46:29,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:46:30,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 04:46:31,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 04:46:32,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:46:32,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:46:35,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:46:36,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 04:46:39,021 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.12 vs. limit=22.5 2023-10-04 04:46:41,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:46:44,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 04:46:45,056 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 04:46:47,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:48,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:49,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:46:50,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:46:52,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 04:46:53,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:46:53,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 04:46:53,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:46:53,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:46:53,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 04:47:00,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:47:00,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 04:47:01,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:47:04,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:47:07,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 04:47:07,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 04:47:07,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1530413.3333333333, ans=0.125 2023-10-04 04:47:09,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:47:10,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:47:11,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:47:13,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 04:47:13,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:47:14,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:47:15,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 04:47:16,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:47:16,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 04:47:18,681 INFO [train.py:1046] (3/4) Epoch 44, batch 1150, loss[loss=0.1328, simple_loss=0.2123, pruned_loss=0.02669, over 24577.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2345, pruned_loss=0.03716, over 4720026.52 frames. ], batch size: 60, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:47:18,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:47:18,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:47:19,756 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.74 vs. limit=15.0 2023-10-04 04:47:20,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:47:24,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:47:26,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:47:27,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:47:28,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:47:29,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 04:47:30,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:47:33,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 04:47:34,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:47:34,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:47:34,899 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:47:38,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 04:47:40,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:47:44,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:47:44,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:47:46,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 04:47:46,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:47:46,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:47:49,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 04:47:51,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:47:53,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:47:58,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1530613.3333333333, ans=0.2 2023-10-04 04:48:00,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1530613.3333333333, ans=0.05 2023-10-04 04:48:01,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:48:08,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:48:08,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 04:48:08,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:09,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:14,045 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 04:48:15,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:17,279 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 1.987e+02 2.114e+02 2.395e+02 3.524e+02, threshold=4.229e+02, percent-clipped=0.0 2023-10-04 04:48:22,968 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 04:48:23,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1530746.6666666667, ans=0.1 2023-10-04 04:48:27,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:48:29,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:48:29,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 04:48:29,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:48:31,941 INFO [train.py:1046] (3/4) Epoch 44, batch 1200, loss[loss=0.1409, simple_loss=0.2274, pruned_loss=0.0272, over 24480.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2354, pruned_loss=0.03743, over 4713250.87 frames. ], batch size: 63, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:48:32,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:48:37,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 04:48:37,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:48:38,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:48:38,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:48:39,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:48:40,673 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 04:48:41,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:48:42,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1530813.3333333333, ans=0.0 2023-10-04 04:48:44,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:48:45,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:48:45,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:48:49,337 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 04:48:51,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 04:48:55,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:48:57,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.11 vs. limit=22.5 2023-10-04 04:48:58,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:48:58,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1530880.0, ans=0.0 2023-10-04 04:48:59,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1530880.0, ans=0.125 2023-10-04 04:49:00,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:49:01,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:49:01,550 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 04:49:02,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:49:04,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1530946.6666666667, ans=0.125 2023-10-04 04:49:10,708 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.75 vs. limit=6.0 2023-10-04 04:49:11,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 04:49:11,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:49:11,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 04:49:12,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:49:15,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 04:49:18,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 04:49:18,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:49:18,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:49:19,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1531013.3333333333, ans=0.1 2023-10-04 04:49:20,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:49:22,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 04:49:22,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:49:23,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:49:23,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:49:24,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 04:49:24,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:49:25,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:49:26,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 04:49:30,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:49:30,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:49:33,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:49:36,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:49:36,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 04:49:40,826 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 04:49:43,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:49:44,776 INFO [train.py:1046] (3/4) Epoch 44, batch 1250, loss[loss=0.1332, simple_loss=0.2187, pruned_loss=0.0239, over 24607.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2364, pruned_loss=0.03757, over 4718617.85 frames. ], batch size: 60, lr: 2.32e-03, grad_scale: 16.0 2023-10-04 04:49:44,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:49:46,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:49:48,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:49:51,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 04:49:55,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:49:56,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:49:57,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 04:49:58,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:50:00,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:50:03,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 04:50:04,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:50:06,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:50:06,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:50:07,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 04:50:10,030 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.13 vs. limit=15.0 2023-10-04 04:50:10,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 04:50:11,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:50:11,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:50:12,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:50:13,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:16,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:50:17,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 04:50:21,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 04:50:23,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 04:50:25,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:50:27,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 04:50:27,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:50:27,188 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 04:50:28,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:28,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:32,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:50:33,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:50:33,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:50:35,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 04:50:35,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 04:50:37,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 04:50:40,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:50:41,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 04:50:41,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:50:45,652 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.030e+02 2.263e+02 2.727e+02 4.243e+02, threshold=4.525e+02, percent-clipped=1.0 2023-10-04 04:50:45,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 04:50:45,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:50:46,294 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.26 vs. limit=12.0 2023-10-04 04:50:49,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 04:50:49,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 04:50:50,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:50:50,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 04:50:50,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:50:52,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 04:50:54,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:50:56,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:50:57,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:51:00,133 INFO [train.py:1046] (3/4) Epoch 44, batch 1300, loss[loss=0.1403, simple_loss=0.2104, pruned_loss=0.03505, over 23611.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2366, pruned_loss=0.03781, over 4703687.23 frames. ], batch size: 256, lr: 2.32e-03, grad_scale: 8.0 2023-10-04 04:51:00,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 04:51:03,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:51:04,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 04:51:07,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:51:10,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 04:51:12,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:51:14,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:51:14,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 04:51:16,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 04:51:19,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:51:20,260 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.34 vs. limit=15.0 2023-10-04 04:51:20,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:51:22,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 04:51:24,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:51:28,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:51:28,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:51:29,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:51:32,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:51:32,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 04:51:33,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 04:51:33,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 04:51:39,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:51:39,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 04:51:41,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 04:51:41,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 04:51:42,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:51:44,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:51:46,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 04:51:46,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:51:46,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 04:51:49,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:51:53,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:51:53,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:51:57,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 04:51:58,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 04:51:58,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 04:52:02,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:52:05,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 04:52:07,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:52:10,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1531746.6666666667, ans=0.125 2023-10-04 04:52:12,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1531813.3333333333, ans=0.0 2023-10-04 04:52:14,027 INFO [train.py:1046] (3/4) Epoch 44, batch 1350, loss[loss=0.158, simple_loss=0.2467, pruned_loss=0.03463, over 24411.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.236, pruned_loss=0.03733, over 4717210.22 frames. ], batch size: 69, lr: 2.32e-03, grad_scale: 4.0 2023-10-04 04:52:14,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 04:52:14,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1531813.3333333333, ans=0.0 2023-10-04 04:52:16,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:52:18,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:52:21,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:52:22,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:52:24,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:52:24,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:52:27,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 04:52:28,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 04:52:30,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:52:30,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:52:33,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 04:52:35,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:52:36,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:52:36,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 04:52:39,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 04:52:41,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 04:52:43,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:52:43,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 04:52:43,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1531946.6666666667, ans=0.125 2023-10-04 04:52:46,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1531946.6666666667, ans=0.1 2023-10-04 04:52:55,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:53:04,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:53:04,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:53:04,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 04:53:06,062 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.86 vs. limit=15.0 2023-10-04 04:53:08,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:53:10,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 04:53:10,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 04:53:10,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:53:11,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:53:13,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 04:53:14,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 04:53:16,249 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.909e+02 2.111e+02 2.476e+02 3.345e+02, threshold=4.221e+02, percent-clipped=0.0 2023-10-04 04:53:18,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1532080.0, ans=0.0 2023-10-04 04:53:19,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 04:53:20,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 04:53:27,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 04:53:28,411 INFO [train.py:1046] (3/4) Epoch 44, batch 1400, loss[loss=0.1512, simple_loss=0.2358, pruned_loss=0.03329, over 23540.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2348, pruned_loss=0.037, over 4708868.88 frames. ], batch size: 120, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:53:28,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:53:30,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=1532146.6666666667, ans=22.5 2023-10-04 04:53:32,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:53:32,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:53:37,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 04:53:37,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1532146.6666666667, ans=0.125 2023-10-04 04:53:38,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 04:53:49,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:53:51,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:53:54,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:53:54,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 04:53:57,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:53:59,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 04:54:06,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:06,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:07,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1532280.0, ans=0.0 2023-10-04 04:54:11,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 04:54:12,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:54:12,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:54:14,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:54:14,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:54:17,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 04:54:17,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:54:17,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:54:18,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 04:54:19,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:54:22,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1532346.6666666667, ans=0.125 2023-10-04 04:54:24,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:29,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:54:33,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1532413.3333333333, ans=0.1 2023-10-04 04:54:34,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 04:54:35,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1532413.3333333333, ans=0.125 2023-10-04 04:54:36,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 04:54:36,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:54:36,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1532413.3333333333, ans=0.1 2023-10-04 04:54:37,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1532413.3333333333, ans=0.0 2023-10-04 04:54:38,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 04:54:40,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:54:41,379 INFO [train.py:1046] (3/4) Epoch 44, batch 1450, loss[loss=0.1387, simple_loss=0.2121, pruned_loss=0.03262, over 23794.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2343, pruned_loss=0.03696, over 4714998.26 frames. ], batch size: 212, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:54:41,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:54:46,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 04:54:49,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:54:49,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:49,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 04:54:53,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:54:53,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 04:54:56,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:54:56,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 04:54:57,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 04:54:57,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 04:54:59,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:54:59,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:54:59,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 04:55:03,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:55:03,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 04:55:04,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 04:55:04,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:55:04,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:55:05,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:55:07,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:55:10,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:55:10,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:55:13,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:55:13,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:55:14,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:55:14,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 04:55:14,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:55:16,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:55:19,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 04:55:22,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:55:26,224 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 04:55:26,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:55:29,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 04:55:29,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1532680.0, ans=0.125 2023-10-04 04:55:30,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:55:30,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 04:55:36,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:55:37,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.11 vs. limit=10.0 2023-10-04 04:55:38,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 04:55:39,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 04:55:41,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:55:43,946 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.952e+02 2.155e+02 2.479e+02 3.753e+02, threshold=4.311e+02, percent-clipped=0.0 2023-10-04 04:55:44,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:55:44,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:55:46,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 04:55:49,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 04:55:49,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 04:55:50,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:55:52,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 04:55:56,245 INFO [train.py:1046] (3/4) Epoch 44, batch 1500, loss[loss=0.1513, simple_loss=0.2328, pruned_loss=0.03488, over 23243.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2347, pruned_loss=0.03694, over 4721661.88 frames. ], batch size: 119, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:56:03,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 04:56:03,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 04:56:03,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:56:03,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:56:04,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:56:05,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 04:56:07,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 04:56:08,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 04:56:08,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 04:56:08,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:56:10,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:56:11,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1532880.0, ans=0.0 2023-10-04 04:56:13,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:56:13,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:56:16,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1532880.0, ans=0.0 2023-10-04 04:56:17,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:56:17,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 04:56:19,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:56:19,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:56:21,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:56:23,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 04:56:24,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1532880.0, ans=0.125 2023-10-04 04:56:28,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 04:56:29,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:56:29,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 04:56:34,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 04:56:36,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:56:37,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:56:37,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:56:37,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1532946.6666666667, ans=0.125 2023-10-04 04:56:38,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 04:56:40,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:56:40,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:56:40,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 04:56:41,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:56:47,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 04:56:47,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 04:56:50,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 04:56:52,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 04:56:55,538 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 04:56:55,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:56:55,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1533080.0, ans=0.2 2023-10-04 04:56:56,838 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 04:56:58,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:56:58,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:56:59,652 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 04:56:59,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 04:57:03,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 04:57:04,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:57:07,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:57:07,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:57:07,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1533080.0, ans=0.0 2023-10-04 04:57:08,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:57:08,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:57:08,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 04:57:11,065 INFO [train.py:1046] (3/4) Epoch 44, batch 1550, loss[loss=0.155, simple_loss=0.2316, pruned_loss=0.03925, over 23705.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2353, pruned_loss=0.03684, over 4720908.67 frames. ], batch size: 179, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 04:57:11,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 04:57:11,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 04:57:12,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:57:12,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 04:57:12,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 04:57:14,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1533146.6666666667, ans=0.125 2023-10-04 04:57:15,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:57:16,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:57:18,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:57:18,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 04:57:20,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:57:21,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:57:23,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1533146.6666666667, ans=0.125 2023-10-04 04:57:24,670 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 04:57:24,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:57:24,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 04:57:25,241 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.14 vs. limit=22.5 2023-10-04 04:57:26,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 04:57:28,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 04:57:28,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 04:57:31,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:57:31,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 04:57:32,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 04:57:32,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 04:57:32,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:57:35,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:57:38,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:57:40,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 04:57:40,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 04:57:49,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:57:49,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1533280.0, ans=0.2 2023-10-04 04:57:52,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:57:52,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 04:57:52,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:57:52,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 04:57:58,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 04:57:59,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1533346.6666666667, ans=0.125 2023-10-04 04:58:00,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:58:00,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1533346.6666666667, ans=0.0 2023-10-04 04:58:03,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 04:58:06,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:58:07,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:58:07,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 04:58:07,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:58:09,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 04:58:09,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:58:10,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 04:58:10,916 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 04:58:13,598 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.992e+02 2.204e+02 2.459e+02 3.860e+02, threshold=4.408e+02, percent-clipped=0.0 2023-10-04 04:58:13,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:58:19,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 04:58:19,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1533413.3333333333, ans=0.0 2023-10-04 04:58:23,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:58:23,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1533480.0, ans=0.05 2023-10-04 04:58:24,644 INFO [train.py:1046] (3/4) Epoch 44, batch 1600, loss[loss=0.1463, simple_loss=0.2298, pruned_loss=0.0314, over 24509.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2361, pruned_loss=0.0369, over 4723297.45 frames. ], batch size: 63, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 04:58:24,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:58:24,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 04:58:26,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 04:58:28,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:58:28,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:58:28,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 04:58:28,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 04:58:32,228 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.41 vs. limit=15.0 2023-10-04 04:58:33,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:58:33,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 04:58:33,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 04:58:35,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 04:58:37,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:58:41,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 04:58:42,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:58:43,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:58:47,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 04:58:48,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1533546.6666666667, ans=0.125 2023-10-04 04:58:49,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 04:58:52,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 04:58:52,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 04:58:52,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1533546.6666666667, ans=0.125 2023-10-04 04:58:53,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:58:53,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 04:58:55,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1533613.3333333333, ans=0.0 2023-10-04 04:58:57,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1533613.3333333333, ans=0.125 2023-10-04 04:58:59,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 04:59:06,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:59:08,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 04:59:09,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 04:59:09,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:59:09,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 04:59:12,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 04:59:16,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 04:59:19,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 04:59:19,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:19,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:19,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 04:59:20,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 04:59:21,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1533680.0, ans=0.125 2023-10-04 04:59:22,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 04:59:23,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 04:59:23,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1533746.6666666667, ans=0.125 2023-10-04 04:59:28,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:28,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1533746.6666666667, ans=0.0 2023-10-04 04:59:30,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 04:59:31,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1533746.6666666667, ans=0.125 2023-10-04 04:59:32,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 04:59:32,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 04:59:34,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 04:59:38,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:59:40,519 INFO [train.py:1046] (3/4) Epoch 44, batch 1650, loss[loss=0.1705, simple_loss=0.2522, pruned_loss=0.04434, over 24475.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2366, pruned_loss=0.03719, over 4722421.78 frames. ], batch size: 66, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 04:59:42,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 04:59:43,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 04:59:43,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1533813.3333333333, ans=0.0 2023-10-04 04:59:44,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 04:59:44,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 04:59:44,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 04:59:44,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 04:59:47,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1533813.3333333333, ans=0.035 2023-10-04 04:59:48,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 04:59:48,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:59:50,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 04:59:50,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 04:59:52,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 04:59:54,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 04:59:57,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 04:59:57,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 04:59:57,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 04:59:57,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 04:59:58,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 04:59:58,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 04:59:58,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1533880.0, ans=0.125 2023-10-04 05:00:02,367 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.77 vs. limit=15.0 2023-10-04 05:00:05,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:00:06,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:00:15,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 05:00:18,140 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.23 vs. limit=15.0 2023-10-04 05:00:18,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:19,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 05:00:21,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:00:24,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:00:24,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:00:24,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:00:25,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:00:26,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:29,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:00:30,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:30,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:00:30,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:00:32,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:00:32,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:00:33,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1534013.3333333333, ans=0.125 2023-10-04 05:00:36,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:00:37,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 05:00:39,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:00:39,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 05:00:40,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 05:00:42,346 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.929e+02 2.067e+02 2.263e+02 2.881e+02, threshold=4.133e+02, percent-clipped=0.0 2023-10-04 05:00:42,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 05:00:42,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:00:42,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:00:43,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:00:43,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:00:43,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 05:00:47,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:00:50,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:00:50,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:00:50,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1534080.0, ans=0.2 2023-10-04 05:00:51,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 05:00:54,167 INFO [train.py:1046] (3/4) Epoch 44, batch 1700, loss[loss=0.1601, simple_loss=0.2421, pruned_loss=0.03902, over 24645.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2358, pruned_loss=0.03701, over 4726587.76 frames. ], batch size: 68, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:00:57,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:00:57,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:00:57,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 05:00:58,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:00:58,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:00:58,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:00:59,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:00:59,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:00:59,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 05:01:03,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.10 vs. limit=15.0 2023-10-04 05:01:04,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:01:11,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:01:14,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:01:18,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:01:18,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:01:20,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:01:20,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:01:22,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 05:01:23,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1534280.0, ans=0.0 2023-10-04 05:01:25,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:01:25,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:25,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1534280.0, ans=0.125 2023-10-04 05:01:27,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:01:27,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:01:30,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 05:01:31,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 05:01:33,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:34,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 05:01:34,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:01:42,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:01:43,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:01:43,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1534346.6666666667, ans=0.125 2023-10-04 05:01:44,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:01:44,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1534346.6666666667, ans=0.125 2023-10-04 05:01:46,301 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:01:47,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:01:47,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 05:01:47,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:01:48,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:48,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 05:01:49,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1534346.6666666667, ans=0.0 2023-10-04 05:01:50,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:01:50,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:01:50,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:01:50,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:01:51,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1534346.6666666667, ans=0.125 2023-10-04 05:01:52,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:01:52,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:01:54,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:01:54,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:01:54,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:01:55,174 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.02 vs. limit=6.0 2023-10-04 05:01:58,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:01:59,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 05:02:03,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:02:04,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:02:06,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 05:02:09,458 INFO [train.py:1046] (3/4) Epoch 44, batch 1750, loss[loss=0.1673, simple_loss=0.2518, pruned_loss=0.04144, over 23424.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2339, pruned_loss=0.03674, over 4710638.82 frames. ], batch size: 93, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:02:12,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:15,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1534480.0, ans=0.0 2023-10-04 05:02:16,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:02:17,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:02:17,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 05:02:18,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:02:19,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:02:19,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:22,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 05:02:24,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1534546.6666666667, ans=0.125 2023-10-04 05:02:25,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:02:26,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 05:02:26,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:02:28,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:02:30,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 05:02:32,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 05:02:34,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:02:35,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 05:02:43,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:02:45,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:02:45,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:02:48,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:48,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:02:51,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:02:51,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:02:54,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:02:54,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:02:55,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 05:02:58,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:03:01,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 05:03:01,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1534680.0, ans=0.125 2023-10-04 05:03:02,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:03:04,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:03:05,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:03:09,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:03:09,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 05:03:11,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:03:12,441 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 2.076e+02 2.414e+02 2.954e+02 4.467e+02, threshold=4.828e+02, percent-clipped=1.0 2023-10-04 05:03:13,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:03:17,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:03:17,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1534746.6666666667, ans=0.0 2023-10-04 05:03:18,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:03:19,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:03:21,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 05:03:21,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:03:22,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:03:22,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:22,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:03:22,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:03:24,017 INFO [train.py:1046] (3/4) Epoch 44, batch 1800, loss[loss=0.15, simple_loss=0.2239, pruned_loss=0.03807, over 23615.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2329, pruned_loss=0.03673, over 4696394.10 frames. ], batch size: 256, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:03:24,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:03:26,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:03:28,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:03:29,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:03:31,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:03:34,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1534813.3333333333, ans=0.5 2023-10-04 05:03:35,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:03:35,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:03:38,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:03:41,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:41,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:41,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:03:45,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:03:45,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 05:03:45,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:03:48,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:03:51,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 05:03:53,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 05:03:54,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 05:03:54,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:03:54,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:03:54,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:03:55,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:04:00,160 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 05:04:02,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:04:03,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1534946.6666666667, ans=0.0 2023-10-04 05:04:04,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:04:05,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.11 vs. limit=15.0 2023-10-04 05:04:06,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 05:04:07,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 05:04:07,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:04:09,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:04:11,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:04:15,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 05:04:19,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1535013.3333333333, ans=0.025 2023-10-04 05:04:19,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1535013.3333333333, ans=0.0 2023-10-04 05:04:21,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:04:22,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 05:04:22,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:04:22,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:04:24,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:04:24,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 05:04:28,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:04:28,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:04:29,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 05:04:29,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:04:31,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:04:31,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:04:31,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:04:32,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:04:32,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:04:35,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:04:35,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:04:36,683 INFO [train.py:1046] (3/4) Epoch 44, batch 1850, loss[loss=0.1686, simple_loss=0.2445, pruned_loss=0.04639, over 23369.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2339, pruned_loss=0.03692, over 4701544.11 frames. ], batch size: 285, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:04:38,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:04:40,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:04:44,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1535146.6666666667, ans=0.125 2023-10-04 05:04:46,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:04:46,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 05:04:49,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1535146.6666666667, ans=0.5 2023-10-04 05:04:50,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 05:04:53,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 05:04:58,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:04:59,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 05:04:59,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 05:05:06,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1535280.0, ans=0.125 2023-10-04 05:05:08,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:05:09,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 05:05:11,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:05:13,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:05:15,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1535280.0, ans=0.125 2023-10-04 05:05:17,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 05:05:17,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:17,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:05:18,171 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:05:19,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:05:21,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1535346.6666666667, ans=0.125 2023-10-04 05:05:22,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:05:23,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:05:26,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:05:26,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:26,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 05:05:26,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:05:29,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:05:31,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:05:33,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 05:05:34,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:05:37,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:05:38,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:05:38,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 05:05:38,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 05:05:39,922 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.019e+02 2.202e+02 2.668e+02 3.594e+02, threshold=4.403e+02, percent-clipped=0.0 2023-10-04 05:05:41,957 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 05:05:43,846 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 05:05:45,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:05:45,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:05:45,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:05:46,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:47,189 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 05:05:47,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:05:47,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:48,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:05:49,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:05:51,017 INFO [train.py:1046] (3/4) Epoch 44, batch 1900, loss[loss=0.1643, simple_loss=0.2527, pruned_loss=0.03793, over 24441.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2343, pruned_loss=0.037, over 4704640.88 frames. ], batch size: 69, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:05:51,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:05:51,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 05:05:52,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:05:52,635 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 05:05:53,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:05:53,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:05:59,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:06:01,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:06:01,562 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.58 vs. limit=10.0 2023-10-04 05:06:02,346 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 05:06:03,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 05:06:04,453 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.04 vs. limit=22.5 2023-10-04 05:06:04,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:06:05,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:06:05,052 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 05:06:06,307 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 05:06:09,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 05:06:10,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:06:15,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 05:06:17,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 05:06:26,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 05:06:28,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1535613.3333333333, ans=0.0 2023-10-04 05:06:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 05:06:29,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:06:29,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1535613.3333333333, ans=0.2 2023-10-04 05:06:31,026 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 05:06:31,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 05:06:31,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 05:06:31,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 05:06:31,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:06:36,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 05:06:39,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:06:43,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:06:43,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 05:06:45,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:06:49,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 05:06:49,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:06:57,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:06:57,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:06:57,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:06:58,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:06:59,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1535746.6666666667, ans=0.0 2023-10-04 05:07:00,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:07:00,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:07:01,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:07:02,900 INFO [train.py:1046] (3/4) Epoch 44, batch 1950, loss[loss=0.1503, simple_loss=0.2256, pruned_loss=0.03751, over 23645.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.235, pruned_loss=0.03708, over 4710937.61 frames. ], batch size: 149, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:07:04,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:07:04,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:07:05,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:07:05,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:07:06,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:07:07,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:07:10,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:07:11,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:07:11,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:11,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:07:15,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 05:07:15,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 05:07:17,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:17,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:19,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:07:20,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:07:20,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:22,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:07:25,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:07:25,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:07:25,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:07:25,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:28,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:30,335 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.10 vs. limit=22.5 2023-10-04 05:07:32,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:07:32,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:07:32,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:07:32,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 05:07:32,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:07:32,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:07:33,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:07:36,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:07:39,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:07:43,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:07:46,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1536013.3333333333, ans=0.0 2023-10-04 05:07:46,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1536013.3333333333, ans=0.1 2023-10-04 05:07:47,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:07:47,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:07:47,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 05:07:49,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:07:52,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.72 vs. limit=22.5 2023-10-04 05:07:53,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:07:53,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:07:54,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:08:00,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:03,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:06,094 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.020e+02 2.252e+02 2.597e+02 3.985e+02, threshold=4.504e+02, percent-clipped=0.0 2023-10-04 05:08:06,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:07,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:08:10,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:08:10,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:08:10,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 05:08:10,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:08:11,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:08:11,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 05:08:15,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:08:16,706 INFO [train.py:1046] (3/4) Epoch 44, batch 2000, loss[loss=0.2032, simple_loss=0.2779, pruned_loss=0.06427, over 19911.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2368, pruned_loss=0.03784, over 4700228.47 frames. ], batch size: 388, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:08:18,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:08:19,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:08:19,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:08:20,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:08:21,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1536146.6666666667, ans=0.0 2023-10-04 05:08:23,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:08:23,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1536146.6666666667, ans=0.125 2023-10-04 05:08:26,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 05:08:27,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:08:29,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:08:32,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 05:08:32,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:08:32,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:08:36,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:08:36,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1536213.3333333333, ans=0.125 2023-10-04 05:08:38,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 05:08:39,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:40,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:42,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:43,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 05:08:43,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:08:46,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 05:08:46,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:08:47,447 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.04 vs. limit=15.0 2023-10-04 05:08:48,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:08:49,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 05:08:49,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:08:49,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1536280.0, ans=0.125 2023-10-04 05:08:50,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:08:51,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:08:52,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 05:08:54,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 05:08:54,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:08:55,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:08:59,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:01,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:09:01,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:09:01,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:09:03,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:09:03,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:03,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:09:03,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:05,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:07,279 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.89 vs. limit=15.0 2023-10-04 05:09:08,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:09:09,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 05:09:14,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:09:16,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:20,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:20,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:09:20,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1536413.3333333333, ans=0.07 2023-10-04 05:09:23,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:24,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:09:24,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:26,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:09:27,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:09:28,187 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.97 vs. limit=12.0 2023-10-04 05:09:28,917 INFO [train.py:1046] (3/4) Epoch 44, batch 2050, loss[loss=0.1432, simple_loss=0.2044, pruned_loss=0.04097, over 22669.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2364, pruned_loss=0.03783, over 4693109.63 frames. ], batch size: 322, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:09:28,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:30,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:31,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:09:31,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:34,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1536480.0, ans=0.125 2023-10-04 05:09:37,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:09:38,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:09:40,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:09:41,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:09:43,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 05:09:43,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:09:45,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:09:46,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:09:55,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:09:55,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:09:57,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 05:10:00,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:10:02,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 05:10:02,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:10:04,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:10:07,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:10:08,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:10:09,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:10:10,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:10:10,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:10:10,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:10:15,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:10:16,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:10:18,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:10:18,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:10:23,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:10:28,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:10:30,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 05:10:33,074 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 1.975e+02 2.184e+02 2.529e+02 3.912e+02, threshold=4.369e+02, percent-clipped=0.0 2023-10-04 05:10:34,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:10:35,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:10:37,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:10:38,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 05:10:41,471 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 05:10:41,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:10:41,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1536813.3333333333, ans=0.125 2023-10-04 05:10:43,308 INFO [train.py:1046] (3/4) Epoch 44, batch 2100, loss[loss=0.1456, simple_loss=0.229, pruned_loss=0.03112, over 24482.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2346, pruned_loss=0.03738, over 4689450.44 frames. ], batch size: 63, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:10:43,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:10:44,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:10:44,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1536813.3333333333, ans=0.05 2023-10-04 05:10:45,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:10:45,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 05:10:45,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 05:10:46,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:10:48,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1536813.3333333333, ans=0.1 2023-10-04 05:10:49,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:10:50,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:10:53,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:10:53,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:10:53,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 05:10:55,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:10:55,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 05:10:55,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 05:10:58,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:10:58,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:10:58,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 05:11:00,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 05:11:04,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 05:11:04,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:11:07,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:11:08,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:11:12,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:11:12,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 05:11:14,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:14,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 05:11:16,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 05:11:17,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:17,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 05:11:17,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 05:11:19,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 05:11:19,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:11:20,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:11:24,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:11:25,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:11:27,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:11:28,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:28,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 05:11:28,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:28,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:30,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:11:30,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 05:11:31,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 05:11:32,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 05:11:36,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:11:38,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1537013.3333333333, ans=0.0 2023-10-04 05:11:39,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:11:39,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 05:11:45,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:48,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:11:49,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:11:49,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:11:49,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 05:11:50,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:11:51,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:11:51,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:11:51,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:11:51,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:11:52,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1537080.0, ans=0.0 2023-10-04 05:11:54,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 05:11:54,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 05:11:54,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:11:57,863 INFO [train.py:1046] (3/4) Epoch 44, batch 2150, loss[loss=0.1543, simple_loss=0.2312, pruned_loss=0.03872, over 23353.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2345, pruned_loss=0.03707, over 4703955.43 frames. ], batch size: 134, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:11:57,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:11:57,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:11:58,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:11:59,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:12:02,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 05:12:04,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:12:06,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:08,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:12:08,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:08,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:12:11,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:11,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:12:11,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:12:16,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:16,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 05:12:20,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:12:22,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:12:24,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:24,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:12:24,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:24,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:12:25,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:12:25,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:12:27,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:12:28,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 05:12:30,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:12:31,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:33,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:12:33,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:12:34,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:12:35,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:12:35,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:12:37,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:12:37,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 05:12:38,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:12:41,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:12:42,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:44,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:12:46,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:12:47,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:12:49,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:12:49,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 05:12:51,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 05:12:52,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:12:52,373 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 05:12:52,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:12:53,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:12:55,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 05:12:55,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:12:55,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 05:12:55,123 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 05:12:55,124 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 05:12:56,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 05:12:58,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:12:59,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:12:59,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:13:01,601 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 1.996e+02 2.244e+02 2.703e+02 4.049e+02, threshold=4.487e+02, percent-clipped=0.0 2023-10-04 05:13:01,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:01,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:13:03,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:13:03,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:11,481 INFO [train.py:1046] (3/4) Epoch 44, batch 2200, loss[loss=0.1429, simple_loss=0.2297, pruned_loss=0.02801, over 24319.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2346, pruned_loss=0.03661, over 4720976.34 frames. ], batch size: 61, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:13:11,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:13:12,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 05:13:15,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:13:16,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1537480.0, ans=0.0 2023-10-04 05:13:18,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1537480.0, ans=0.1 2023-10-04 05:13:20,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:20,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:13:21,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:13:24,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:13:26,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:13:26,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:13:26,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 05:13:32,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 05:13:33,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:13:35,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1537546.6666666667, ans=0.125 2023-10-04 05:13:39,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 05:13:41,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:41,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:13:43,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:13:45,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:13:47,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 05:13:51,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:13:52,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:13:53,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 05:13:56,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:13:56,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1537680.0, ans=0.125 2023-10-04 05:13:58,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:13:58,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:14:00,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:02,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 05:14:02,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:14:04,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 05:14:07,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:07,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:14:07,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:08,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:14:10,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:14:10,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:14:10,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:14:12,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:14:12,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:14:14,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:14:17,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 05:14:17,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:14:19,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:14:20,027 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 05:14:22,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:14:24,173 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 05:14:25,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:14:25,556 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 05:14:26,810 INFO [train.py:1046] (3/4) Epoch 44, batch 2250, loss[loss=0.1604, simple_loss=0.2441, pruned_loss=0.0383, over 23946.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2349, pruned_loss=0.03706, over 4707521.07 frames. ], batch size: 86, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:14:26,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:14:28,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:14:28,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:14:30,404 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 05:14:33,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:14:34,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:14:38,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1537813.3333333333, ans=0.125 2023-10-04 05:14:39,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:14:39,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:14:40,044 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.52 vs. limit=22.5 2023-10-04 05:14:42,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:14:42,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:14:43,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:14:45,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 05:14:46,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:14:46,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:14:47,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 05:14:49,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:14:49,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:14:51,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:14:57,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:14:57,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:14:57,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:15:00,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 05:15:03,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:15:04,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:15:07,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:15:09,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:15:10,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:15:10,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:15:13,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:15:14,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:15:17,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:15:21,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:15:24,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:15:24,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:15:25,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:15:31,712 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 1.999e+02 2.130e+02 2.376e+02 3.367e+02, threshold=4.261e+02, percent-clipped=0.0 2023-10-04 05:15:33,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:15:36,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:15:36,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 05:15:36,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:15:36,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:15:39,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 05:15:40,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:15:40,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:15:41,999 INFO [train.py:1046] (3/4) Epoch 44, batch 2300, loss[loss=0.1575, simple_loss=0.2442, pruned_loss=0.0354, over 24349.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2351, pruned_loss=0.03732, over 4719689.99 frames. ], batch size: 77, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:15:46,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:15:46,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:15:48,826 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 05:15:51,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:16:00,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:16:00,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 05:16:00,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:16:00,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:16:00,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 05:16:02,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:16:04,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:16:05,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:16:07,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_na.min_abs, batch_count=1538213.3333333333, ans=0.02 2023-10-04 05:16:10,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:16:10,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1538280.0, ans=0.04949747468305833 2023-10-04 05:16:11,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:16:14,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:16:18,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:16:18,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:16:21,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:16:22,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:16:23,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1538280.0, ans=0.125 2023-10-04 05:16:23,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1538280.0, ans=0.125 2023-10-04 05:16:26,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1538346.6666666667, ans=0.0 2023-10-04 05:16:27,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:16:28,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:16:29,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:16:29,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 05:16:33,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:16:33,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:16:33,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:16:33,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:16:33,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:16:35,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 05:16:35,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:16:35,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 05:16:35,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:16:35,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:16:36,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 05:16:40,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1538413.3333333333, ans=0.125 2023-10-04 05:16:42,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:16:45,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:16:50,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:16:50,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:16:50,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:16:52,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:16:52,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:16:54,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:16:54,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 05:16:55,466 INFO [train.py:1046] (3/4) Epoch 44, batch 2350, loss[loss=0.1379, simple_loss=0.2193, pruned_loss=0.02821, over 24487.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2368, pruned_loss=0.03814, over 4697961.46 frames. ], batch size: 63, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:17:00,974 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.66 vs. limit=10.0 2023-10-04 05:17:01,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:17:01,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 05:17:06,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.57 vs. limit=15.0 2023-10-04 05:17:07,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 05:17:09,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:17:12,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1538546.6666666667, ans=0.125 2023-10-04 05:17:13,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:17:13,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:17:13,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:17:13,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:17:13,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 05:17:17,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:17:22,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 05:17:25,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:17:28,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:17:28,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:17:30,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:17:31,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 05:17:32,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:17:34,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:17:34,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:17:34,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:17:37,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:17:40,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 05:17:41,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:17:44,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:17:44,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:17:46,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 05:17:47,146 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=7.00 vs. limit=12.0 2023-10-04 05:17:47,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:17:48,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1538680.0, ans=0.1 2023-10-04 05:17:49,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 05:17:49,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:17:53,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 05:17:57,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 05:17:57,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:17:57,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 05:17:57,257 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 05:17:58,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 05:17:59,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 05:18:01,032 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.679e+02 2.046e+02 2.383e+02 2.695e+02 3.864e+02, threshold=4.767e+02, percent-clipped=0.0 2023-10-04 05:18:03,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:18:05,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:18:09,844 INFO [train.py:1046] (3/4) Epoch 44, batch 2400, loss[loss=0.1471, simple_loss=0.2231, pruned_loss=0.03549, over 24297.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2359, pruned_loss=0.03805, over 4706300.00 frames. ], batch size: 56, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:18:10,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1538813.3333333333, ans=0.125 2023-10-04 05:18:11,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:18:11,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1538813.3333333333, ans=0.125 2023-10-04 05:18:14,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:18:14,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 05:18:14,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 05:18:21,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:18:21,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:18:24,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 05:18:24,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:18:25,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:18:26,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 05:18:29,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:18:33,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 05:18:37,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:18:38,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1538946.6666666667, ans=0.0 2023-10-04 05:18:43,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 05:18:44,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:18:46,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:18:47,613 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.63 vs. limit=12.0 2023-10-04 05:18:49,828 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:18:50,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:18:52,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 05:18:52,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:18:59,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:19:01,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:19:05,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:06,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:19:06,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 05:19:06,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:19:06,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:19:06,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1539013.3333333333, ans=0.125 2023-10-04 05:19:07,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:19:07,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:19:10,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:19:10,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:19:10,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 05:19:12,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 05:19:15,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:19:15,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:19:15,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 05:19:16,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 05:19:16,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 05:19:16,514 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 05:19:18,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 05:19:19,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:19:21,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:19:21,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:19:21,141 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 05:19:22,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:19:23,834 INFO [train.py:1046] (3/4) Epoch 44, batch 2450, loss[loss=0.1505, simple_loss=0.2281, pruned_loss=0.03642, over 24482.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.234, pruned_loss=0.03766, over 4678391.32 frames. ], batch size: 63, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:19:23,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:19:27,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:19:27,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:19:29,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:29,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:19:31,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 05:19:37,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:19:37,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:38,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:19:38,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:19:40,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:19:40,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 05:19:44,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:19:48,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:19:48,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:19:51,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:19:51,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:19:53,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:19:53,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:19:56,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 05:19:57,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:20:04,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:20:05,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:20:06,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:20:06,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:20:06,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:20:08,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:20:08,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 05:20:08,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1539346.6666666667, ans=0.125 2023-10-04 05:20:09,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1539346.6666666667, ans=0.2 2023-10-04 05:20:12,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:20:12,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:20:13,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1539346.6666666667, ans=0.125 2023-10-04 05:20:16,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:20:16,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:20:22,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:20:22,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 05:20:23,446 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:20:24,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:20:24,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 05:20:24,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:20:26,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:20:27,988 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 1.926e+02 2.194e+02 2.495e+02 3.736e+02, threshold=4.388e+02, percent-clipped=0.0 2023-10-04 05:20:29,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:20:32,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:20:32,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:20:36,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 05:20:37,287 INFO [train.py:1046] (3/4) Epoch 44, batch 2500, loss[loss=0.154, simple_loss=0.2426, pruned_loss=0.03276, over 24676.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2332, pruned_loss=0.0373, over 4693243.39 frames. ], batch size: 68, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:20:37,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:20:44,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:20:51,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1539546.6666666667, ans=0.125 2023-10-04 05:20:52,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:20:52,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:20:54,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:20:54,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 05:20:55,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1539546.6666666667, ans=0.0 2023-10-04 05:21:00,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:21:01,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:21:02,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 05:21:02,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:21:02,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 05:21:03,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:05,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:21:07,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 05:21:07,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:07,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 05:21:08,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:21:12,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:21:13,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:21:16,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:21:17,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 05:21:17,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:21:19,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:23,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:21:26,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:21:26,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1539680.0, ans=0.125 2023-10-04 05:21:28,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:21:36,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:21:39,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 05:21:39,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:21:39,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:21:41,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:21:41,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:21:43,701 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 05:21:43,702 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 05:21:43,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 05:21:45,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:21:47,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 05:21:47,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 05:21:49,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:21:49,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 05:21:50,653 INFO [train.py:1046] (3/4) Epoch 44, batch 2550, loss[loss=0.1566, simple_loss=0.2345, pruned_loss=0.03939, over 23593.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.0371, over 4712702.40 frames. ], batch size: 256, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:21:50,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 05:21:53,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:21:56,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:21:58,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:21:59,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:22:00,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 05:22:00,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:22:04,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 05:22:06,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:22:07,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:10,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:22:10,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 05:22:12,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:22:12,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:22:12,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:22:15,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:22:15,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 05:22:15,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:22:15,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:15,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 05:22:28,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:22:31,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1539946.6666666667, ans=0.125 2023-10-04 05:22:34,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:22:34,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:34,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:22:35,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:22:40,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:22:43,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:22:43,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:22:43,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:22:43,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:22:44,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:22:47,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:22:48,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:52,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:22:52,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 05:22:52,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:22:54,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:22:55,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:22:56,823 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.968e+02 2.128e+02 2.363e+02 3.523e+02, threshold=4.255e+02, percent-clipped=0.0 2023-10-04 05:22:56,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:22:58,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:04,899 INFO [train.py:1046] (3/4) Epoch 44, batch 2600, loss[loss=0.1778, simple_loss=0.2467, pruned_loss=0.05444, over 22801.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2345, pruned_loss=0.03737, over 4712484.11 frames. ], batch size: 322, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:23:05,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:23:07,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:09,683 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 05:23:12,970 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 05:23:12,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:23:13,024 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 05:23:14,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 05:23:14,431 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 05:23:15,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1540146.6666666667, ans=0.125 2023-10-04 05:23:17,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:23:17,112 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 05:23:18,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 05:23:19,841 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 05:23:21,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:23:23,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 05:23:23,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 05:23:25,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:23:26,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 05:23:29,412 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 05:23:29,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 05:23:36,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:23:38,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:38,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:23:38,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 05:23:38,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1540280.0, ans=0.125 2023-10-04 05:23:40,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:23:40,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1540280.0, ans=0.125 2023-10-04 05:23:44,862 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 05:23:50,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:23:50,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:23:50,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 05:23:51,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:23:51,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:23:53,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 05:23:56,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:23:56,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:23:57,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:24:00,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1540346.6666666667, ans=0.125 2023-10-04 05:24:02,304 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 05:24:02,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:24:02,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:24:06,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:24:06,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:24:08,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 05:24:08,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:24:11,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:24:13,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:24:18,953 INFO [train.py:1046] (3/4) Epoch 44, batch 2650, loss[loss=0.1643, simple_loss=0.2554, pruned_loss=0.03656, over 24687.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2357, pruned_loss=0.03771, over 4707214.71 frames. ], batch size: 65, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:24:19,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 05:24:19,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1540480.0, ans=0.0 2023-10-04 05:24:20,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:24:21,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:24:27,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 05:24:27,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:24:27,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:24:28,803 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 05:24:28,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:24:29,615 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.50 vs. limit=6.0 2023-10-04 05:24:30,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:24:33,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:24:34,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:24:35,660 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.77 vs. limit=10.0 2023-10-04 05:24:36,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:24:37,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 05:24:37,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:24:37,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:24:40,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 05:24:42,713 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 05:24:44,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:24:47,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 05:24:47,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:24:47,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 05:24:50,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1540613.3333333333, ans=0.5 2023-10-04 05:24:51,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:24:51,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:24:51,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:24:51,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:24:53,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=1540613.3333333333, ans=22.5 2023-10-04 05:24:54,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1540613.3333333333, ans=0.125 2023-10-04 05:24:56,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 05:24:56,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 05:24:59,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:25:04,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 05:25:04,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:25:04,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:04,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:25:04,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:25:05,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:25:08,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:25:08,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:25:10,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:25:11,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:25:11,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:25:11,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:13,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:25:15,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:15,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1540680.0, ans=0.1 2023-10-04 05:25:16,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:25:17,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:25:20,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:23,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:25:23,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:23,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 05:25:24,460 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.709e+02 2.007e+02 2.168e+02 2.484e+02 3.520e+02, threshold=4.337e+02, percent-clipped=0.0 2023-10-04 05:25:25,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:25:27,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:30,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:30,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:31,515 INFO [train.py:1046] (3/4) Epoch 44, batch 2700, loss[loss=0.1618, simple_loss=0.2489, pruned_loss=0.03731, over 24607.00 frames. ], tot_loss[loss=0.1568, simple_loss=0.2369, pruned_loss=0.03836, over 4703905.06 frames. ], batch size: 68, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:25:31,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:25:31,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:34,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:25:34,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 05:25:37,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:25:39,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 05:25:40,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:25:41,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:41,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:25:43,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:25:43,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:25:43,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:25:45,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 05:25:45,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 05:25:46,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:25:48,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:25:49,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:25:49,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:25:52,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:25:53,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 05:25:53,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:25:56,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:25:56,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:26:02,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:26:03,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:26:03,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:26:03,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:26:06,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:26:09,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:26:09,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:26:09,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:26:13,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:26:13,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:26:23,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:26:23,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:26:26,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:26:26,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:28,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:26:30,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:26:31,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:26:31,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:31,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1541080.0, ans=0.125 2023-10-04 05:26:33,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:26:34,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:26:36,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:26:37,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:26:37,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:26:40,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 05:26:40,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:44,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:26:44,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 05:26:45,392 INFO [train.py:1046] (3/4) Epoch 44, batch 2750, loss[loss=0.1503, simple_loss=0.2259, pruned_loss=0.03731, over 23618.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2363, pruned_loss=0.03795, over 4700703.14 frames. ], batch size: 149, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:26:46,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 05:26:46,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:26:49,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:26:49,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:26:52,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:52,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:26:53,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:56,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:26:56,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:26:56,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:26:58,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:26:58,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 05:26:58,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:26:58,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:27:02,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 05:27:03,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:27:05,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:27:05,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:27:06,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:27:06,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:27:08,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:27:08,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:27:08,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:27:14,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:27:14,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:27:16,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:27:16,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:27:17,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:27:24,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:27:24,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:27:26,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:27:30,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:27:30,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:27:31,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:27:31,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1541346.6666666667, ans=0.2 2023-10-04 05:27:35,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:27:37,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:27:37,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 05:27:42,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:27:45,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 05:27:48,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 05:27:50,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:27:50,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 05:27:51,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:27:52,770 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.112e+02 2.441e+02 2.787e+02 5.885e+02, threshold=4.882e+02, percent-clipped=3.0 2023-10-04 05:27:54,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:27:54,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 05:27:54,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:27:54,397 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:27:58,457 INFO [train.py:1046] (3/4) Epoch 44, batch 2800, loss[loss=0.1695, simple_loss=0.2598, pruned_loss=0.03961, over 24539.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2349, pruned_loss=0.03754, over 4692366.89 frames. ], batch size: 71, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:27:58,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 05:27:58,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:27:58,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:27:59,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 05:27:59,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:27:59,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:28:01,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:01,405 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 05:28:01,406 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 05:28:05,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:28:08,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:28:08,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:28:10,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:28:10,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1541480.0, ans=0.125 2023-10-04 05:28:13,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 05:28:16,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 05:28:16,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 05:28:19,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:28:19,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:28:19,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:28:23,444 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.81 vs. limit=22.5 2023-10-04 05:28:23,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:28:23,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:28:23,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:28:25,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:28:30,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:28:31,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1541613.3333333333, ans=0.0 2023-10-04 05:28:32,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:28:34,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:36,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:28:36,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:28:41,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:28:41,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 05:28:43,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:28:43,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:28:43,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:28:48,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:28:48,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:51,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:28:51,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:28:53,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:28:53,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:28:53,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1541680.0, ans=0.125 2023-10-04 05:28:54,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:28:54,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:28:56,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:28:56,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 05:28:56,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:28:58,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:28:58,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:29:00,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 05:29:01,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:01,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:29:02,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:29:04,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 05:29:09,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1541746.6666666667, ans=0.0 2023-10-04 05:29:12,645 INFO [train.py:1046] (3/4) Epoch 44, batch 2850, loss[loss=0.1493, simple_loss=0.2263, pruned_loss=0.03614, over 23554.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2345, pruned_loss=0.0373, over 4685591.63 frames. ], batch size: 134, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:29:12,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:29:12,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:29:12,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:29:14,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:29:17,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:29:17,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:29:18,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:29:20,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:20,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:29:22,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:29:22,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 05:29:24,946 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.04 vs. limit=6.0 2023-10-04 05:29:27,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 05:29:27,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:29:30,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 05:29:31,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:29:34,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 05:29:34,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 05:29:36,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:29:37,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1541880.0, ans=0.05 2023-10-04 05:29:48,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:50,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:29:50,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:29:52,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 05:29:52,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:29:52,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:29:54,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:29:54,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 05:29:57,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:29:57,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:29:57,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:29:57,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1542013.3333333333, ans=0.125 2023-10-04 05:29:58,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:00,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:30:01,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:30:02,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:04,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:30:05,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:30:05,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:07,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:10,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:30:11,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1542080.0, ans=0.125 2023-10-04 05:30:15,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1542080.0, ans=0.125 2023-10-04 05:30:16,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:30:18,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 05:30:18,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 05:30:19,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:30:20,627 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.995e+02 2.371e+02 2.715e+02 4.777e+02, threshold=4.741e+02, percent-clipped=0.0 2023-10-04 05:30:20,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:30:20,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 05:30:20,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:30:22,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:30:22,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:30:22,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:30:22,726 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 05:30:22,790 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 05:30:22,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:30:24,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:24,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1542080.0, ans=0.125 2023-10-04 05:30:26,682 INFO [train.py:1046] (3/4) Epoch 44, batch 2900, loss[loss=0.1496, simple_loss=0.2335, pruned_loss=0.03283, over 24483.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2342, pruned_loss=0.03705, over 4695363.33 frames. ], batch size: 63, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:30:29,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:30:29,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:30:30,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:30:30,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 05:30:35,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:35,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 05:30:36,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 05:30:37,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:30:37,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:30:40,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:30:40,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:30:45,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:30:46,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:30:48,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:30:48,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 05:30:49,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:30:49,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:30:52,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 05:30:54,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 05:30:56,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:30:56,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 05:30:56,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:30:58,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:30:58,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 05:31:02,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:31:03,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:31:03,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1542280.0, ans=0.0 2023-10-04 05:31:06,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:31:09,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:31:10,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 05:31:12,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 05:31:12,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:31:16,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:31:17,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 05:31:20,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:31:26,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:31:33,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:31:33,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:31:34,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 05:31:37,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:31:37,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 05:31:37,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:31:37,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:31:38,681 INFO [train.py:1046] (3/4) Epoch 44, batch 2950, loss[loss=0.1585, simple_loss=0.244, pruned_loss=0.03649, over 24012.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2348, pruned_loss=0.03705, over 4699957.00 frames. ], batch size: 80, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:31:42,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:31:44,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 05:31:44,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1542480.0, ans=0.0 2023-10-04 05:31:47,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:31:47,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:31:48,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:31:50,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:31:51,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 05:31:52,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 05:31:52,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:31:52,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:31:59,580 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.92 vs. limit=15.0 2023-10-04 05:32:00,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:32:00,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1542546.6666666667, ans=0.0 2023-10-04 05:32:01,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:32:03,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:32:04,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:32:05,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:32:05,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:32:07,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:32:08,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:32:08,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:32:11,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 05:32:15,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 05:32:15,762 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 05:32:18,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:32:19,587 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 05:32:22,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 05:32:22,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:32:23,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:32:23,949 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 05:32:23,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:32:26,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 05:32:26,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:32:26,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:32:27,537 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.39 vs. limit=15.0 2023-10-04 05:32:31,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:32:31,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:32:31,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:32:32,868 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 05:32:34,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:32:34,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 05:32:38,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:32:39,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:32:39,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 05:32:41,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:32:42,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 05:32:45,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:32:47,164 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.991e+02 2.260e+02 2.568e+02 3.516e+02, threshold=4.519e+02, percent-clipped=0.0 2023-10-04 05:32:47,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:32:48,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:32:48,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:32:48,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 05:32:51,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:32:52,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:32:52,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:32:52,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:32:53,910 INFO [train.py:1046] (3/4) Epoch 44, batch 3000, loss[loss=0.1606, simple_loss=0.2448, pruned_loss=0.03814, over 23496.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2353, pruned_loss=0.03682, over 4715391.81 frames. ], batch size: 119, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:32:53,910 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 05:33:05,946 INFO [train.py:1078] (3/4) Epoch 44, validation: loss=0.3969, simple_loss=0.2803, pruned_loss=0.2567, over 1125622.00 frames. 2023-10-04 05:33:05,947 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-04 05:33:06,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:33:07,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:33:08,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:33:10,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 05:33:11,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:33:13,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:33:13,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:33:16,734 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 05:33:16,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1542813.3333333333, ans=0.125 2023-10-04 05:33:16,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1542813.3333333333, ans=0.125 2023-10-04 05:33:18,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 05:33:20,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:33:20,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:33:21,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 05:33:23,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:33:27,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:33:36,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:33:40,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 05:33:42,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:33:46,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:33:46,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:33:46,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:33:49,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:33:49,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 05:33:52,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 05:33:54,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:33:54,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:33:55,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:33:57,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:33:57,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:33:57,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:33:57,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1543013.3333333333, ans=0.125 2023-10-04 05:34:01,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:34:03,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:34:03,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:34:04,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:34:06,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.56 vs. limit=12.0 2023-10-04 05:34:07,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 05:34:08,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:34:08,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:08,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:34:09,379 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.37 vs. limit=15.0 2023-10-04 05:34:10,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1543080.0, ans=0.125 2023-10-04 05:34:12,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:34:12,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:34:14,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 05:34:16,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 05:34:16,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:34:16,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 05:34:16,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:34:17,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 05:34:20,213 INFO [train.py:1046] (3/4) Epoch 44, batch 3050, loss[loss=0.1494, simple_loss=0.2236, pruned_loss=0.03754, over 23729.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2357, pruned_loss=0.03693, over 4723333.01 frames. ], batch size: 232, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:34:20,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:34:21,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:34:23,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 05:34:23,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 05:34:23,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:34:23,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1543146.6666666667, ans=0.07 2023-10-04 05:34:24,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:34:25,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:34:25,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1543146.6666666667, ans=0.1 2023-10-04 05:34:26,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:34:26,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:26,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:34:27,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 05:34:28,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1543146.6666666667, ans=0.1 2023-10-04 05:34:30,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:34:31,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:34:31,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:34:34,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:36,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 05:34:42,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 05:34:43,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 05:34:43,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:34:45,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:34:49,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:50,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:34:50,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:34:54,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:34:54,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:34:54,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:34:54,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:34:54,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:34:56,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:34:57,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:01,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:35:01,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 05:35:03,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:35:03,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:35:05,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:35:06,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:35:06,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:35:07,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:12,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:35:13,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:18,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:20,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:35:20,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:35:20,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:35:21,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:35:21,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:35:22,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 05:35:25,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:35:25,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:27,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 05:35:28,604 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 2.065e+02 2.224e+02 2.528e+02 3.449e+02, threshold=4.449e+02, percent-clipped=0.0 2023-10-04 05:35:28,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:33,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:35:33,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:35:35,024 INFO [train.py:1046] (3/4) Epoch 44, batch 3100, loss[loss=0.176, simple_loss=0.249, pruned_loss=0.05156, over 23751.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2354, pruned_loss=0.03699, over 4714206.84 frames. ], batch size: 164, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:35:36,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 05:35:39,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 05:35:40,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 05:35:43,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 05:35:44,111 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.58 vs. limit=22.5 2023-10-04 05:35:45,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:35:45,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff2.min_abs, batch_count=1543480.0, ans=0.1 2023-10-04 05:35:48,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:35:48,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:51,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 05:35:55,551 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.07 vs. limit=15.0 2023-10-04 05:35:56,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:35:59,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1543546.6666666667, ans=0.0 2023-10-04 05:36:00,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 05:36:04,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 05:36:04,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:04,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:36:06,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:36:06,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 05:36:09,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:36:09,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 05:36:09,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:36:10,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:36:13,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 05:36:14,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:36:15,327 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.92 vs. limit=6.0 2023-10-04 05:36:18,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:36:18,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 05:36:18,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 05:36:20,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:20,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:36:22,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:36:23,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:23,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:36:24,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:36:24,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:36:25,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:36:27,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:36:27,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:27,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 05:36:29,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.59 vs. limit=12.0 2023-10-04 05:36:31,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:36:32,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 05:36:35,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:36:35,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 05:36:37,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:36:37,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:37,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 05:36:45,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 05:36:47,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:36:47,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:36:49,282 INFO [train.py:1046] (3/4) Epoch 44, batch 3150, loss[loss=0.1577, simple_loss=0.2489, pruned_loss=0.03326, over 24566.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2341, pruned_loss=0.03671, over 4708751.08 frames. ], batch size: 71, lr: 2.31e-03, grad_scale: 8.0 2023-10-04 05:36:50,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:36:50,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:36:51,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1543813.3333333333, ans=0.0 2023-10-04 05:36:52,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 05:36:54,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:36:54,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 05:36:55,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 05:36:58,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:37:00,047 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 05:37:03,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 05:37:03,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:37:04,744 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 05:37:04,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 05:37:04,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1543880.0, ans=0.015 2023-10-04 05:37:06,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 05:37:06,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 05:37:06,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 05:37:06,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:37:06,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:37:07,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:37:10,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 05:37:11,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:37:11,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:37:11,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:37:14,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 05:37:19,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 05:37:20,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:37:22,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:37:23,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:37:23,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 05:37:27,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 05:37:28,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:37:28,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 05:37:28,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 05:37:30,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:37:30,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:37:31,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:37:31,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:37:34,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 05:37:34,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:37:34,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:36,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:37:36,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:37:37,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 05:37:37,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:37:39,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 05:37:39,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:40,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 05:37:41,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 05:37:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:37:43,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:37:44,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 05:37:44,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 05:37:46,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:37:47,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:37:49,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:49,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:37:55,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:37:55,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:37:58,397 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.021e+02 2.313e+02 2.490e+02 3.781e+02, threshold=4.625e+02, percent-clipped=0.0 2023-10-04 05:37:58,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 05:38:02,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:38:02,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 05:38:04,440 INFO [train.py:1046] (3/4) Epoch 44, batch 3200, loss[loss=0.1581, simple_loss=0.2499, pruned_loss=0.03316, over 24661.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2333, pruned_loss=0.03643, over 4716993.02 frames. ], batch size: 68, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:38:05,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:38:06,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:38:06,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 05:38:09,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:38:14,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:38:16,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1544146.6666666667, ans=0.125 2023-10-04 05:38:18,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:38:28,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:38:36,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 05:38:38,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:38:40,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 05:38:42,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:38:46,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:38:46,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:38:46,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:38:49,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 05:38:51,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 05:38:52,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 05:38:55,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 05:39:00,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:39:04,059 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.10 vs. limit=22.5 2023-10-04 05:39:04,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:39:04,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 05:39:04,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:39:04,730 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 05:39:04,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:39:06,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1544413.3333333333, ans=0.1 2023-10-04 05:39:09,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:39:09,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1544413.3333333333, ans=0.125 2023-10-04 05:39:10,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 05:39:10,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 05:39:12,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 05:39:14,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 05:39:16,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:39:17,954 INFO [train.py:1046] (3/4) Epoch 44, batch 3250, loss[loss=0.1404, simple_loss=0.2243, pruned_loss=0.02825, over 24474.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2334, pruned_loss=0.03622, over 4727903.81 frames. ], batch size: 63, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:39:20,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:39:20,830 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 05:39:20,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:39:20,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:22,372 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 05:39:26,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:39:29,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:39:31,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1544546.6666666667, ans=0.09899494936611666 2023-10-04 05:39:36,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:39:36,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 05:39:38,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:39:38,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:39:38,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:39:39,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:39:39,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:39:41,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1544546.6666666667, ans=0.07 2023-10-04 05:39:42,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:44,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:39:44,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:39:45,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:45,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:45,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:39:48,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:39:50,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:39:51,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:39:51,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:39:52,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:39:53,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:39:53,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:39:59,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 05:40:01,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:40:01,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:40:02,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:40:05,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:40:10,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:40:16,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:40:16,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:16,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 05:40:16,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:40:16,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 05:40:18,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:19,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 05:40:21,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 05:40:21,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:40:23,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:40:23,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:40:24,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 05:40:24,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:40:25,884 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.026e+02 2.220e+02 2.491e+02 3.845e+02, threshold=4.441e+02, percent-clipped=0.0 2023-10-04 05:40:27,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:40:27,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:40:30,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 05:40:30,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:40:31,334 INFO [train.py:1046] (3/4) Epoch 44, batch 3300, loss[loss=0.1658, simple_loss=0.2412, pruned_loss=0.04521, over 23478.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2345, pruned_loss=0.03668, over 4737125.67 frames. ], batch size: 285, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:40:32,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:40:33,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 05:40:34,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:40:34,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 05:40:36,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 05:40:37,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 05:40:37,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:40:41,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:40:42,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:40:42,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:43,044 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.01 vs. limit=22.5 2023-10-04 05:40:43,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 05:40:44,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:40:47,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:40:49,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:40:51,318 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.14 vs. limit=15.0 2023-10-04 05:40:54,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 05:40:55,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:40:55,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:40:56,030 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.26 vs. limit=15.0 2023-10-04 05:40:57,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:40:58,058 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 05:41:00,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:00,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:41:00,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:41:00,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:41:02,174 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 05:41:06,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:41:06,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:41:08,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:08,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 05:41:08,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 05:41:10,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:10,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:41:13,047 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 05:41:14,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 05:41:14,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:41:17,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 05:41:19,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:41:23,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 05:41:23,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:41:26,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:41:26,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:41:26,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:41:26,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:41:27,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:41:27,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:29,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:41:30,361 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 05:41:31,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 05:41:33,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:41:34,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:41:34,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:41:36,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:41:36,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:41:38,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:41:38,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:41:38,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:41:39,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:41:42,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:41:43,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 05:41:43,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:44,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:41:45,382 INFO [train.py:1046] (3/4) Epoch 44, batch 3350, loss[loss=0.1578, simple_loss=0.2388, pruned_loss=0.03836, over 23660.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2359, pruned_loss=0.03715, over 4727946.77 frames. ], batch size: 232, lr: 2.31e-03, grad_scale: 16.0 2023-10-04 05:41:46,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:41:46,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:41:48,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:41:49,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:41:49,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:53,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:41:53,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1545146.6666666667, ans=0.125 2023-10-04 05:41:54,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:41:56,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:41:58,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:01,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:42:01,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:42:01,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:42:03,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 05:42:03,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1545213.3333333333, ans=0.125 2023-10-04 05:42:06,889 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 05:42:06,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:42:09,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 05:42:09,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 05:42:11,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:42:11,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:42:12,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:12,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 05:42:13,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:13,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:42:15,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:17,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:17,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:19,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:42:21,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:42:23,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:25,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:42:29,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:42:29,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:42:31,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:32,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:35,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:37,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 05:42:37,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:42:39,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 05:42:39,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:42:39,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 05:42:40,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:42:40,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1545346.6666666667, ans=0.125 2023-10-04 05:42:42,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:42:47,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:42:47,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 05:42:48,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:42:50,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:42:51,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:42:52,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.29 vs. limit=15.0 2023-10-04 05:42:52,934 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.103e+02 2.289e+02 2.774e+02 3.662e+02, threshold=4.578e+02, percent-clipped=0.0 2023-10-04 05:42:56,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:42:59,553 INFO [train.py:1046] (3/4) Epoch 44, batch 3400, loss[loss=0.1446, simple_loss=0.2288, pruned_loss=0.03015, over 24675.00 frames. ], tot_loss[loss=0.1561, simple_loss=0.2373, pruned_loss=0.03739, over 4733358.11 frames. ], batch size: 65, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:42:59,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 05:42:59,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:43:00,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:43:01,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:43:01,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1545480.0, ans=0.125 2023-10-04 05:43:02,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 05:43:02,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:43:02,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 05:43:03,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:43:05,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:43:07,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 05:43:07,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:43:07,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 05:43:10,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 05:43:10,500 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 05:43:10,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:43:12,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1545480.0, ans=0.1 2023-10-04 05:43:14,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:43:14,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:43:14,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:43:17,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:43:19,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1545546.6666666667, ans=0.125 2023-10-04 05:43:21,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:43:22,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 05:43:26,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1545546.6666666667, ans=0.125 2023-10-04 05:43:27,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:43:29,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:43:29,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1545613.3333333333, ans=0.125 2023-10-04 05:43:31,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:43:31,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 05:43:36,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1545613.3333333333, ans=0.125 2023-10-04 05:43:37,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:43:40,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 05:43:44,896 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:43:47,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:43:47,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:43:47,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1545680.0, ans=0.0 2023-10-04 05:43:48,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 05:43:48,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:43:48,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:43:48,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1545680.0, ans=0.125 2023-10-04 05:43:49,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:43:49,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:43:52,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:43:56,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:43:56,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:44:00,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1545746.6666666667, ans=0.125 2023-10-04 05:44:02,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:44:03,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 05:44:08,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1545746.6666666667, ans=0.125 2023-10-04 05:44:09,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:44:13,864 INFO [train.py:1046] (3/4) Epoch 44, batch 3450, loss[loss=0.1382, simple_loss=0.2058, pruned_loss=0.03534, over 22649.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2369, pruned_loss=0.03758, over 4717037.07 frames. ], batch size: 322, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:44:15,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 05:44:15,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1545813.3333333333, ans=0.125 2023-10-04 05:44:18,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 05:44:18,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:44:20,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:44:20,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 05:44:22,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:44:24,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1545813.3333333333, ans=0.2 2023-10-04 05:44:25,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:44:27,355 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.96 vs. limit=15.0 2023-10-04 05:44:28,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1545880.0, ans=0.0 2023-10-04 05:44:29,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:44:29,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:44:31,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:44:33,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:44:34,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:44:41,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 05:44:44,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1545946.6666666667, ans=0.125 2023-10-04 05:44:45,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 05:44:45,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 05:44:46,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:44:48,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:44:53,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 05:44:53,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:44:55,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1545946.6666666667, ans=0.125 2023-10-04 05:44:56,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:44:58,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:44:58,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 05:44:59,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:45:01,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 05:45:01,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:45:01,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:45:04,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:45:07,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 05:45:12,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:45:16,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:45:18,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:20,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:45:22,080 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.019e+02 2.264e+02 2.676e+02 4.035e+02, threshold=4.528e+02, percent-clipped=0.0 2023-10-04 05:45:23,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:23,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:45:23,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:45:25,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:45:28,337 INFO [train.py:1046] (3/4) Epoch 44, batch 3500, loss[loss=0.1563, simple_loss=0.238, pruned_loss=0.03732, over 23297.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2358, pruned_loss=0.03743, over 4708907.23 frames. ], batch size: 93, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:45:29,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:45:30,335 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.56 vs. limit=15.0 2023-10-04 05:45:33,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:45:34,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 05:45:36,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 05:45:39,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 05:45:41,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:45:41,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 05:45:44,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1546213.3333333333, ans=0.0 2023-10-04 05:45:46,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:45:46,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:45:47,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:45:47,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:45:47,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:45:47,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:47,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:45:48,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1546213.3333333333, ans=0.0 2023-10-04 05:45:49,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 05:45:52,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:45:52,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:45:53,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1546213.3333333333, ans=0.95 2023-10-04 05:45:55,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:45:59,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:00,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 05:46:00,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:46:04,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:46:05,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:46:06,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:08,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:46:08,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:46:10,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 05:46:13,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 05:46:13,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 05:46:14,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:46:16,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:17,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:46:17,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 05:46:19,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:46:20,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:46:24,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:46:25,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 05:46:25,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 05:46:25,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:46:27,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:46:29,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:46:30,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1546413.3333333333, ans=0.125 2023-10-04 05:46:31,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:33,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 05:46:34,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:46:36,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:46:38,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 05:46:41,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 05:46:43,392 INFO [train.py:1046] (3/4) Epoch 44, batch 3550, loss[loss=0.1409, simple_loss=0.2258, pruned_loss=0.028, over 24680.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2353, pruned_loss=0.03717, over 4721007.92 frames. ], batch size: 65, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:46:43,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:46:43,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:46:43,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:46:44,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:46:47,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:46:49,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1546480.0, ans=0.0 2023-10-04 05:46:56,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:46:56,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 05:46:59,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:47:00,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:47:03,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:03,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:47:03,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:47:06,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:47:06,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:47:07,107 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.50 vs. limit=22.5 2023-10-04 05:47:08,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:47:08,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:47:08,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:47:15,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1546613.3333333333, ans=0.035 2023-10-04 05:47:16,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:47:16,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:47:17,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:47:17,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:47:19,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:47:19,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 05:47:19,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:21,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:21,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 05:47:22,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1546613.3333333333, ans=0.0 2023-10-04 05:47:29,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:47:29,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:47:31,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:47:32,643 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.31 vs. limit=10.0 2023-10-04 05:47:33,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 05:47:34,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:47:34,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 05:47:36,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 05:47:38,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:47:38,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:47:40,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 05:47:42,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:47:47,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:47:49,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 05:47:50,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:47:54,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:47:55,575 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.057e+02 2.396e+02 2.886e+02 4.470e+02, threshold=4.792e+02, percent-clipped=0.0 2023-10-04 05:47:55,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 05:47:59,998 INFO [train.py:1046] (3/4) Epoch 44, batch 3600, loss[loss=0.1579, simple_loss=0.2408, pruned_loss=0.03749, over 24389.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2354, pruned_loss=0.03725, over 4722423.57 frames. ], batch size: 77, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:48:02,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 05:48:02,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:48:02,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:48:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:48:04,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:48:06,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:48:09,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:48:11,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:11,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:48:13,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:48:15,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:15,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 05:48:17,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:48:20,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:23,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:48:25,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:48:27,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:48:27,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:48:27,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 05:48:27,953 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.19 vs. limit=15.0 2023-10-04 05:48:28,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:48:31,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:48:31,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:48:34,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:48:34,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:48:36,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:48:36,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 05:48:44,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:48:46,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:48:48,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 05:48:52,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:48:52,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1547013.3333333333, ans=0.2 2023-10-04 05:48:54,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1547013.3333333333, ans=0.1 2023-10-04 05:48:58,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:49:01,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:49:04,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1547080.0, ans=0.125 2023-10-04 05:49:05,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:49:05,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:49:05,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 05:49:07,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 05:49:08,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 05:49:08,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1547080.0, ans=0.1 2023-10-04 05:49:09,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:49:09,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:49:10,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1547080.0, ans=0.2 2023-10-04 05:49:11,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 05:49:12,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:49:12,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:49:12,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:49:13,948 INFO [train.py:1046] (3/4) Epoch 44, batch 3650, loss[loss=0.1567, simple_loss=0.2433, pruned_loss=0.03501, over 24453.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2359, pruned_loss=0.03726, over 4719295.57 frames. ], batch size: 66, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:49:14,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 05:49:14,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 05:49:17,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:49:19,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 05:49:23,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 05:49:23,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:49:26,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 05:49:28,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 05:49:31,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:49:31,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 05:49:31,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:49:34,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 05:49:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:49:35,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 05:49:37,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 05:49:37,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:49:37,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 05:49:38,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:49:38,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:49:38,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:49:40,422 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.95 vs. limit=22.5 2023-10-04 05:49:41,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:49:44,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 05:49:44,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 05:49:46,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:49:49,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 05:49:50,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:49:50,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:49:51,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1547280.0, ans=0.125 2023-10-04 05:49:55,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1547280.0, ans=0.125 2023-10-04 05:49:57,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.83 vs. limit=15.0 2023-10-04 05:49:57,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:49:59,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:50:00,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:50:01,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 05:50:02,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:50:04,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:50:08,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:50:09,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:09,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:50:09,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1547346.6666666667, ans=0.1 2023-10-04 05:50:11,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 05:50:12,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:50:12,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:50:16,820 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 05:50:17,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1547413.3333333333, ans=0.1 2023-10-04 05:50:20,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:50:20,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:50:22,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 05:50:22,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:50:23,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 05:50:23,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1547413.3333333333, ans=0.125 2023-10-04 05:50:24,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:26,099 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 1.956e+02 2.107e+02 2.308e+02 3.469e+02, threshold=4.214e+02, percent-clipped=0.0 2023-10-04 05:50:26,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 05:50:26,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:50:27,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:50:29,001 INFO [train.py:1046] (3/4) Epoch 44, batch 3700, loss[loss=0.1573, simple_loss=0.2386, pruned_loss=0.03803, over 24040.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2366, pruned_loss=0.03766, over 4721174.27 frames. ], batch size: 86, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:50:30,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:50:31,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:50:34,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:34,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 05:50:36,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:50:36,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 05:50:37,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 05:50:38,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1547480.0, ans=0.2 2023-10-04 05:50:39,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 05:50:42,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:50:43,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:50:43,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 05:50:44,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:50:46,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:50:46,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1547546.6666666667, ans=0.125 2023-10-04 05:50:48,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:50:48,262 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 05:50:55,178 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.70 vs. limit=12.0 2023-10-04 05:50:56,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:50:57,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 05:50:58,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:51:00,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 05:51:00,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:51:03,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:04,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 05:51:06,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:07,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:51:10,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:11,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:51:13,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 05:51:16,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:51:16,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 05:51:17,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:51:17,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 05:51:22,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:51:23,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:51:25,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:51:26,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 05:51:28,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:51:28,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 05:51:28,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:51:29,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:51:31,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:51:33,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 05:51:33,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 05:51:34,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:51:34,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:51:35,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:51:37,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:51:39,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:51:40,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:51:40,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1547746.6666666667, ans=0.0 2023-10-04 05:51:41,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:51:43,118 INFO [train.py:1046] (3/4) Epoch 44, batch 3750, loss[loss=0.141, simple_loss=0.2215, pruned_loss=0.03028, over 24421.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2374, pruned_loss=0.03788, over 4718768.18 frames. ], batch size: 58, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:51:44,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 05:51:45,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 05:51:50,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 05:51:50,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 05:51:51,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:51:53,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:51:53,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:51:56,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:51:59,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:52:02,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 05:52:03,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:52:06,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:52:11,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:52:11,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 05:52:11,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:52:12,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:52:12,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:52:15,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 05:52:18,818 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.28 vs. limit=15.0 2023-10-04 05:52:19,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 05:52:22,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:52:22,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:52:25,683 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.64 vs. limit=5.0 2023-10-04 05:52:25,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:52:27,383 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.20 vs. limit=10.0 2023-10-04 05:52:29,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:52:29,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1548013.3333333333, ans=0.0 2023-10-04 05:52:30,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 05:52:34,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 05:52:37,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:52:37,806 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.10 vs. limit=10.0 2023-10-04 05:52:38,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:52:38,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1548013.3333333333, ans=0.1 2023-10-04 05:52:40,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:52:43,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 05:52:45,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 05:52:48,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 05:52:50,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:52:51,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:52:54,520 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.028e+02 2.230e+02 2.527e+02 3.718e+02, threshold=4.460e+02, percent-clipped=0.0 2023-10-04 05:52:54,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 05:52:57,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.13 vs. limit=15.0 2023-10-04 05:52:57,959 INFO [train.py:1046] (3/4) Epoch 44, batch 3800, loss[loss=0.1442, simple_loss=0.2179, pruned_loss=0.03519, over 24312.00 frames. ], tot_loss[loss=0.156, simple_loss=0.237, pruned_loss=0.03748, over 4726464.56 frames. ], batch size: 56, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:53:02,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:53:06,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:06,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 05:53:07,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1548146.6666666667, ans=0.2 2023-10-04 05:53:08,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 05:53:09,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:53:10,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:53:12,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:53:14,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 05:53:14,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:15,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:53:17,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:53:17,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:53:18,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:53:18,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 05:53:21,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 05:53:21,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:53:24,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:53:24,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1548213.3333333333, ans=0.1 2023-10-04 05:53:27,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 05:53:28,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 05:53:30,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 05:53:30,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:53:31,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:31,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:53:34,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:53:34,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 05:53:37,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:53:44,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:53:48,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:53:50,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 05:53:51,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1548346.6666666667, ans=0.1 2023-10-04 05:53:54,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 05:53:55,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:53:55,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:53:56,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:53:58,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 05:54:02,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 05:54:02,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 05:54:02,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:54:04,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:54:04,891 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.59 vs. limit=6.0 2023-10-04 05:54:09,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:54:11,777 INFO [train.py:1046] (3/4) Epoch 44, batch 3850, loss[loss=0.1436, simple_loss=0.2226, pruned_loss=0.03231, over 20750.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2354, pruned_loss=0.03738, over 4712570.13 frames. ], batch size: 45, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:54:11,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 05:54:17,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 05:54:17,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 05:54:17,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1548480.0, ans=0.0 2023-10-04 05:54:17,992 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=15.0 2023-10-04 05:54:18,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:54:20,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:54:23,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 05:54:26,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:54:28,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 05:54:29,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 05:54:32,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1548546.6666666667, ans=0.125 2023-10-04 05:54:35,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:36,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:54:39,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:54:39,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 05:54:42,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:43,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:54:43,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:54:43,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 05:54:44,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:54:45,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:54:47,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:47,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:54:47,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 05:54:48,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 05:54:48,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:54:48,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:51,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:54:53,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:54:54,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 05:54:55,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 05:54:57,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:54:59,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 05:55:00,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 05:55:05,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:55:06,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:55:07,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.17 vs. limit=22.5 2023-10-04 05:55:12,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:55:12,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 05:55:15,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 05:55:16,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:16,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:19,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 05:55:19,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 05:55:19,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:21,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:21,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:55:21,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 05:55:21,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:55:22,995 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.962e+02 2.115e+02 2.398e+02 3.315e+02, threshold=4.231e+02, percent-clipped=0.0 2023-10-04 05:55:23,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 05:55:24,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:24,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:25,734 INFO [train.py:1046] (3/4) Epoch 44, batch 3900, loss[loss=0.1394, simple_loss=0.2155, pruned_loss=0.03166, over 23371.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2349, pruned_loss=0.03724, over 4718174.43 frames. ], batch size: 134, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:55:25,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 05:55:25,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:27,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:55:27,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1548813.3333333333, ans=0.05 2023-10-04 05:55:28,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:55:28,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:55:30,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:55:30,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 05:55:31,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:31,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.49 vs. limit=15.0 2023-10-04 05:55:36,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:55:36,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:55:36,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:55:37,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:55:39,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 05:55:40,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:40,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:55:42,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 05:55:42,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:55:43,203 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.03 vs. limit=15.0 2023-10-04 05:55:43,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 05:55:45,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:55:45,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 05:55:47,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 05:55:49,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1548880.0, ans=0.125 2023-10-04 05:55:49,966 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.68 vs. limit=6.0 2023-10-04 05:55:50,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:55:52,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:55:52,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:55:53,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:55:57,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 05:55:58,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:55:59,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.48 vs. limit=15.0 2023-10-04 05:56:01,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:56:01,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:56:03,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 05:56:09,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:56:09,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:56:14,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1549013.3333333333, ans=0.07 2023-10-04 05:56:15,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 05:56:16,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:56:25,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:56:26,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1549080.0, ans=0.125 2023-10-04 05:56:30,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:56:30,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 05:56:30,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 05:56:31,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 05:56:33,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 05:56:34,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:56:36,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 05:56:41,070 INFO [train.py:1046] (3/4) Epoch 44, batch 3950, loss[loss=0.1568, simple_loss=0.2387, pruned_loss=0.03746, over 23471.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2338, pruned_loss=0.03671, over 4715169.58 frames. ], batch size: 134, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 05:56:41,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:56:43,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 05:56:43,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:56:45,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:56:48,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:56:52,694 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 05:56:54,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:56:54,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 05:56:54,087 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 05:56:55,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:56:58,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:56:58,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 05:56:58,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:57:03,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 05:57:05,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1549213.3333333333, ans=0.5 2023-10-04 05:57:06,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:57:06,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 05:57:06,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 05:57:07,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 05:57:07,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 05:57:11,631 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 05:57:17,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:57:17,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 05:57:21,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 05:57:23,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1549280.0, ans=0.125 2023-10-04 05:57:28,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 05:57:28,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 05:57:29,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:57:30,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 05:57:36,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 05:57:36,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 05:57:36,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:57:36,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:57:37,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1549346.6666666667, ans=0.125 2023-10-04 05:57:38,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 05:57:43,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:57:44,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 05:57:49,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 05:57:53,310 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.935e+02 2.171e+02 2.462e+02 3.454e+02, threshold=4.341e+02, percent-clipped=0.0 2023-10-04 05:57:56,213 INFO [train.py:1046] (3/4) Epoch 44, batch 4000, loss[loss=0.1752, simple_loss=0.2615, pruned_loss=0.04449, over 24355.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2346, pruned_loss=0.03689, over 4726265.07 frames. ], batch size: 77, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:57:58,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:58:06,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:58:08,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1549480.0, ans=0.125 2023-10-04 05:58:09,013 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.69 vs. limit=15.0 2023-10-04 05:58:11,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:58:11,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:58:13,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 05:58:13,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 05:58:14,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 05:58:15,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 05:58:15,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 05:58:15,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 05:58:17,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:58:20,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 05:58:20,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:58:20,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 05:58:20,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:58:20,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 05:58:23,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 05:58:23,335 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 05:58:24,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:58:25,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:58:27,445 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 05:58:28,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 05:58:28,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:58:35,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 05:58:35,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:58:39,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 05:58:39,351 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 05:58:42,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 05:58:42,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 05:58:42,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:58:43,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:58:43,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 05:58:45,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 05:58:45,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 05:58:45,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 05:58:48,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 05:58:48,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 05:58:51,308 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 05:58:52,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.24 vs. limit=10.0 2023-10-04 05:58:55,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 05:58:56,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 05:58:58,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 05:58:58,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1549746.6666666667, ans=0.125 2023-10-04 05:58:59,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:58:59,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 05:59:01,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:59:07,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 05:59:10,865 INFO [train.py:1046] (3/4) Epoch 44, batch 4050, loss[loss=0.1475, simple_loss=0.2246, pruned_loss=0.03515, over 23349.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2349, pruned_loss=0.03705, over 4722531.17 frames. ], batch size: 119, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 05:59:10,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 05:59:12,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 05:59:15,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 05:59:15,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:59:16,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 05:59:17,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:59:19,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:59:20,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1549813.3333333333, ans=0.2 2023-10-04 05:59:22,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 05:59:25,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 05:59:25,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 05:59:28,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 05:59:29,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 05:59:32,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:59:33,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 05:59:33,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1549880.0, ans=0.125 2023-10-04 05:59:37,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 05:59:39,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 05:59:39,124 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 05:59:42,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 05:59:45,116 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.48 vs. limit=10.0 2023-10-04 05:59:47,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 05:59:47,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 05:59:51,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:59:53,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1550013.3333333333, ans=0.025 2023-10-04 05:59:54,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 05:59:56,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 05:59:56,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 05:59:59,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:00:03,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 06:00:04,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:00:05,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:00:07,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 06:00:10,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:00:16,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 06:00:16,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:00:16,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:00:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 06:00:21,022 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.954e+02 2.228e+02 2.538e+02 4.386e+02, threshold=4.457e+02, percent-clipped=1.0 2023-10-04 06:00:21,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 06:00:21,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:00:23,766 INFO [train.py:1046] (3/4) Epoch 44, batch 4100, loss[loss=0.1557, simple_loss=0.246, pruned_loss=0.03271, over 24449.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2353, pruned_loss=0.03722, over 4732328.23 frames. ], batch size: 69, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:00:23,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:00:23,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:25,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:00:26,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1550146.6666666667, ans=0.1 2023-10-04 06:00:32,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 06:00:32,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 06:00:34,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 06:00:35,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 06:00:35,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:00:35,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:35,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:36,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:00:37,022 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 06:00:40,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:00:41,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:00:43,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:00:43,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:00:48,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:00:49,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:00:49,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:00:49,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 06:00:50,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:00:50,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:00:52,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:00:52,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:00:53,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 06:00:55,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:00:57,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 06:00:58,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:00:59,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:00:59,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 06:01:02,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:01:03,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:01:03,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:01:05,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 06:01:06,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:01:06,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:01:08,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 06:01:09,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:01:09,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:01:12,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:01:17,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:01:19,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:01:20,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:01:26,064 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.64 vs. limit=22.5 2023-10-04 06:01:28,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:01:28,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:01:30,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:01:33,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:01:37,887 INFO [train.py:1046] (3/4) Epoch 44, batch 4150, loss[loss=0.1393, simple_loss=0.2195, pruned_loss=0.02955, over 24480.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2347, pruned_loss=0.03753, over 4715390.97 frames. ], batch size: 58, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:01:38,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:01:39,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:01:39,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1550480.0, ans=0.125 2023-10-04 06:01:40,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:01:40,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:01:44,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 06:01:45,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:01:45,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 06:01:45,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 06:01:45,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 06:01:46,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1550480.0, ans=0.1 2023-10-04 06:01:47,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:01:47,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.71 vs. limit=6.0 2023-10-04 06:01:53,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:01:53,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:01:57,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:01:59,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:02:00,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:02:02,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:02:02,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:02:02,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:02:06,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:02:09,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:02:10,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 06:02:12,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1550613.3333333333, ans=0.0 2023-10-04 06:02:13,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 06:02:13,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:02:15,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 06:02:15,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:02:15,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:02:17,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1550613.3333333333, ans=0.125 2023-10-04 06:02:18,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:18,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:02:20,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1550613.3333333333, ans=0.0 2023-10-04 06:02:23,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 06:02:27,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:02:27,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:02:28,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 06:02:28,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:02:30,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 06:02:30,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:02:32,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:02:34,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:36,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 06:02:36,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:02:36,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 06:02:37,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:02:40,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 06:02:40,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:40,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:02:40,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 06:02:42,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 06:02:42,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:02:43,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 06:02:45,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:02:46,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1550746.6666666667, ans=0.1 2023-10-04 06:02:46,634 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.45 vs. limit=22.5 2023-10-04 06:02:46,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:02:48,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 06:02:48,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:02:49,483 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.733e+02 2.108e+02 2.384e+02 3.215e+02 5.967e+02, threshold=4.767e+02, percent-clipped=2.0 2023-10-04 06:02:52,739 INFO [train.py:1046] (3/4) Epoch 44, batch 4200, loss[loss=0.1433, simple_loss=0.2297, pruned_loss=0.02846, over 24300.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.234, pruned_loss=0.03732, over 4720620.20 frames. ], batch size: 61, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:02:52,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:02:54,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 06:02:55,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:02:57,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:02:57,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1550813.3333333333, ans=0.0 2023-10-04 06:02:58,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:02:59,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:02:59,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:03:03,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 06:03:05,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 06:03:05,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:08,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:03:10,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:03:13,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 06:03:14,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:03:16,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:16,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 06:03:16,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:03:18,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:18,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:03:18,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:03:20,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:03:23,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 06:03:23,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:03:26,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:03:26,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:03:28,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:03:28,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:03:29,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1550946.6666666667, ans=0.125 2023-10-04 06:03:29,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1550946.6666666667, ans=0.125 2023-10-04 06:03:32,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:03:32,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 06:03:32,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:03:33,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:03:36,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1551013.3333333333, ans=0.125 2023-10-04 06:03:39,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:03:39,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:03:42,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1551013.3333333333, ans=0.2 2023-10-04 06:03:46,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:03:48,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 06:03:50,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:03:53,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1551080.0, ans=15.0 2023-10-04 06:03:54,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:03:55,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:03:58,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 06:04:03,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:04:06,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:04:06,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:04:07,446 INFO [train.py:1046] (3/4) Epoch 44, batch 4250, loss[loss=0.1659, simple_loss=0.2559, pruned_loss=0.03789, over 24046.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2329, pruned_loss=0.03699, over 4716554.36 frames. ], batch size: 80, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:04:08,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:12,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:04:14,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 06:04:14,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:04:17,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:17,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1551146.6666666667, ans=0.0 2023-10-04 06:04:20,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:04:24,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:24,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:25,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:04:25,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:04:27,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:28,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:28,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:31,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:04:33,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:04:34,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 06:04:39,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 06:04:39,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:40,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:04:42,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:04:42,347 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:04:43,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:04:43,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:43,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:04:47,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1551280.0, ans=0.0 2023-10-04 06:04:48,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:04:49,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:04:51,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:04:54,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:04:54,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 06:04:55,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:04:55,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 06:04:57,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:04:58,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:04:59,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:04:59,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:05:02,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 06:05:03,490 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.26 vs. limit=6.0 2023-10-04 06:05:04,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:05:04,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1551346.6666666667, ans=0.0 2023-10-04 06:05:05,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:05:07,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:05:09,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:05:10,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:05:12,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:05:13,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:05:13,702 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:05:15,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:05:16,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:05:16,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 06:05:18,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:05:19,447 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 1.987e+02 2.246e+02 2.555e+02 4.795e+02, threshold=4.492e+02, percent-clipped=1.0 2023-10-04 06:05:19,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1551413.3333333333, ans=0.125 2023-10-04 06:05:21,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1551480.0, ans=0.0 2023-10-04 06:05:22,551 INFO [train.py:1046] (3/4) Epoch 44, batch 4300, loss[loss=0.1373, simple_loss=0.214, pruned_loss=0.03036, over 24457.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2334, pruned_loss=0.037, over 4711732.41 frames. ], batch size: 58, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:05:22,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:05:23,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:05:26,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:05:34,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:05:34,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 06:05:35,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:05:37,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:05:37,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:05:38,947 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 06:05:40,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:05:41,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:05:44,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 06:05:44,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:05:44,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 06:05:49,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 06:05:51,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:05:53,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:05:54,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:05:55,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:05:56,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:05:58,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:05:58,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 06:05:59,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 06:06:00,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:06:03,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:03,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 06:06:03,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:04,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:06:04,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 06:06:04,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 06:06:06,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 06:06:08,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:06:08,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 06:06:09,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 06:06:10,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1551680.0, ans=0.2 2023-10-04 06:06:15,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:06:16,711 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 06:06:16,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:06:18,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:18,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:06:20,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 06:06:21,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:06:21,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:21,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:06:21,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:06:23,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:06:24,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:06:26,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:27,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:06:27,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:06:34,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 06:06:34,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:06:35,918 INFO [train.py:1046] (3/4) Epoch 44, batch 4350, loss[loss=0.1547, simple_loss=0.2424, pruned_loss=0.0335, over 24480.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2348, pruned_loss=0.03764, over 4706684.25 frames. ], batch size: 66, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:06:38,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:06:40,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:44,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:06:44,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:06:50,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:06:53,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:06:57,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:06:57,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:06:59,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:07:00,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:07:00,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:07:06,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 06:07:08,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:07:08,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:13,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:16,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 06:07:19,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:07:20,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:07:21,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1552013.3333333333, ans=0.0 2023-10-04 06:07:24,428 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 06:07:26,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:07:26,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:07:27,684 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 06:07:27,755 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 06:07:27,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:07:29,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:07:30,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:07:30,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:07:30,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1552013.3333333333, ans=0.125 2023-10-04 06:07:31,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:07:31,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:07:32,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1552013.3333333333, ans=0.125 2023-10-04 06:07:34,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 06:07:34,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:34,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:07:34,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:36,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 06:07:37,511 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 06:07:37,516 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 06:07:37,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 06:07:41,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:07:42,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:07:42,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:07:43,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:07:44,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 06:07:46,191 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 06:07:46,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:47,973 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.987e+02 2.204e+02 2.500e+02 5.176e+02, threshold=4.408e+02, percent-clipped=1.0 2023-10-04 06:07:50,785 INFO [train.py:1046] (3/4) Epoch 44, batch 4400, loss[loss=0.147, simple_loss=0.2269, pruned_loss=0.03351, over 24421.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2349, pruned_loss=0.03752, over 4710148.46 frames. ], batch size: 58, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:07:52,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:07:52,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:07:54,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:07:55,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 06:07:56,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 06:07:56,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 06:07:58,150 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 06:07:59,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:07:59,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:08:02,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 06:08:05,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:08:05,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:05,385 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 06:08:09,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:08:09,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 06:08:09,448 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 06:08:14,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 06:08:14,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 06:08:15,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 06:08:15,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:15,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:08:16,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:08:16,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:08:18,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 06:08:18,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 06:08:19,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:08:21,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:08:21,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:08:22,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:24,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:08:24,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 06:08:26,229 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 06:08:29,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:08:35,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:08:38,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 06:08:39,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1552346.6666666667, ans=0.0 2023-10-04 06:08:40,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:08:43,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:08:45,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:08:46,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 06:08:46,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:08:46,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:08:46,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:08:48,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:08:52,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 06:08:55,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 06:08:57,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 06:08:57,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:08:57,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 06:08:59,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:09:01,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:09:03,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 06:09:05,157 INFO [train.py:1046] (3/4) Epoch 44, batch 4450, loss[loss=0.1411, simple_loss=0.217, pruned_loss=0.03258, over 24472.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2354, pruned_loss=0.03771, over 4709890.62 frames. ], batch size: 58, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:09:06,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:09:08,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1552480.0, ans=0.2 2023-10-04 06:09:09,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:09,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:09:15,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:09:16,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:09:18,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:20,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:09:23,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:09:25,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:09:25,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 06:09:25,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:09:25,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:27,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:09:27,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:09:29,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 06:09:34,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.00 vs. limit=15.0 2023-10-04 06:09:35,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:09:36,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:09:38,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:09:39,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:09:41,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:09:43,202 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.53 vs. limit=10.0 2023-10-04 06:09:45,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 06:09:47,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 06:09:47,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 06:09:47,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:09:47,604 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:09:50,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:09:50,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 06:09:53,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:09:54,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1552680.0, ans=0.125 2023-10-04 06:09:57,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:09:57,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 06:09:57,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:09:57,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:09:57,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:09:57,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:09:59,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:10:01,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:10:01,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 06:10:03,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:10:06,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:10:08,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:10:09,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:10:10,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 06:10:12,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:10:14,822 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.948e+02 2.234e+02 2.482e+02 3.571e+02, threshold=4.467e+02, percent-clipped=0.0 2023-10-04 06:10:14,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 06:10:16,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:10:17,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1552813.3333333333, ans=0.125 2023-10-04 06:10:18,169 INFO [train.py:1046] (3/4) Epoch 44, batch 4500, loss[loss=0.1937, simple_loss=0.2645, pruned_loss=0.0615, over 19677.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2354, pruned_loss=0.03796, over 4697591.36 frames. ], batch size: 388, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:10:18,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1552813.3333333333, ans=0.1 2023-10-04 06:10:22,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:10:22,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 06:10:22,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 06:10:22,741 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:10:25,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:10:30,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:10:30,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:10:30,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.39 vs. limit=15.0 2023-10-04 06:10:31,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:10:31,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:10:32,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:10:33,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:10:46,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:10:46,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:10:47,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:10:49,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:10:49,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:10:55,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:11:00,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:11:04,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:11:05,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:11:05,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 06:11:07,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:07,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:11:09,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:11:10,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:11:13,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:11:13,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 06:11:13,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:11:13,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:14,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1553013.3333333333, ans=0.125 2023-10-04 06:11:17,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:11:18,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:11:20,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:23,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:11:23,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:11:24,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 06:11:26,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 06:11:26,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 06:11:30,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 06:11:32,274 INFO [train.py:1046] (3/4) Epoch 44, batch 4550, loss[loss=0.1406, simple_loss=0.2224, pruned_loss=0.02942, over 24313.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2342, pruned_loss=0.03752, over 4686945.79 frames. ], batch size: 61, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:11:32,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 06:11:33,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:11:36,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:11:36,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:11:40,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:11:42,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1553146.6666666667, ans=0.1 2023-10-04 06:11:45,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:11:46,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:11:48,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:11:48,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:11:48,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:11:49,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:11:51,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:11:51,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1553213.3333333333, ans=0.1 2023-10-04 06:11:54,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1553213.3333333333, ans=0.125 2023-10-04 06:11:55,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:11:58,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 06:11:58,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 06:11:59,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:12:00,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 06:12:03,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 06:12:05,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:12:07,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 06:12:09,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:12:12,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:12,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:13,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:12:15,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 06:12:17,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:12:18,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1553346.6666666667, ans=0.125 2023-10-04 06:12:19,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:20,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1553346.6666666667, ans=0.125 2023-10-04 06:12:21,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:12:22,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:12:22,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 06:12:24,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 06:12:24,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:12:26,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 06:12:26,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 06:12:27,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:12:27,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:12:27,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:12:30,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:30,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:12:32,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:12:33,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 06:12:33,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1553413.3333333333, ans=0.0 2023-10-04 06:12:34,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:12:34,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 06:12:34,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 06:12:34,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:12:35,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1553413.3333333333, ans=0.125 2023-10-04 06:12:36,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 06:12:37,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:12:37,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:12:40,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:12:40,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:12:40,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:12:42,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:12:43,833 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 1.968e+02 2.214e+02 2.646e+02 3.245e+02, threshold=4.428e+02, percent-clipped=0.0 2023-10-04 06:12:43,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:12:44,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1553413.3333333333, ans=0.2 2023-10-04 06:12:44,889 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.52 vs. limit=10.0 2023-10-04 06:12:45,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:12:46,541 INFO [train.py:1046] (3/4) Epoch 44, batch 4600, loss[loss=0.1533, simple_loss=0.2229, pruned_loss=0.04185, over 23632.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.233, pruned_loss=0.0373, over 4683205.83 frames. ], batch size: 232, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:12:46,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:12:49,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:12:50,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:12:50,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:12:52,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 06:12:53,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:12:57,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:12:58,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:13:01,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:07,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 06:13:08,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:12,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:15,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:13:15,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:13:18,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 06:13:18,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:13:18,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1553613.3333333333, ans=0.125 2023-10-04 06:13:20,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:13:26,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:26,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:13:28,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:13:32,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 06:13:32,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:13:36,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:37,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:13:40,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:40,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 06:13:40,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:13:42,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 06:13:42,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:42,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:13:43,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:13:45,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:13:45,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:13:45,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1553746.6666666667, ans=0.1 2023-10-04 06:13:46,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 06:13:48,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 06:13:48,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 06:13:48,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:13:49,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:13:49,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:13:51,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:14:00,587 INFO [train.py:1046] (3/4) Epoch 44, batch 4650, loss[loss=0.1482, simple_loss=0.2306, pruned_loss=0.0329, over 24451.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2333, pruned_loss=0.03692, over 4704264.64 frames. ], batch size: 63, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:14:00,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:14:02,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:14:02,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:14:02,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1553813.3333333333, ans=0.0 2023-10-04 06:14:03,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:14:03,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:14:03,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:14:04,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:14:07,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 06:14:10,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:14:10,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1553813.3333333333, ans=0.125 2023-10-04 06:14:10,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.86 vs. limit=22.5 2023-10-04 06:14:12,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 06:14:12,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:14:13,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 06:14:13,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:14:13,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 06:14:14,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 06:14:14,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:16,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:14:19,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:14:20,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:14:20,449 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 06:14:24,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:14:25,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 06:14:27,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:27,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:14:29,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 06:14:30,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:14:33,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:14:36,602 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.61 vs. limit=15.0 2023-10-04 06:14:37,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:14:42,498 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.85 vs. limit=22.5 2023-10-04 06:14:43,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:45,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:14:45,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:14:46,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:14:46,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1554013.3333333333, ans=0.125 2023-10-04 06:14:47,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 06:14:49,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 06:14:49,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 06:14:49,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 06:14:51,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:14:57,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1554013.3333333333, ans=0.2 2023-10-04 06:15:00,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:15:00,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:15:00,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 06:15:01,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:03,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:15:03,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:15:04,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:15:07,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:15:07,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:15:07,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:15:08,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1554080.0, ans=0.125 2023-10-04 06:15:11,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:15:11,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:15:12,920 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.072e+02 2.400e+02 3.094e+02 4.124e+02, threshold=4.800e+02, percent-clipped=0.0 2023-10-04 06:15:12,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:15:13,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 06:15:14,307 INFO [train.py:1046] (3/4) Epoch 44, batch 4700, loss[loss=0.165, simple_loss=0.2458, pruned_loss=0.04212, over 23757.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2337, pruned_loss=0.03695, over 4703380.76 frames. ], batch size: 212, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:15:14,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:15:14,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 06:15:21,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1554146.6666666667, ans=0.0 2023-10-04 06:15:22,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:22,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:15:22,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:15:24,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:15:25,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:15:30,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 06:15:31,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 06:15:33,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:35,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:15:36,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:15:37,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:15:37,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1554213.3333333333, ans=0.0 2023-10-04 06:15:43,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:15:44,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 06:15:46,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:15:53,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 06:15:54,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:15:58,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:15:59,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1554346.6666666667, ans=0.125 2023-10-04 06:16:00,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 06:16:03,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:16:03,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1554346.6666666667, ans=0.125 2023-10-04 06:16:08,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:16:08,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 06:16:09,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:09,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:16:11,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:16:11,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:16:11,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 06:16:13,850 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 06:16:15,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:16:15,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:15,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:15,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 06:16:17,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:16:20,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 06:16:22,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:16:25,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:16:27,072 INFO [train.py:1046] (3/4) Epoch 44, batch 4750, loss[loss=0.1725, simple_loss=0.241, pruned_loss=0.05198, over 23744.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2352, pruned_loss=0.03706, over 4716818.69 frames. ], batch size: 179, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:16:28,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:16:29,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:16:30,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 06:16:30,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:16:34,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 06:16:35,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1554480.0, ans=0.05 2023-10-04 06:16:37,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:16:37,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:16:38,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:16:41,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 06:16:45,376 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.53 vs. limit=15.0 2023-10-04 06:16:46,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:16:47,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 06:16:48,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:16:50,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:16:50,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:16:50,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:16:51,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1554546.6666666667, ans=0.0 2023-10-04 06:16:52,147 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 06:16:52,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 06:16:57,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 06:16:59,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:17:02,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:04,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:17:04,921 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 06:17:04,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:17:08,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:17:11,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:17:12,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 06:17:12,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 06:17:12,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:17:12,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:17:12,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:17:14,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 06:17:14,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 06:17:15,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 06:17:19,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:17:22,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:17:22,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 06:17:22,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1554680.0, ans=0.1 2023-10-04 06:17:23,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:17:24,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1554680.0, ans=0.1 2023-10-04 06:17:24,098 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:17:25,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:17:27,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:17:27,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:28,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:17:28,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1554746.6666666667, ans=0.0 2023-10-04 06:17:31,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:17:32,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 06:17:33,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 06:17:34,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 06:17:35,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:17:36,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:17:37,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1554746.6666666667, ans=0.125 2023-10-04 06:17:38,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 06:17:39,576 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.95 vs. limit=22.5 2023-10-04 06:17:40,117 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 2.005e+02 2.194e+02 2.471e+02 3.662e+02, threshold=4.389e+02, percent-clipped=0.0 2023-10-04 06:17:41,397 INFO [train.py:1046] (3/4) Epoch 44, batch 4800, loss[loss=0.1858, simple_loss=0.2546, pruned_loss=0.05854, over 22733.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2354, pruned_loss=0.03706, over 4724053.02 frames. ], batch size: 322, lr: 2.30e-03, grad_scale: 32.0 2023-10-04 06:17:42,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:42,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:17:43,226 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.00 vs. limit=12.0 2023-10-04 06:17:48,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:17:48,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:17:48,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:17:50,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 06:17:50,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:17:51,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:17:54,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:17:59,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:00,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:00,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:18:02,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:02,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 06:18:02,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:18:03,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:18:06,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:09,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:18:10,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:18:10,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:18:11,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 06:18:13,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:15,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 06:18:15,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 06:18:15,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:15,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:18:16,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:18:16,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:18:16,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:18:18,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1554946.6666666667, ans=0.125 2023-10-04 06:18:20,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:18:20,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:18:25,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:18:27,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:30,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:18:34,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 06:18:34,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:18:35,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:35,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:18:37,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:40,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:18:40,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1555080.0, ans=0.125 2023-10-04 06:18:41,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:18:41,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:43,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:18:44,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:18:45,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:18:50,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:18:50,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:50,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:18:50,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 06:18:53,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 06:18:54,255 INFO [train.py:1046] (3/4) Epoch 44, batch 4850, loss[loss=0.1736, simple_loss=0.2568, pruned_loss=0.04522, over 24353.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2364, pruned_loss=0.03722, over 4726833.93 frames. ], batch size: 77, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:18:54,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:54,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:18:54,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:18:54,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:18:57,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:18:59,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1555146.6666666667, ans=0.1 2023-10-04 06:19:04,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 06:19:05,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:19:09,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:19:11,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:19:11,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:19:15,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:19:15,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:19:17,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:19:17,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 06:19:21,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:19:22,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:19:23,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:19:25,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:19:25,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 06:19:27,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:19:27,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:19:28,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1555280.0, ans=0.125 2023-10-04 06:19:33,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:19:33,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 06:19:34,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 06:19:35,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:19:36,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1555280.0, ans=0.0 2023-10-04 06:19:42,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:19:43,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 06:19:44,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:19:44,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:19:46,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:19:48,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 06:19:48,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:19:49,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 06:19:51,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:19:51,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1555346.6666666667, ans=0.1 2023-10-04 06:19:52,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:19:52,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 06:19:54,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1555413.3333333333, ans=0.125 2023-10-04 06:20:00,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:20:04,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=1555413.3333333333, ans=0.025 2023-10-04 06:20:05,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:20:05,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:20:08,909 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.027e+02 2.287e+02 2.709e+02 3.937e+02, threshold=4.574e+02, percent-clipped=0.0 2023-10-04 06:20:08,938 INFO [train.py:1046] (3/4) Epoch 44, batch 4900, loss[loss=0.1387, simple_loss=0.1998, pruned_loss=0.03886, over 22556.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2351, pruned_loss=0.03725, over 4727726.19 frames. ], batch size: 322, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:20:11,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 06:20:11,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:20:17,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:20:19,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:20:19,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:20:20,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1555480.0, ans=0.125 2023-10-04 06:20:21,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 06:20:23,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1555546.6666666667, ans=0.0 2023-10-04 06:20:26,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 06:20:27,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 06:20:29,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 06:20:29,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:20:30,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:20:30,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:20:30,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:20:30,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:20:32,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 06:20:34,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 06:20:35,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:20:35,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:20:37,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:20:39,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:20:40,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:20:41,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:20:41,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 06:20:43,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:20:45,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:20:45,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 06:20:45,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 06:20:49,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 06:20:50,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:20:52,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:20:52,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:20:53,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:20:53,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 06:20:54,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:20:54,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 06:20:57,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:20:57,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1555680.0, ans=0.125 2023-10-04 06:20:58,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:20:59,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1555680.0, ans=0.0 2023-10-04 06:21:00,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:21:04,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 06:21:05,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:21:05,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 06:21:05,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 06:21:07,510 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:21:12,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:21:13,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1555746.6666666667, ans=0.125 2023-10-04 06:21:14,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:21:16,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 06:21:16,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 06:21:16,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:21:18,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:21:21,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:21:21,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:21:22,803 INFO [train.py:1046] (3/4) Epoch 44, batch 4950, loss[loss=0.1546, simple_loss=0.2475, pruned_loss=0.03086, over 24474.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2343, pruned_loss=0.03673, over 4739123.91 frames. ], batch size: 69, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:21:22,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:21:22,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 06:21:22,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:21:25,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:21:25,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 06:21:28,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 06:21:28,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 06:21:28,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:21:28,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 06:21:29,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:29,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:21:29,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:21:31,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:21:33,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:21:35,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:21:36,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:21:37,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:21:40,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:40,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:21:40,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1555880.0, ans=0.0 2023-10-04 06:21:44,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:21:49,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1555880.0, ans=0.0 2023-10-04 06:21:50,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:50,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:21:52,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:21:52,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:21:53,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:21:55,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 06:21:56,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 06:21:59,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:02,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:22:02,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:22:03,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:22:03,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:22:05,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:22:07,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:22:10,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:22:13,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:22:13,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:22:13,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:14,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 06:22:14,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:22:16,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:22:18,116 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.74 vs. limit=12.0 2023-10-04 06:22:20,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:22:23,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:22:23,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:22:23,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:23,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:22:23,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:22:26,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:22:26,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:22:26,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:22:27,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 06:22:29,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1556080.0, ans=0.2 2023-10-04 06:22:32,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:22:35,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 06:22:35,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 06:22:35,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1556146.6666666667, ans=0.125 2023-10-04 06:22:37,369 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.058e+02 2.273e+02 2.535e+02 3.965e+02, threshold=4.546e+02, percent-clipped=0.0 2023-10-04 06:22:37,396 INFO [train.py:1046] (3/4) Epoch 44, batch 5000, loss[loss=0.1562, simple_loss=0.2334, pruned_loss=0.03956, over 23264.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2337, pruned_loss=0.03644, over 4729476.52 frames. ], batch size: 119, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:22:42,056 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.98 vs. limit=15.0 2023-10-04 06:22:42,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:22:42,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:22:45,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 06:22:45,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 06:22:45,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1556146.6666666667, ans=0.0 2023-10-04 06:22:48,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:22:48,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 06:22:49,268 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=4.65 vs. limit=15.0 2023-10-04 06:22:49,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:22:49,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:22:51,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 06:22:51,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:22:53,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:22:54,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 06:22:54,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:22:54,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:22:57,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 06:22:57,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 06:22:57,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:22:58,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 06:22:58,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:22:58,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:22:59,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:22:59,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 06:22:59,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 06:23:01,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 06:23:01,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:23:02,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:23:04,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 06:23:04,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:23:06,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:23:07,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:23:07,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 06:23:09,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 06:23:10,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:23:11,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:23:14,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1556280.0, ans=0.125 2023-10-04 06:23:15,178 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 06:23:17,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:23:19,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:23:19,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:22,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 06:23:22,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:23:22,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:23:22,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:23:24,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1556346.6666666667, ans=0.125 2023-10-04 06:23:25,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 06:23:25,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:23:28,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:23:29,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:23:34,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 06:23:38,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:39,374 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.50 vs. limit=15.0 2023-10-04 06:23:48,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:23:50,158 INFO [train.py:1046] (3/4) Epoch 44, batch 5050, loss[loss=0.1385, simple_loss=0.2132, pruned_loss=0.03186, over 24296.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2344, pruned_loss=0.03666, over 4737389.03 frames. ], batch size: 56, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:23:50,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:50,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:23:50,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:23:50,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:23:51,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:23:51,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:57,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:23:57,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 06:23:57,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:24:00,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.30 vs. limit=10.0 2023-10-04 06:24:00,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:24:01,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:24:01,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 06:24:03,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:24:03,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:24:06,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:24:08,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:24:08,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:24:19,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 06:24:19,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:24:20,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:24:20,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 06:24:21,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:24:23,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:24:23,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:24:24,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:24:24,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 06:24:24,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 06:24:26,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:24:28,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:24:31,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:24:32,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 06:24:33,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:24:36,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 06:24:38,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:24:38,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:24:39,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:24:40,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:24:42,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:24:44,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:24:46,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:24:46,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:24:46,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:24:47,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 06:24:48,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:24:48,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:24:51,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:24:51,858 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 06:24:51,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:24:53,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:24:53,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:24:53,398 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 06:24:54,127 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.35 vs. limit=15.0 2023-10-04 06:24:55,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:24:55,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 06:24:56,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:24:57,057 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.95 vs. limit=22.5 2023-10-04 06:25:00,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:25:00,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:25:00,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 06:25:01,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 06:25:03,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1556813.3333333333, ans=0.125 2023-10-04 06:25:05,127 INFO [train.py:1046] (3/4) Epoch 44, batch 5100, loss[loss=0.1515, simple_loss=0.2355, pruned_loss=0.03372, over 24668.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2354, pruned_loss=0.03663, over 4743789.18 frames. ], batch size: 65, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 06:25:05,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:25:05,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:25:06,395 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.012e+02 2.210e+02 2.512e+02 3.231e+02, threshold=4.420e+02, percent-clipped=0.0 2023-10-04 06:25:06,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:25:09,264 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 06:25:11,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:25:14,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 06:25:16,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 06:25:17,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:25:18,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:25:21,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:25:21,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 06:25:21,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 06:25:26,435 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.05 vs. limit=15.0 2023-10-04 06:25:27,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:25:27,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:25:29,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:25:33,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 06:25:33,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:25:36,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:25:36,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 06:25:37,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:25:39,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:25:39,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 06:25:41,185 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.43 vs. limit=12.0 2023-10-04 06:25:41,940 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 06:25:43,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:25:43,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 06:25:43,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 06:25:47,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:25:55,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:25:59,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 06:25:59,548 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 06:25:59,555 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 06:26:02,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 06:26:02,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:26:03,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 06:26:07,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 06:26:08,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 06:26:10,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:26:11,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 06:26:12,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:26:13,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1557080.0, ans=0.1 2023-10-04 06:26:13,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1557080.0, ans=0.125 2023-10-04 06:26:14,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 06:26:19,212 INFO [train.py:1046] (3/4) Epoch 44, batch 5150, loss[loss=0.1542, simple_loss=0.225, pruned_loss=0.04175, over 23366.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2365, pruned_loss=0.03712, over 4732357.34 frames. ], batch size: 119, lr: 2.30e-03, grad_scale: 8.0 2023-10-04 06:26:19,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:26:19,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:26:19,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:26:19,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:26:19,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:26:19,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1557146.6666666667, ans=0.0 2023-10-04 06:26:20,041 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.95 vs. limit=12.0 2023-10-04 06:26:20,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:26:20,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 06:26:20,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 06:26:22,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 06:26:22,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:26:22,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 06:26:23,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:26:24,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 06:26:26,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:26:27,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:26:33,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:26:33,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 06:26:34,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:26:35,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:26:38,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:26:38,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:26:38,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:26:38,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:26:38,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:26:38,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 06:26:41,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:26:41,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:26:44,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:26:44,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 06:26:46,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:26:50,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1557280.0, ans=0.125 2023-10-04 06:26:53,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:26:54,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 06:26:57,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:27:00,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1557280.0, ans=0.1 2023-10-04 06:27:03,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:27:05,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:27:05,702 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.32 vs. limit=15.0 2023-10-04 06:27:09,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:27:11,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:27:11,882 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.27 vs. limit=22.5 2023-10-04 06:27:12,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 06:27:15,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:27:16,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:27:16,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:27:21,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:27:21,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:27:22,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 06:27:27,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:27:27,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:27:30,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:27:30,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:27:31,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:27:31,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:27:31,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:27:31,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:27:32,825 INFO [train.py:1046] (3/4) Epoch 44, batch 5200, loss[loss=0.1976, simple_loss=0.2687, pruned_loss=0.06328, over 19803.00 frames. ], tot_loss[loss=0.1565, simple_loss=0.2375, pruned_loss=0.03778, over 4718544.83 frames. ], batch size: 388, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:27:33,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:27:34,668 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.022e+02 2.221e+02 2.671e+02 4.836e+02, threshold=4.441e+02, percent-clipped=1.0 2023-10-04 06:27:36,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:27:38,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:27:41,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1557480.0, ans=0.0 2023-10-04 06:27:42,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 06:27:43,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:27:43,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:27:46,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:27:49,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:27:49,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:27:50,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 06:27:52,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:27:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:27:55,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 06:27:57,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:27:59,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:28:01,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 06:28:01,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 06:28:04,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 06:28:04,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:28:04,424 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 06:28:04,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:28:04,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1557613.3333333333, ans=0.125 2023-10-04 06:28:06,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:07,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:28:07,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 06:28:07,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:28:11,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:28:13,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 06:28:14,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 06:28:14,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 06:28:17,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1557680.0, ans=0.0 2023-10-04 06:28:18,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 06:28:20,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:28:25,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:28:25,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:28:27,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 06:28:27,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:28:28,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 06:28:28,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:28,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:28:31,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:28:32,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:28:36,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.45 vs. limit=15.0 2023-10-04 06:28:36,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:28:38,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:28:38,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:40,260 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.36 vs. limit=15.0 2023-10-04 06:28:43,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:28:43,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 06:28:45,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:28:45,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:28:46,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:28:46,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:28:46,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1557813.3333333333, ans=10.0 2023-10-04 06:28:47,800 INFO [train.py:1046] (3/4) Epoch 44, batch 5250, loss[loss=0.1491, simple_loss=0.2308, pruned_loss=0.03368, over 23130.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2366, pruned_loss=0.03761, over 4705191.08 frames. ], batch size: 51, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:28:47,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:28:50,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:28:53,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:28:53,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:28:56,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:28:59,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1557813.3333333333, ans=0.125 2023-10-04 06:29:00,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.80 vs. limit=12.0 2023-10-04 06:29:00,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:29:01,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1557880.0, ans=0.0 2023-10-04 06:29:02,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:29:03,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:29:05,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:29:08,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 06:29:09,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:29:10,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:29:13,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1557880.0, ans=0.125 2023-10-04 06:29:34,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1558013.3333333333, ans=0.125 2023-10-04 06:29:56,289 INFO [train.py:1046] (3/4) Epoch 44, batch 5300, loss[loss=0.1442, simple_loss=0.2213, pruned_loss=0.03356, over 24437.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2351, pruned_loss=0.03731, over 4695270.63 frames. ], batch size: 58, lr: 2.30e-03, grad_scale: 16.0 2023-10-04 06:29:57,452 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.098e+02 2.407e+02 2.784e+02 3.746e+02, threshold=4.815e+02, percent-clipped=0.0 2023-10-04 06:30:01,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1558146.6666666667, ans=0.125 2023-10-04 06:30:01,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1558146.6666666667, ans=0.0 2023-10-04 06:30:05,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1558146.6666666667, ans=0.125 2023-10-04 06:30:07,195 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.74 vs. limit=15.0 2023-10-04 06:30:10,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:30:10,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 06:30:10,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 06:30:10,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:30:10,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:10,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:10,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:10,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:30:10,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:10,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:30:10,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:30:11,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:30:11,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 06:30:11,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 06:30:11,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 06:30:11,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:30:11,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 06:30:11,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 06:30:11,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:12,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:30:12,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:30:12,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:30:12,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:30:12,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:30:12,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:30:12,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:13,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:30:13,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:30:13,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:30:13,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:13,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:30:13,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 06:30:13,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:30:14,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:30:14,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 06:30:14,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 06:30:14,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:30:14,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:30:14,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 06:30:14,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 06:30:14,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:30:15,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:30:15,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:30:15,479 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 06:30:15,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 06:30:15,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:30:15,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:30:15,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 06:30:15,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 06:30:15,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 06:30:16,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:30:22,560 INFO [train.py:1046] (3/4) Epoch 45, batch 0, loss[loss=0.1427, simple_loss=0.2347, pruned_loss=0.02533, over 24491.00 frames. ], tot_loss[loss=0.1427, simple_loss=0.2347, pruned_loss=0.02533, over 24491.00 frames. ], batch size: 66, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:30:22,561 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 06:30:34,478 INFO [train.py:1078] (3/4) Epoch 45, validation: loss=0.3306, simple_loss=0.275, pruned_loss=0.1931, over 1125622.00 frames. 2023-10-04 06:30:34,479 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-04 06:30:35,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 06:30:37,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:30:38,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:30:39,134 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.07 vs. limit=15.0 2023-10-04 06:30:42,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:30:42,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:30:42,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:44,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 06:30:45,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 06:30:46,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:48,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:51,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:30:51,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:30:52,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:30:54,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:30:56,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 06:30:58,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:31:06,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:31:06,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:31:07,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 06:31:12,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:31:12,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:31:13,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:31:16,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:31:19,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:31:22,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1558426.6666666667, ans=0.07 2023-10-04 06:31:25,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 06:31:29,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 06:31:29,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:31:29,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:31:30,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:31:30,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:31:31,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1558493.3333333333, ans=0.0 2023-10-04 06:31:32,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1558493.3333333333, ans=0.125 2023-10-04 06:31:34,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 06:31:37,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:31:37,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:31:42,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:31:46,072 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 06:31:46,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:31:47,519 INFO [train.py:1046] (3/4) Epoch 45, batch 50, loss[loss=0.1516, simple_loss=0.2442, pruned_loss=0.02947, over 24460.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2343, pruned_loss=0.0355, over 1074643.86 frames. ], batch size: 69, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:31:50,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:31:53,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:31:53,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 06:31:53,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:31:53,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:31:55,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1558560.0, ans=0.125 2023-10-04 06:31:56,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:31:56,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:31:57,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:31:59,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1558560.0, ans=0.0 2023-10-04 06:32:00,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 06:32:00,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:32:08,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:32:08,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 06:32:10,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 06:32:12,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:32:12,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1558626.6666666667, ans=0.95 2023-10-04 06:32:13,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:32:13,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:32:14,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:32:15,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1558626.6666666667, ans=0.1 2023-10-04 06:32:16,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:32:16,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:32:16,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:32:23,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1558693.3333333333, ans=0.09899494936611666 2023-10-04 06:32:25,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:32:27,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:32:27,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:32:29,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 06:32:30,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:32:31,598 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.21 vs. limit=12.0 2023-10-04 06:32:32,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:32:32,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 06:32:32,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:32:33,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 06:32:35,230 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:32:41,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:32:41,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:32:41,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1558760.0, ans=0.125 2023-10-04 06:32:43,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:32:44,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:32:44,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:32:45,938 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.033e+02 2.243e+02 2.613e+02 6.562e+02, threshold=4.487e+02, percent-clipped=2.0 2023-10-04 06:32:46,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 06:32:47,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 06:32:47,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:32:48,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:32:48,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:32:50,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:32:50,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 06:32:50,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 06:32:51,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 06:32:53,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:32:53,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:32:53,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 06:32:53,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 06:32:53,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:32:54,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:32:57,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:32:57,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:32:59,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1558826.6666666667, ans=0.1 2023-10-04 06:33:00,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:33:01,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1558893.3333333333, ans=0.125 2023-10-04 06:33:02,056 INFO [train.py:1046] (3/4) Epoch 45, batch 100, loss[loss=0.1617, simple_loss=0.2442, pruned_loss=0.03959, over 23210.00 frames. ], tot_loss[loss=0.1573, simple_loss=0.2388, pruned_loss=0.03789, over 1877514.59 frames. ], batch size: 105, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:33:02,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:33:04,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:33:08,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 06:33:08,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:33:12,193 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.54 vs. limit=6.0 2023-10-04 06:33:13,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:33:13,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:33:13,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:33:13,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:33:13,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:33:14,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 06:33:16,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:33:16,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:33:17,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:33:17,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:33:20,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 06:33:21,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:33:21,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:33:22,552 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.05 vs. limit=6.0 2023-10-04 06:33:23,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:33:23,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1558960.0, ans=0.0 2023-10-04 06:33:26,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:33:30,296 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 06:33:30,320 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 06:33:31,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:33:31,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:33:35,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:33:37,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:33:37,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:33:42,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:33:44,044 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 06:33:44,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1559026.6666666667, ans=0.0 2023-10-04 06:33:46,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 06:33:47,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1559093.3333333333, ans=0.1 2023-10-04 06:33:47,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1559093.3333333333, ans=0.2 2023-10-04 06:33:48,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1559093.3333333333, ans=0.125 2023-10-04 06:33:50,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:33:52,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:33:53,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:33:58,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:33:58,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:34:01,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:34:02,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:34:03,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:04,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:05,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:34:05,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:34:05,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 06:34:06,786 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 06:34:06,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:08,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:34:08,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:08,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:08,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 06:34:08,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:34:08,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:34:08,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:08,519 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:34:09,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:11,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:13,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:34:13,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:34:16,642 INFO [train.py:1046] (3/4) Epoch 45, batch 150, loss[loss=0.1571, simple_loss=0.2414, pruned_loss=0.03645, over 24003.00 frames. ], tot_loss[loss=0.1574, simple_loss=0.2391, pruned_loss=0.03779, over 2523278.89 frames. ], batch size: 80, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:34:16,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:19,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:34:19,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:34:19,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:23,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:23,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:26,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:34:26,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1559226.6666666667, ans=0.125 2023-10-04 06:34:28,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:31,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 06:34:31,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 06:34:31,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 06:34:32,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:34:32,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:34:32,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:34:34,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:34:34,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:34,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:35,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:34:36,825 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 06:34:38,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:34:40,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1559293.3333333333, ans=0.0 2023-10-04 06:34:45,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:34:46,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:34:49,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 06:34:52,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:34:52,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:34:53,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:34:54,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:34:56,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:34:56,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:34:58,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:34:59,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 06:35:03,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:35:05,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:05,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:35:05,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:35:06,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:35:06,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1559426.6666666667, ans=0.015 2023-10-04 06:35:08,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 06:35:12,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:35:14,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:35:14,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1559493.3333333333, ans=0.125 2023-10-04 06:35:16,219 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 1.999e+02 2.185e+02 2.526e+02 4.113e+02, threshold=4.369e+02, percent-clipped=0.0 2023-10-04 06:35:16,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:35:16,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1559493.3333333333, ans=0.125 2023-10-04 06:35:17,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:35:17,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 06:35:18,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:35:18,998 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 06:35:21,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:35:23,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=1559493.3333333333, ans=0.05 2023-10-04 06:35:25,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1559493.3333333333, ans=0.0 2023-10-04 06:35:26,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:35:26,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:35:28,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 06:35:28,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:35:29,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:30,715 INFO [train.py:1046] (3/4) Epoch 45, batch 200, loss[loss=0.175, simple_loss=0.243, pruned_loss=0.05356, over 23817.00 frames. ], tot_loss[loss=0.1579, simple_loss=0.2392, pruned_loss=0.03828, over 3016106.09 frames. ], batch size: 150, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:35:32,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 06:35:33,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 06:35:36,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:35:37,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:35:37,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1559560.0, ans=0.125 2023-10-04 06:35:43,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:35:43,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:35:43,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:01,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:36:02,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:36:02,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:36:03,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:36:05,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 06:36:05,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:36:07,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1559693.3333333333, ans=0.125 2023-10-04 06:36:08,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:08,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:36:09,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:36:09,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:36:11,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 06:36:12,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 06:36:12,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:12,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1559760.0, ans=0.0 2023-10-04 06:36:17,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:36:22,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:36:28,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:29,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:36:35,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:37,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 06:36:37,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:37,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:36:37,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:36:39,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:36:39,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 06:36:40,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:36:40,994 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 06:36:42,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1559893.3333333333, ans=0.0 2023-10-04 06:36:43,594 INFO [train.py:1046] (3/4) Epoch 45, batch 250, loss[loss=0.1678, simple_loss=0.2504, pruned_loss=0.0426, over 24007.00 frames. ], tot_loss[loss=0.1572, simple_loss=0.2386, pruned_loss=0.03791, over 3381975.09 frames. ], batch size: 80, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:36:43,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:45,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:36:46,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:48,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:36:50,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:36:50,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:36:52,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:36:56,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:37:02,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1559960.0, ans=0.0 2023-10-04 06:37:04,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:37:06,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:37:07,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:37:14,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:37:14,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:37:16,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:37:16,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:37:18,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:37:18,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:37:19,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:37:22,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:37:24,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 06:37:25,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:37:25,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:37:27,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:37:27,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:37:27,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:37:27,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1560093.3333333333, ans=0.2 2023-10-04 06:37:28,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:37:28,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:37:30,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:37:31,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:37:31,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:37:35,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:37:39,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:37:41,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:37:42,789 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.020e+02 2.227e+02 2.604e+02 5.544e+02, threshold=4.454e+02, percent-clipped=1.0 2023-10-04 06:37:46,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:37:50,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:37:53,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 06:37:55,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:37:55,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 06:37:58,152 INFO [train.py:1046] (3/4) Epoch 45, batch 300, loss[loss=0.1346, simple_loss=0.2139, pruned_loss=0.02766, over 24416.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2363, pruned_loss=0.03723, over 3681979.60 frames. ], batch size: 58, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:37:58,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 06:37:58,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:37:58,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:37:58,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 06:38:03,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:38:03,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:38:07,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:38:09,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 06:38:10,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:38:11,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:38:11,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 06:38:11,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:38:14,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:38:14,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1560293.3333333333, ans=0.0 2023-10-04 06:38:20,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:38:20,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1560293.3333333333, ans=0.0 2023-10-04 06:38:21,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 06:38:24,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 06:38:24,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:27,066 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.13 vs. limit=10.0 2023-10-04 06:38:28,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:38:30,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:30,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 06:38:30,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:38:32,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:38:35,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:38:35,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:38:35,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1560360.0, ans=0.1 2023-10-04 06:38:40,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 06:38:40,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 06:38:40,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:38:43,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:44,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 06:38:46,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:38:49,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:38:52,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:38:52,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 06:38:55,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:38:56,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:38:58,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:39:00,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:39:02,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 06:39:02,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:39:02,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:39:03,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 06:39:05,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:39:06,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:06,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:39:07,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:09,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:11,731 INFO [train.py:1046] (3/4) Epoch 45, batch 350, loss[loss=0.1517, simple_loss=0.241, pruned_loss=0.0312, over 24313.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2353, pruned_loss=0.03653, over 3914467.34 frames. ], batch size: 74, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:39:11,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:39:11,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 06:39:16,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:20,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:39:23,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:23,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1560560.0, ans=0.0 2023-10-04 06:39:24,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:27,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 06:39:29,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:39:29,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 06:39:32,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:32,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 06:39:32,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:39:36,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 06:39:38,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:39:38,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:39:39,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:39:42,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:39:42,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:39:42,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:39:42,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:43,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:39:44,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:39:44,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:39:45,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1560693.3333333333, ans=0.125 2023-10-04 06:39:50,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1560693.3333333333, ans=0.035 2023-10-04 06:39:52,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:39:52,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:39:53,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:39:54,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:39:59,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 06:39:59,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:40:04,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:40:04,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:40:04,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:40:05,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 06:40:07,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:07,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1560760.0, ans=0.125 2023-10-04 06:40:09,866 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 06:40:09,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 06:40:09,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:11,196 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.951e+02 2.094e+02 2.424e+02 4.061e+02, threshold=4.188e+02, percent-clipped=0.0 2023-10-04 06:40:12,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:40:12,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 06:40:14,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1560826.6666666667, ans=0.0 2023-10-04 06:40:14,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1560826.6666666667, ans=0.125 2023-10-04 06:40:15,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:17,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:40:19,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:20,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:20,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:40:22,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:40:26,053 INFO [train.py:1046] (3/4) Epoch 45, batch 400, loss[loss=0.1622, simple_loss=0.2555, pruned_loss=0.03444, over 24278.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2339, pruned_loss=0.03632, over 4083624.77 frames. ], batch size: 74, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:40:26,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:40:29,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:40:30,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 06:40:30,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:30,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:40:33,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:40:33,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:36,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:38,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:39,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 06:40:41,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 06:40:41,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:40:41,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1560960.0, ans=0.0 2023-10-04 06:40:42,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 06:40:42,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:45,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:40:45,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:40:45,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 06:40:45,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:40:47,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:40:47,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:40:47,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:40:50,624 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 06:40:50,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 06:40:56,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:40:56,683 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.14 vs. limit=15.0 2023-10-04 06:40:57,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:40:57,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 06:40:59,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 06:41:01,711 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.13 vs. limit=15.0 2023-10-04 06:41:03,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:41:05,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:41:11,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 06:41:13,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:41:13,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 06:41:16,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:41:18,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:41:19,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 06:41:21,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:41:24,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:41:24,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:41:27,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:41:27,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 06:41:29,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1561160.0, ans=0.125 2023-10-04 06:41:30,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:41:30,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 06:41:33,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:41:33,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:41:34,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 06:41:36,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_na.min_abs, batch_count=1561160.0, ans=0.02 2023-10-04 06:41:37,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:41:38,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:41:39,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 06:41:39,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1561226.6666666667, ans=0.025 2023-10-04 06:41:40,721 INFO [train.py:1046] (3/4) Epoch 45, batch 450, loss[loss=0.1615, simple_loss=0.2366, pruned_loss=0.04324, over 23835.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2347, pruned_loss=0.0364, over 4230487.95 frames. ], batch size: 150, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:41:40,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 06:41:40,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:41:40,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:41:42,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:41:42,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 06:41:42,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:41:43,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 06:41:46,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:41:55,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:41:55,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:41:57,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 06:41:59,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 06:42:03,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:42:06,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:42:06,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:42:06,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1561293.3333333333, ans=0.125 2023-10-04 06:42:09,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:42:10,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:42:13,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 06:42:13,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 06:42:16,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 06:42:16,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:42:17,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:42:19,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:42:22,284 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 06:42:22,294 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 06:42:22,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:42:24,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:42:25,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 06:42:28,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:42:28,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:42:30,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 06:42:30,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 06:42:33,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:42:34,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:42:35,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:42:37,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 06:42:39,789 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.888e+02 2.086e+02 2.350e+02 3.841e+02, threshold=4.172e+02, percent-clipped=0.0 2023-10-04 06:42:41,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:42:42,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 06:42:42,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 06:42:44,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 06:42:46,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1561493.3333333333, ans=0.0 2023-10-04 06:42:48,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:42:51,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:42:53,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:42:53,313 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 06:42:54,431 INFO [train.py:1046] (3/4) Epoch 45, batch 500, loss[loss=0.1481, simple_loss=0.2284, pruned_loss=0.03395, over 23549.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2348, pruned_loss=0.03654, over 4339021.65 frames. ], batch size: 120, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:42:56,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:42:57,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:42:57,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:42:59,233 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 06:43:00,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 06:43:00,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:43:02,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:43:05,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 06:43:06,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:43:09,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:43:09,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:43:10,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:19,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:43:19,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 06:43:21,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 06:43:22,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:43:22,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 06:43:22,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 06:43:27,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:43:28,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:43:28,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:43:28,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:43:30,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 06:43:33,416 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 06:43:33,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1561693.3333333333, ans=10.0 2023-10-04 06:43:34,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:43:38,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:38,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:39,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:39,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:43:42,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 06:43:43,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:43:45,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:43:47,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:43:51,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:43:53,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1561826.6666666667, ans=0.2 2023-10-04 06:43:57,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:43:59,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 06:43:59,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:43:59,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:44:03,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 06:44:03,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 06:44:05,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:44:07,313 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:44:09,784 INFO [train.py:1046] (3/4) Epoch 45, batch 550, loss[loss=0.1567, simple_loss=0.2481, pruned_loss=0.0326, over 24676.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2363, pruned_loss=0.03715, over 4431699.29 frames. ], batch size: 73, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:44:09,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 06:44:12,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 06:44:12,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:44:12,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 06:44:12,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:44:12,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:44:12,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1561893.3333333333, ans=0.2 2023-10-04 06:44:14,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:14,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:14,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:44:15,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:44:19,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:44:19,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 06:44:20,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:44:22,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1561960.0, ans=0.125 2023-10-04 06:44:26,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:44:26,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:28,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:44:30,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:34,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 06:44:35,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 06:44:37,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:44:38,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1562026.6666666667, ans=0.125 2023-10-04 06:44:40,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1562026.6666666667, ans=0.5 2023-10-04 06:44:42,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:44:42,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:44:44,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:44:48,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:44:48,036 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 06:44:48,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:44:49,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 06:44:52,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:44:52,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 06:44:52,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:44:54,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:44:55,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 06:44:57,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 06:44:58,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:44:58,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:44:58,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:44:58,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:45:01,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:45:02,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:45:06,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:45:07,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:07,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 06:45:09,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:45:10,265 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.015e+02 2.214e+02 2.569e+02 3.684e+02, threshold=4.428e+02, percent-clipped=0.0 2023-10-04 06:45:11,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:45:12,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:45:12,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:12,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1562160.0, ans=0.0 2023-10-04 06:45:15,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 06:45:15,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 06:45:20,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 06:45:24,043 INFO [train.py:1046] (3/4) Epoch 45, batch 600, loss[loss=0.1415, simple_loss=0.2232, pruned_loss=0.0299, over 23674.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2358, pruned_loss=0.03698, over 4489243.83 frames. ], batch size: 149, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:45:25,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 06:45:27,603 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.23 vs. limit=15.0 2023-10-04 06:45:28,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:45:28,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:45:28,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:45:31,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1562226.6666666667, ans=0.125 2023-10-04 06:45:33,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1562226.6666666667, ans=0.5 2023-10-04 06:45:34,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:45:35,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 06:45:37,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 06:45:40,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:45:40,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:45:43,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:43,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1562293.3333333333, ans=0.0 2023-10-04 06:45:44,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 06:45:44,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:45:51,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 06:45:52,867 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.25 vs. limit=5.0 2023-10-04 06:45:55,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:45:55,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:45:55,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:46:02,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:46:02,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:46:02,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:46:08,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:46:13,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:46:13,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:46:13,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:46:20,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 06:46:20,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1562426.6666666667, ans=0.125 2023-10-04 06:46:24,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:46:24,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:46:30,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 06:46:30,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:46:32,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 06:46:33,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:46:33,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1562493.3333333333, ans=0.125 2023-10-04 06:46:34,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:46:38,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1562560.0, ans=10.0 2023-10-04 06:46:39,056 INFO [train.py:1046] (3/4) Epoch 45, batch 650, loss[loss=0.1569, simple_loss=0.2251, pruned_loss=0.04435, over 23710.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.235, pruned_loss=0.03734, over 4530667.33 frames. ], batch size: 232, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:46:39,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 06:46:41,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 06:46:41,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1562560.0, ans=0.0 2023-10-04 06:46:42,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:46:45,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:46:46,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:46:48,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 06:46:49,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:46:49,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1562560.0, ans=0.2 2023-10-04 06:46:52,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:46:52,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:46:57,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:46:57,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1562626.6666666667, ans=0.0 2023-10-04 06:47:01,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 06:47:01,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:47:02,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:47:06,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:47:06,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 06:47:09,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:09,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:09,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:47:10,548 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.27 vs. limit=22.5 2023-10-04 06:47:11,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:12,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 06:47:14,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:47:15,392 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 06:47:15,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:15,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:47:18,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1562693.3333333333, ans=0.125 2023-10-04 06:47:19,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:19,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:47:19,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:47:20,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 06:47:22,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 06:47:22,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:47:23,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 06:47:24,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1562760.0, ans=0.125 2023-10-04 06:47:25,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 06:47:25,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:47:26,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 06:47:28,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 06:47:28,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1562760.0, ans=0.125 2023-10-04 06:47:29,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 06:47:29,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:29,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:47:29,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:47:29,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:47:32,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:47:34,875 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.94 vs. limit=10.0 2023-10-04 06:47:39,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:47:39,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:47:40,356 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.090e+02 2.308e+02 2.598e+02 4.000e+02, threshold=4.617e+02, percent-clipped=0.0 2023-10-04 06:47:40,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:47:44,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:47:44,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 06:47:45,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:47:47,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1562826.6666666667, ans=0.1 2023-10-04 06:47:52,608 INFO [train.py:1046] (3/4) Epoch 45, batch 700, loss[loss=0.1566, simple_loss=0.2368, pruned_loss=0.03816, over 23301.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2334, pruned_loss=0.03708, over 4557441.04 frames. ], batch size: 93, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:47:52,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:47:52,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:47:52,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:47:53,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:47:58,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 06:48:00,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 06:48:01,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 06:48:01,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:48:03,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:48:04,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 06:48:09,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:48:11,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:48:12,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:48:13,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 06:48:13,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:48:15,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1562960.0, ans=0.07 2023-10-04 06:48:16,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:48:17,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 06:48:19,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:48:20,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 06:48:22,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 06:48:23,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1563026.6666666667, ans=0.125 2023-10-04 06:48:26,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:48:26,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:48:27,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.79 vs. limit=6.0 2023-10-04 06:48:28,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:48:32,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:48:32,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 06:48:33,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.82 vs. limit=15.0 2023-10-04 06:48:38,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:48:38,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:48:40,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 06:48:42,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:48:43,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:48:43,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1563093.3333333333, ans=0.125 2023-10-04 06:48:45,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.51 vs. limit=15.0 2023-10-04 06:48:45,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:48:51,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:48:51,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 06:48:51,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1563160.0, ans=0.125 2023-10-04 06:48:52,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1563160.0, ans=0.125 2023-10-04 06:48:54,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 06:48:54,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 06:48:57,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:00,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:49:00,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:49:03,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:03,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 06:49:06,724 INFO [train.py:1046] (3/4) Epoch 45, batch 750, loss[loss=0.1471, simple_loss=0.2239, pruned_loss=0.03519, over 23890.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2336, pruned_loss=0.03687, over 4591002.46 frames. ], batch size: 195, lr: 2.27e-03, grad_scale: 16.0 2023-10-04 06:49:08,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 06:49:08,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 06:49:08,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 06:49:10,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 06:49:10,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 06:49:10,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:49:12,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 06:49:13,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:13,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:49:15,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1563226.6666666667, ans=0.1 2023-10-04 06:49:16,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:49:18,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:49:18,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:49:18,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:49:19,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1563293.3333333333, ans=0.125 2023-10-04 06:49:22,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:49:22,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:49:23,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:49:25,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:49:25,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:49:26,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 06:49:26,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 06:49:28,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:49:30,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:49:31,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 06:49:33,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 06:49:33,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:49:33,801 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.28 vs. limit=15.0 2023-10-04 06:49:36,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 06:49:36,459 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 06:49:36,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 06:49:36,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:49:36,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 06:49:38,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:49:42,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:49:44,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:49:44,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:49:45,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:49:46,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:49:46,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 06:49:48,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:49:49,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 06:49:49,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:49:49,794 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 06:49:53,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:49:53,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 06:49:53,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:49:59,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:49:59,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:50:01,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:04,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:50:07,901 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 2.004e+02 2.219e+02 2.508e+02 4.389e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-04 06:50:08,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 06:50:09,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:50:10,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:50:12,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:50:12,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:50:14,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:50:14,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 06:50:20,451 INFO [train.py:1046] (3/4) Epoch 45, batch 800, loss[loss=0.1506, simple_loss=0.2274, pruned_loss=0.03689, over 23802.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2343, pruned_loss=0.03674, over 4627764.39 frames. ], batch size: 195, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:50:21,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:50:21,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:23,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:50:23,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:50:23,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1563560.0, ans=0.0 2023-10-04 06:50:24,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:24,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:27,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:30,062 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.92 vs. limit=15.0 2023-10-04 06:50:32,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:50:32,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:50:36,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 06:50:38,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:39,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:50:40,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:50:40,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:50:40,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 06:50:41,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:50:41,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 06:50:44,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:45,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:50:47,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1563626.6666666667, ans=0.0 2023-10-04 06:50:48,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:50:48,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:50:49,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:51,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:50:54,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:50:55,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:50:55,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 06:50:56,755 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 06:50:57,282 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.48 vs. limit=22.5 2023-10-04 06:50:58,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 06:50:58,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 06:50:58,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:50:58,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1563693.3333333333, ans=0.0 2023-10-04 06:50:59,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:50:59,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:51:03,407 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 06:51:04,019 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.32 vs. limit=15.0 2023-10-04 06:51:04,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 06:51:06,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:51:08,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 06:51:12,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:51:16,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:51:18,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 06:51:18,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:51:22,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 06:51:28,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:51:31,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:51:31,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 06:51:32,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:51:32,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:51:34,462 INFO [train.py:1046] (3/4) Epoch 45, batch 850, loss[loss=0.1368, simple_loss=0.2176, pruned_loss=0.02797, over 20390.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2352, pruned_loss=0.03707, over 4655866.51 frames. ], batch size: 44, lr: 2.27e-03, grad_scale: 32.0 2023-10-04 06:51:34,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 06:51:34,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:51:37,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:51:39,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:51:42,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:51:42,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:51:43,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 06:51:45,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 06:51:45,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 06:51:45,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:51:45,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:51:47,525 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.90 vs. limit=12.0 2023-10-04 06:51:48,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:51:48,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:51:48,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:51:51,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1563960.0, ans=0.125 2023-10-04 06:51:52,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:51:53,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:51:53,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 06:51:56,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 06:51:56,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1563960.0, ans=0.1 2023-10-04 06:52:00,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:52:00,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 06:52:02,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1564026.6666666667, ans=0.125 2023-10-04 06:52:02,697 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.12 vs. limit=10.0 2023-10-04 06:52:03,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 06:52:07,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 06:52:09,024 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 06:52:09,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:52:09,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:52:09,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 06:52:09,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1564026.6666666667, ans=0.1 2023-10-04 06:52:11,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:52:13,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:52:13,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 06:52:16,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 06:52:16,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:52:17,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:52:19,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 06:52:20,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 06:52:20,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 06:52:21,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 06:52:22,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1564093.3333333333, ans=0.125 2023-10-04 06:52:26,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:52:26,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:52:26,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:52:26,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:52:26,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1564093.3333333333, ans=0.125 2023-10-04 06:52:26,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1564093.3333333333, ans=0.035 2023-10-04 06:52:27,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:52:30,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:52:31,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 06:52:33,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:52:33,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:52:33,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 06:52:33,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1564160.0, ans=0.0 2023-10-04 06:52:34,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1564160.0, ans=0.2 2023-10-04 06:52:34,987 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.097e+02 2.346e+02 2.719e+02 4.087e+02, threshold=4.692e+02, percent-clipped=0.0 2023-10-04 06:52:38,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1564160.0, ans=0.2 2023-10-04 06:52:42,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 06:52:43,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:52:43,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 06:52:44,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:52:44,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:52:46,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 06:52:48,995 INFO [train.py:1046] (3/4) Epoch 45, batch 900, loss[loss=0.1595, simple_loss=0.2445, pruned_loss=0.03729, over 24641.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2359, pruned_loss=0.03719, over 4672137.01 frames. ], batch size: 65, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 06:52:51,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:52:52,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1564226.6666666667, ans=0.1 2023-10-04 06:52:54,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:52:54,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 06:52:57,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:52:57,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 06:52:58,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 06:53:00,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:53:01,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:53:01,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 06:53:01,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:53:12,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:12,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:53:14,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:53:16,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:53:17,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.95 vs. limit=22.5 2023-10-04 06:53:20,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 06:53:23,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:53:25,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 06:53:27,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 06:53:27,378 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 06:53:28,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 06:53:34,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 06:53:34,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:53:35,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 06:53:39,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:39,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:53:42,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 06:53:42,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:53:44,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 06:53:46,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:53:47,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:48,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:53:48,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:53:51,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 06:53:51,754 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 06:53:53,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 06:53:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 06:53:57,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:53:59,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 06:54:03,126 INFO [train.py:1046] (3/4) Epoch 45, batch 950, loss[loss=0.1646, simple_loss=0.2426, pruned_loss=0.04327, over 23331.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2353, pruned_loss=0.03677, over 4694701.19 frames. ], batch size: 93, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 06:54:04,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1564560.0, ans=0.05 2023-10-04 06:54:07,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:54:09,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:09,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:10,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 06:54:12,447 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 06:54:16,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:16,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:54:17,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:54:17,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:54:18,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 06:54:21,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 06:54:21,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:23,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 06:54:25,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:54:28,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:28,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:54:28,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:54:29,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 06:54:30,197 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.48 vs. limit=22.5 2023-10-04 06:54:30,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 06:54:31,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1564693.3333333333, ans=0.125 2023-10-04 06:54:32,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:54:33,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:54:40,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:54:40,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:54:44,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 06:54:44,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 06:54:44,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 06:54:46,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:54:47,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:47,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 06:54:52,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 06:54:53,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:54:56,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:54:57,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:54:57,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 06:54:57,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:54:57,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:54:58,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 06:55:01,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:55:04,547 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.543e+02 1.998e+02 2.203e+02 2.459e+02 3.275e+02, threshold=4.407e+02, percent-clipped=0.0 2023-10-04 06:55:04,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:55:04,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1564826.6666666667, ans=0.2 2023-10-04 06:55:10,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:55:10,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1564826.6666666667, ans=0.1 2023-10-04 06:55:12,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 06:55:12,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 06:55:15,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:55:17,331 INFO [train.py:1046] (3/4) Epoch 45, batch 1000, loss[loss=0.1484, simple_loss=0.2274, pruned_loss=0.03473, over 24334.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2353, pruned_loss=0.03666, over 4706733.87 frames. ], batch size: 56, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 06:55:20,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 06:55:20,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:55:20,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1564893.3333333333, ans=0.125 2023-10-04 06:55:25,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:55:26,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 06:55:26,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 06:55:30,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:55:30,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:55:32,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:55:34,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 06:55:37,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 06:55:39,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 06:55:40,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:55:44,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 06:55:45,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 06:55:45,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 06:55:47,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:55:47,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:55:50,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1565026.6666666667, ans=0.2 2023-10-04 06:55:57,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:55:59,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:55:59,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:00,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:56:00,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 06:56:00,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:56:01,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 06:56:01,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:56:02,014 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 06:56:03,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 06:56:05,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 06:56:07,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 06:56:10,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:56:12,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1565093.3333333333, ans=0.125 2023-10-04 06:56:16,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:18,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 06:56:18,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:18,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 06:56:20,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 06:56:21,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 06:56:21,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 06:56:23,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 06:56:24,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:56:24,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:56:28,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:56:29,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:56:30,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:56:32,347 INFO [train.py:1046] (3/4) Epoch 45, batch 1050, loss[loss=0.1438, simple_loss=0.2226, pruned_loss=0.03249, over 23338.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2343, pruned_loss=0.03632, over 4719166.03 frames. ], batch size: 119, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 06:56:33,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:56:35,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:56:36,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 06:56:37,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:56:40,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:56:41,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1565226.6666666667, ans=0.0 2023-10-04 06:56:42,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 06:56:44,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 06:56:47,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:56:47,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 06:56:47,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 06:56:48,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 06:56:50,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 06:56:50,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:56:50,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 06:56:51,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:56:51,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 06:56:53,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 06:56:59,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:57:00,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:57:00,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:57:01,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 06:57:02,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 06:57:02,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 06:57:06,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 06:57:08,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 06:57:08,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:57:09,464 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.23 vs. limit=10.0 2023-10-04 06:57:12,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1565360.0, ans=0.125 2023-10-04 06:57:13,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 06:57:15,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 06:57:16,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:57:16,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:57:21,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 06:57:24,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 06:57:24,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1565426.6666666667, ans=0.1 2023-10-04 06:57:25,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 06:57:27,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 06:57:27,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:57:27,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 06:57:30,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 06:57:33,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:57:34,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 06:57:34,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:57:34,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:57:36,081 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 2.013e+02 2.292e+02 2.767e+02 4.836e+02, threshold=4.583e+02, percent-clipped=4.0 2023-10-04 06:57:36,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:57:39,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:57:39,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 06:57:40,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 06:57:40,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 06:57:40,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 06:57:41,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:57:46,716 INFO [train.py:1046] (3/4) Epoch 45, batch 1100, loss[loss=0.1745, simple_loss=0.2618, pruned_loss=0.0436, over 23995.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2338, pruned_loss=0.03643, over 4719189.46 frames. ], batch size: 86, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 06:57:46,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:57:51,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:57:55,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 06:57:57,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 06:57:57,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:57:58,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 06:58:00,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:58:02,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 06:58:04,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:58:06,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 06:58:06,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 06:58:08,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 06:58:09,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:58:09,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 06:58:11,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 06:58:11,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1565626.6666666667, ans=0.0 2023-10-04 06:58:12,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 06:58:17,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:58:20,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 06:58:22,091 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 06:58:22,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:22,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1565693.3333333333, ans=0.125 2023-10-04 06:58:24,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:26,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 06:58:26,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 06:58:28,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 06:58:28,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 06:58:28,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 06:58:28,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:58:29,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:29,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 06:58:32,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1565760.0, ans=0.0 2023-10-04 06:58:36,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 06:58:36,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 06:58:37,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 06:58:38,659 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=12.0 2023-10-04 06:58:40,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 06:58:43,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 06:58:43,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 06:58:45,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1565826.6666666667, ans=0.125 2023-10-04 06:58:46,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:58:48,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:58:50,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:58:50,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 06:58:51,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 06:58:51,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 06:58:53,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 06:58:55,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 06:58:55,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 06:58:55,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1565826.6666666667, ans=0.0 2023-10-04 06:58:56,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:58:56,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 06:58:56,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 06:59:00,776 INFO [train.py:1046] (3/4) Epoch 45, batch 1150, loss[loss=0.1536, simple_loss=0.2289, pruned_loss=0.03915, over 23463.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2342, pruned_loss=0.03697, over 4707478.64 frames. ], batch size: 135, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 06:59:02,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:59:04,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 06:59:06,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:59:06,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 06:59:06,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 06:59:06,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:59:09,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 06:59:11,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:59:11,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 06:59:14,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1565960.0, ans=0.0 2023-10-04 06:59:16,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 06:59:19,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1565960.0, ans=0.125 2023-10-04 06:59:21,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:59:24,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 06:59:26,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:59:26,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 06:59:26,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 06:59:27,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 06:59:30,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 06:59:33,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 06:59:34,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 06:59:36,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1566026.6666666667, ans=0.0 2023-10-04 06:59:43,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:59:48,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 06:59:48,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 06:59:50,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:59:50,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 06:59:57,258 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 06:59:57,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1566093.3333333333, ans=0.025 2023-10-04 06:59:59,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:00:03,653 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 07:00:04,888 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.046e+02 2.266e+02 2.583e+02 3.861e+02, threshold=4.532e+02, percent-clipped=0.0 2023-10-04 07:00:07,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:07,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1566160.0, ans=0.125 2023-10-04 07:00:08,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:00:10,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:00:10,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:00:11,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:00:14,503 INFO [train.py:1046] (3/4) Epoch 45, batch 1200, loss[loss=0.1616, simple_loss=0.2517, pruned_loss=0.03569, over 24388.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2352, pruned_loss=0.03732, over 4694371.20 frames. ], batch size: 77, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:00:16,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:00:16,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:00:18,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:00:18,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:18,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:00:22,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:00:24,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:00:25,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:00:25,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:00:28,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1566226.6666666667, ans=0.2 2023-10-04 07:00:29,187 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 07:00:31,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 07:00:32,563 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.32 vs. limit=12.0 2023-10-04 07:00:32,794 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.47 vs. limit=15.0 2023-10-04 07:00:33,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:00:34,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:00:36,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1566293.3333333333, ans=0.0 2023-10-04 07:00:37,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:00:39,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:00:39,055 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 07:00:40,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:46,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:00:46,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:00:47,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 07:00:48,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:00:52,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 07:00:57,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 07:00:57,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:00:57,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:00:59,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:01:00,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:01:01,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:01:01,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:01:03,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:01:03,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 07:01:04,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:01:05,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:01:05,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:01:07,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:01:07,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:01:10,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1566426.6666666667, ans=0.125 2023-10-04 07:01:12,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:01:13,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1566493.3333333333, ans=0.125 2023-10-04 07:01:14,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:01:17,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 07:01:19,948 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 07:01:21,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:01:23,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:01:24,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:01:25,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1566493.3333333333, ans=0.125 2023-10-04 07:01:26,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:01:29,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 07:01:30,155 INFO [train.py:1046] (3/4) Epoch 45, batch 1250, loss[loss=0.2102, simple_loss=0.2808, pruned_loss=0.06982, over 19577.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2359, pruned_loss=0.03763, over 4696336.87 frames. ], batch size: 388, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:01:32,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:01:33,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:01:34,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 07:01:37,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:01:38,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:01:41,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:01:42,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:01:44,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:01:44,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:01:47,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:01:49,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:01:51,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:01:51,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:01:52,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:01:54,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:01:56,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:01:57,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:02:02,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 07:02:03,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:02:05,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:02:06,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 07:02:06,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:02:06,754 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 07:02:06,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:08,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:10,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:02:13,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:02:13,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:02:16,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 07:02:16,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 07:02:17,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 07:02:20,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:02:20,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 07:02:20,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:26,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 07:02:26,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:02:28,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 07:02:28,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:02:29,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:02:29,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:02:30,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:02:32,115 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.43 vs. limit=15.0 2023-10-04 07:02:32,845 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.067e+02 2.229e+02 2.579e+02 4.151e+02, threshold=4.458e+02, percent-clipped=0.0 2023-10-04 07:02:32,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 07:02:34,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:02:36,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:02:37,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:02:37,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1566826.6666666667, ans=0.125 2023-10-04 07:02:40,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:02:43,075 INFO [train.py:1046] (3/4) Epoch 45, batch 1300, loss[loss=0.1451, simple_loss=0.2159, pruned_loss=0.03716, over 22695.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2361, pruned_loss=0.0373, over 4715715.96 frames. ], batch size: 322, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:02:43,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:02:43,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 07:02:43,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1566893.3333333333, ans=0.0 2023-10-04 07:02:46,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:02:46,621 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.63 vs. limit=22.5 2023-10-04 07:02:47,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:02:47,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:02:48,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:02:51,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:02:51,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 07:02:56,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:02:59,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:03:01,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 07:03:04,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:03:09,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:03:10,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1566960.0, ans=0.04949747468305833 2023-10-04 07:03:11,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:03:13,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1567026.6666666667, ans=0.07 2023-10-04 07:03:14,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:03:14,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:03:14,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:03:15,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:03:15,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 07:03:19,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:03:19,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:03:20,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1567026.6666666667, ans=0.125 2023-10-04 07:03:21,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 07:03:22,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 07:03:25,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:03:26,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1567093.3333333333, ans=0.125 2023-10-04 07:03:28,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:03:28,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 07:03:28,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:03:28,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 07:03:31,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:03:36,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:03:36,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:03:41,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 07:03:42,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 07:03:43,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 07:03:46,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:03:49,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 07:03:50,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:03:56,821 INFO [train.py:1046] (3/4) Epoch 45, batch 1350, loss[loss=0.1612, simple_loss=0.2371, pruned_loss=0.0426, over 17126.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2355, pruned_loss=0.03727, over 4707541.08 frames. ], batch size: 37, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:03:56,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 07:04:01,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:04:02,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:04,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:04:06,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:04:07,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:04:07,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:04:12,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:04:12,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 07:04:15,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:04:15,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:04:19,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 07:04:19,898 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.12 vs. limit=10.0 2023-10-04 07:04:20,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:04:21,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:04:21,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 07:04:24,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 07:04:26,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 07:04:26,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1567360.0, ans=0.2 2023-10-04 07:04:28,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:28,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 07:04:39,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:39,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1567360.0, ans=0.125 2023-10-04 07:04:47,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:04:47,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:04:48,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 07:04:50,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:04:52,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 07:04:52,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:04:52,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1567426.6666666667, ans=0.1 2023-10-04 07:04:53,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:04:55,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:04:58,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 07:04:58,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:05:01,350 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.751e+02 2.135e+02 2.425e+02 2.910e+02 3.812e+02, threshold=4.850e+02, percent-clipped=0.0 2023-10-04 07:05:02,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 07:05:04,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 07:05:05,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1567493.3333333333, ans=0.125 2023-10-04 07:05:11,838 INFO [train.py:1046] (3/4) Epoch 45, batch 1400, loss[loss=0.1483, simple_loss=0.2397, pruned_loss=0.0285, over 24395.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2346, pruned_loss=0.03708, over 4705732.99 frames. ], batch size: 69, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:05:11,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 07:05:12,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1567560.0, ans=0.125 2023-10-04 07:05:13,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:05:15,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1567560.0, ans=0.0 2023-10-04 07:05:16,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:05:16,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:05:20,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 07:05:23,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 07:05:34,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:05:35,418 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.88 vs. limit=15.0 2023-10-04 07:05:35,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1567626.6666666667, ans=15.0 2023-10-04 07:05:37,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:05:38,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:05:38,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:05:42,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:05:44,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 07:05:53,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:05:53,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:05:57,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 07:05:57,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:05:57,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:05:59,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:06:00,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:06:00,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:06:00,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:06:02,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:06:03,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 07:06:03,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:06:04,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1567760.0, ans=0.125 2023-10-04 07:06:09,398 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.50 vs. limit=15.0 2023-10-04 07:06:09,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:12,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:06:18,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 07:06:20,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 07:06:21,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:06:22,174 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.50 vs. limit=15.0 2023-10-04 07:06:22,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 07:06:24,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:06:25,398 INFO [train.py:1046] (3/4) Epoch 45, batch 1450, loss[loss=0.1638, simple_loss=0.2522, pruned_loss=0.03764, over 24115.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2343, pruned_loss=0.03667, over 4717586.94 frames. ], batch size: 86, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:06:25,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:06:27,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:06:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:06:29,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:29,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 07:06:36,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:06:37,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:06:38,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:06:38,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 07:06:40,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:06:40,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 07:06:41,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:41,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:41,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 07:06:43,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:06:44,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:06:45,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 07:06:45,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:47,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:06:47,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:50,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:52,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:06:52,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:06:56,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:06:56,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:57,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:06:57,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:06:59,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:06:59,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:07:03,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 07:07:06,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1568026.6666666667, ans=0.05 2023-10-04 07:07:06,253 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.65 vs. limit=15.0 2023-10-04 07:07:07,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:07:11,207 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 07:07:12,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:07:14,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:07:14,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:07:15,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 07:07:19,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:07:20,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 07:07:20,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 07:07:20,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:07:23,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:07:23,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:07:25,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 07:07:27,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=1568160.0, ans=0.2 2023-10-04 07:07:28,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 07:07:30,136 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.783e+02 2.030e+02 2.333e+02 2.690e+02 5.278e+02, threshold=4.666e+02, percent-clipped=1.0 2023-10-04 07:07:30,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 07:07:32,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:07:33,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:07:39,667 INFO [train.py:1046] (3/4) Epoch 45, batch 1500, loss[loss=0.1723, simple_loss=0.2366, pruned_loss=0.05398, over 19424.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2341, pruned_loss=0.03707, over 4704528.00 frames. ], batch size: 388, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:07:42,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 07:07:42,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:07:42,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:07:43,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:07:43,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:07:45,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:07:46,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 07:07:48,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:07:49,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:07:49,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:07:49,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:07:50,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1568226.6666666667, ans=0.125 2023-10-04 07:07:52,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:07:54,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:07:58,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:07:58,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 07:07:58,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:08:00,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:08:01,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:08:03,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 07:08:08,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 07:08:09,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:08:10,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 07:08:12,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:08:15,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:08:16,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:08:16,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:08:16,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 07:08:18,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:08:18,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:08:18,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 07:08:18,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:08:24,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:08:24,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 07:08:30,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:08:30,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:08:35,115 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 07:08:35,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:36,454 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 07:08:37,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:08:37,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.82 vs. limit=15.0 2023-10-04 07:08:38,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:08:38,501 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 07:08:39,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:08:42,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 07:08:42,850 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:08:43,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:46,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:08:47,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:48,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:08:49,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:08:49,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:08:50,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 07:08:52,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 07:08:52,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:08:53,557 INFO [train.py:1046] (3/4) Epoch 45, batch 1550, loss[loss=0.1525, simple_loss=0.2428, pruned_loss=0.0311, over 24448.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.235, pruned_loss=0.03721, over 4723869.99 frames. ], batch size: 69, lr: 2.26e-03, grad_scale: 4.0 2023-10-04 07:08:53,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 07:08:53,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 07:08:55,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:08:58,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:08:58,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:08:58,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:09:00,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:09:01,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:09:01,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1568560.0, ans=0.125 2023-10-04 07:09:06,247 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 07:09:06,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:07,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:09:07,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:09:07,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1568626.6666666667, ans=0.1 2023-10-04 07:09:10,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:09:10,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 07:09:12,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:09:12,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 07:09:13,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 07:09:13,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 07:09:13,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:15,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:09:15,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1568626.6666666667, ans=0.2 2023-10-04 07:09:20,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:09:22,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1568693.3333333333, ans=0.0 2023-10-04 07:09:23,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 07:09:23,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 07:09:29,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:09:34,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:09:34,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:09:34,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:09:35,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 07:09:40,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:09:40,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1568760.0, ans=0.125 2023-10-04 07:09:42,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:43,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1568760.0, ans=0.1 2023-10-04 07:09:45,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:09:47,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:09:47,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:09:49,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 07:09:49,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:09:50,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:09:50,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:09:51,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 07:09:51,989 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 07:09:54,262 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.78 vs. limit=15.0 2023-10-04 07:09:55,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:09:59,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 07:10:00,710 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.950e+02 2.179e+02 2.499e+02 3.800e+02, threshold=4.358e+02, percent-clipped=0.0 2023-10-04 07:10:04,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1568826.6666666667, ans=0.125 2023-10-04 07:10:05,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:10:06,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:10:08,093 INFO [train.py:1046] (3/4) Epoch 45, batch 1600, loss[loss=0.1753, simple_loss=0.2578, pruned_loss=0.04638, over 23891.00 frames. ], tot_loss[loss=0.1554, simple_loss=0.2358, pruned_loss=0.03752, over 4716789.11 frames. ], batch size: 86, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:10:08,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 07:10:08,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:10:09,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:10:09,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:10:09,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:10:11,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:10:13,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:10:13,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1568893.3333333333, ans=0.2 2023-10-04 07:10:14,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 07:10:15,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 07:10:17,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 07:10:18,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:10:20,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 07:10:21,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:10:24,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:10:28,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:10:32,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1568960.0, ans=0.125 2023-10-04 07:10:33,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 07:10:35,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:10:35,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 07:10:36,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:10:36,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 07:10:42,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 07:10:50,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:10:50,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 07:10:50,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1569026.6666666667, ans=0.07 2023-10-04 07:10:51,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:10:52,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:10:52,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:10:54,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 07:10:58,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:10:58,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1569093.3333333333, ans=0.1 2023-10-04 07:11:01,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:11:01,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:01,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1569093.3333333333, ans=0.2 2023-10-04 07:11:03,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:04,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:11:06,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:11:07,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:11:09,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:11:10,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1569160.0, ans=0.125 2023-10-04 07:11:12,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1569160.0, ans=0.0 2023-10-04 07:11:15,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:16,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:11:19,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 07:11:19,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:11:19,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 07:11:21,743 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:11:22,721 INFO [train.py:1046] (3/4) Epoch 45, batch 1650, loss[loss=0.1428, simple_loss=0.2238, pruned_loss=0.03093, over 24445.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.236, pruned_loss=0.0369, over 4720082.08 frames. ], batch size: 63, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:11:24,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:11:26,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:11:28,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:11:28,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 07:11:28,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 07:11:28,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 07:11:28,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 07:11:32,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:11:33,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:11:33,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:11:34,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:11:36,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:11:38,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 07:11:40,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:11:40,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:11:40,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:11:40,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:11:41,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 07:11:41,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 07:11:47,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:11:50,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:11:56,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 07:11:58,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:11:58,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1569360.0, ans=0.07 2023-10-04 07:12:00,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 07:12:03,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:12:06,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:12:06,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:12:06,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:12:08,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:12:08,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:11,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:12:11,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:13,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:12:14,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:12:14,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:12:16,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:12:18,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:12:20,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 07:12:21,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:12:21,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 07:12:23,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 07:12:23,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 07:12:23,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:12:23,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:12:23,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:12:25,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:12:25,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 07:12:27,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:12:29,079 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.045e+02 2.430e+02 3.009e+02 4.606e+02, threshold=4.861e+02, percent-clipped=4.0 2023-10-04 07:12:30,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:12:30,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:12:32,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 07:12:36,953 INFO [train.py:1046] (3/4) Epoch 45, batch 1700, loss[loss=0.1346, simple_loss=0.1981, pruned_loss=0.03559, over 22776.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2352, pruned_loss=0.03693, over 4716717.18 frames. ], batch size: 322, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:12:37,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:12:37,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:12:38,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 07:12:38,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:12:38,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:12:38,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:12:40,054 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:12:42,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:12:42,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:12:42,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 07:12:44,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:12:47,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1569560.0, ans=0.125 2023-10-04 07:12:52,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:12:56,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:13:00,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:13:01,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:13:01,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:13:03,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:13:04,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 07:13:08,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:13:09,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:09,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:13:11,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1569693.3333333333, ans=0.0 2023-10-04 07:13:12,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:13:13,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 07:13:13,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 07:13:15,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:15,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 07:13:15,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1569693.3333333333, ans=0.125 2023-10-04 07:13:17,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:13:23,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1569760.0, ans=0.0 2023-10-04 07:13:24,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:13:25,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:13:26,695 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.06 vs. limit=15.0 2023-10-04 07:13:27,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:13:30,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:13:30,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 07:13:30,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:13:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:31,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 07:13:31,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:13:31,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:13:32,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:32,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:13:36,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:13:36,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:13:37,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:13:37,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:13:37,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:13:40,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:13:41,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 07:13:43,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:13:44,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:13:47,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 07:13:50,351 INFO [train.py:1046] (3/4) Epoch 45, batch 1750, loss[loss=0.1424, simple_loss=0.2226, pruned_loss=0.03113, over 24621.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2339, pruned_loss=0.03646, over 4718104.11 frames. ], batch size: 60, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:13:51,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:13:53,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:13:53,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:13:55,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 07:13:55,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:13:59,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:13:59,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:03,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 07:14:07,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:14:10,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 07:14:10,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:14:11,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:14:13,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 07:14:15,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 07:14:16,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:14:16,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 07:14:18,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1569960.0, ans=0.2 2023-10-04 07:14:24,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:14:26,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:14:26,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:14:28,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1570026.6666666667, ans=0.0 2023-10-04 07:14:30,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:30,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:14:31,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:14:35,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:38,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:14:38,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:14:39,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 07:14:41,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:14:43,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 07:14:45,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:14:47,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:14:48,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:14:51,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:14:51,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 07:14:52,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:14:54,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:14:56,584 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.970e+02 2.244e+02 2.815e+02 4.858e+02, threshold=4.488e+02, percent-clipped=0.0 2023-10-04 07:14:58,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:15:00,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:15:02,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:15:02,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 07:15:04,221 INFO [train.py:1046] (3/4) Epoch 45, batch 1800, loss[loss=0.1612, simple_loss=0.2303, pruned_loss=0.04603, over 23778.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2332, pruned_loss=0.0359, over 4731025.08 frames. ], batch size: 164, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:15:04,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:15:06,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:15:06,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:06,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:15:06,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:15:08,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:15:09,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:15:09,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1570226.6666666667, ans=0.0 2023-10-04 07:15:10,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:15:12,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:15:13,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:15:16,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:15:19,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:15:19,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1570293.3333333333, ans=0.2 2023-10-04 07:15:20,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:15:21,583 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.78 vs. limit=6.0 2023-10-04 07:15:23,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:23,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:23,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1570293.3333333333, ans=0.125 2023-10-04 07:15:24,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1570293.3333333333, ans=0.0 2023-10-04 07:15:25,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:15:28,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:15:28,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 07:15:29,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:15:31,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:15:31,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1570293.3333333333, ans=0.125 2023-10-04 07:15:33,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1570360.0, ans=0.1 2023-10-04 07:15:35,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 07:15:37,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 07:15:38,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 07:15:38,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:15:40,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:15:40,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:15:41,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:15:48,331 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 07:15:48,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1570426.6666666667, ans=0.125 2023-10-04 07:15:49,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:15:50,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:15:52,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 07:15:52,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 07:15:52,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:15:55,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:15:55,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:15:57,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 07:16:04,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1570493.3333333333, ans=0.125 2023-10-04 07:16:05,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:16:05,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 07:16:06,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:16:06,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:16:06,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:16:08,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 07:16:09,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:16:11,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:16:13,485 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.05 vs. limit=22.5 2023-10-04 07:16:14,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 07:16:14,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:16:17,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:16:17,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:16:17,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:16:18,857 INFO [train.py:1046] (3/4) Epoch 45, batch 1850, loss[loss=0.1568, simple_loss=0.2352, pruned_loss=0.03918, over 23570.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2338, pruned_loss=0.03618, over 4726282.42 frames. ], batch size: 256, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:16:18,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:16:20,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:16:20,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:16:21,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:16:24,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:16:24,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:16:33,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:16:33,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 07:16:35,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 07:16:38,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 07:16:40,785 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:16:44,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:16:44,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 07:16:44,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 07:16:48,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1570693.3333333333, ans=0.1 2023-10-04 07:16:51,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:16:53,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 07:16:56,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:16:56,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:17:01,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 07:17:01,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:03,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:17:04,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:17:06,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:17:09,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:17:13,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:17:13,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:13,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 07:17:13,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:17:13,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1570760.0, ans=0.125 2023-10-04 07:17:15,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:17:17,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:17:21,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 07:17:21,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:17:24,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:17:25,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:17:25,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 07:17:25,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 07:17:27,196 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.009e+02 2.134e+02 2.480e+02 4.687e+02, threshold=4.268e+02, percent-clipped=1.0 2023-10-04 07:17:27,385 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 07:17:28,723 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 07:17:30,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:17:30,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:17:30,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:17:30,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:31,541 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 07:17:31,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:17:32,795 INFO [train.py:1046] (3/4) Epoch 45, batch 1900, loss[loss=0.159, simple_loss=0.2309, pruned_loss=0.04358, over 23803.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2352, pruned_loss=0.03635, over 4733039.91 frames. ], batch size: 179, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:17:32,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:34,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:17:34,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:17:35,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:17:35,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 07:17:37,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:17:37,684 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 07:17:37,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:17:39,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:17:44,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:17:44,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1570893.3333333333, ans=0.125 2023-10-04 07:17:45,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:17:47,358 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 07:17:47,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 07:17:50,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:17:50,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:17:50,623 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 07:17:51,561 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.89 vs. limit=15.0 2023-10-04 07:17:51,899 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 07:17:54,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 07:17:56,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:18:00,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 07:18:01,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 07:18:05,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1571026.6666666667, ans=0.125 2023-10-04 07:18:09,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1571026.6666666667, ans=0.0 2023-10-04 07:18:12,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 07:18:13,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 07:18:13,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:18:13,982 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 07:18:13,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 07:18:15,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 07:18:17,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 07:18:17,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:18:20,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 07:18:23,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:18:26,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:18:26,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 07:18:27,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:18:30,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 07:18:30,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:18:33,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1571160.0, ans=0.2 2023-10-04 07:18:36,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:18:36,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:18:37,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:18:37,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:18:38,450 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:18:39,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:18:39,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:18:39,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=1571160.0, ans=15.0 2023-10-04 07:18:41,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:18:44,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:18:44,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:18:46,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:18:46,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:18:47,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:18:48,847 INFO [train.py:1046] (3/4) Epoch 45, batch 1950, loss[loss=0.165, simple_loss=0.2444, pruned_loss=0.04284, over 23402.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2352, pruned_loss=0.03642, over 4728464.51 frames. ], batch size: 105, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:18:48,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:18:51,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:18:53,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:18:53,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:18:53,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:18:56,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 07:18:57,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 07:18:59,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:00,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:01,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:19:01,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:03,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:03,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:19:04,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:19:04,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:19:06,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:19:06,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:12,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:14,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1571293.3333333333, ans=0.125 2023-10-04 07:19:16,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:19:16,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:16,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:19:16,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 07:19:16,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:19:16,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:19:18,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:21,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:22,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:19:24,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:19:27,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:19:27,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:19:29,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 07:19:29,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:19:33,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:19:34,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:19:35,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:19:43,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:44,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:47,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:19:52,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:55,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:19:55,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:19:56,527 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.043e+02 2.250e+02 2.599e+02 4.071e+02, threshold=4.500e+02, percent-clipped=0.0 2023-10-04 07:19:56,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 07:19:56,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:19:58,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:19:58,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1571493.3333333333, ans=0.125 2023-10-04 07:19:59,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 07:20:00,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:20:02,307 INFO [train.py:1046] (3/4) Epoch 45, batch 2000, loss[loss=0.16, simple_loss=0.244, pruned_loss=0.03797, over 24467.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2364, pruned_loss=0.03708, over 4713668.29 frames. ], batch size: 66, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:20:05,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:20:06,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:20:07,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:20:08,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:20:10,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:10,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1571560.0, ans=0.05 2023-10-04 07:20:12,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 07:20:13,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:20:14,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1571560.0, ans=0.0 2023-10-04 07:20:18,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:20:19,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 07:20:19,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:20:19,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:20:22,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:20:25,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1571626.6666666667, ans=0.95 2023-10-04 07:20:26,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 07:20:27,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:28,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:28,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:30,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 07:20:32,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:20:33,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 07:20:33,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:20:36,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:20:37,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:20:37,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:37,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:20:37,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:20:39,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 07:20:43,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 07:20:43,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:20:43,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:20:47,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:48,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:20:49,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:20:49,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:20:49,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1571760.0, ans=0.125 2023-10-04 07:20:50,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:20:52,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:52,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:20:52,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:20:52,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:20:55,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:20:57,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 07:21:01,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:21:03,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:07,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:07,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:21:11,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:12,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:21:12,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:15,607 INFO [train.py:1046] (3/4) Epoch 45, batch 2050, loss[loss=0.1603, simple_loss=0.2432, pruned_loss=0.03873, over 24012.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2356, pruned_loss=0.03679, over 4727081.27 frames. ], batch size: 86, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:21:15,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:21:15,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:21:18,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:18,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:20,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:21:21,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:26,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:21:27,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:21:27,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:21:29,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:21:32,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 07:21:32,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:21:33,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:21:33,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:21:39,932 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:21:42,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:21:42,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:43,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 07:21:45,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1572026.6666666667, ans=0.125 2023-10-04 07:21:47,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:21:47,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 07:21:47,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:21:49,481 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.56 vs. limit=15.0 2023-10-04 07:21:50,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:21:52,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:21:52,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1572026.6666666667, ans=0.0 2023-10-04 07:21:54,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:21:56,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:21:57,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:21:59,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:21:59,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:22:01,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1572093.3333333333, ans=0.125 2023-10-04 07:22:02,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:22:03,892 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:22:06,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:22:06,838 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.160e-02 2023-10-04 07:22:08,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:22:11,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1572093.3333333333, ans=0.0 2023-10-04 07:22:12,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:22:16,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.83 vs. limit=15.0 2023-10-04 07:22:16,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:22:17,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 07:22:23,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:22:23,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1572160.0, ans=0.125 2023-10-04 07:22:24,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:22:25,688 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 2.015e+02 2.281e+02 2.581e+02 4.368e+02, threshold=4.563e+02, percent-clipped=0.0 2023-10-04 07:22:27,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:22:29,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 07:22:30,491 INFO [train.py:1046] (3/4) Epoch 45, batch 2100, loss[loss=0.1659, simple_loss=0.2544, pruned_loss=0.03875, over 24016.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2338, pruned_loss=0.03642, over 4734478.55 frames. ], batch size: 80, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:22:31,951 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 07:22:31,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:22:33,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:22:33,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:22:34,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:22:34,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 07:22:34,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 07:22:37,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:22:40,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:22:40,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:22:43,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:22:43,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:22:43,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 07:22:45,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:22:46,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 07:22:46,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 07:22:47,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:22:47,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:22:47,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 07:22:47,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 07:22:49,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1572293.3333333333, ans=0.125 2023-10-04 07:22:52,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 07:22:52,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:22:56,020 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.27 vs. limit=22.5 2023-10-04 07:22:56,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:22:56,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:22:59,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:23:01,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 07:23:01,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:01,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 07:23:04,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 07:23:04,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:04,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 07:23:04,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 07:23:04,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 07:23:07,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:23:08,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:23:11,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:23:11,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:23:13,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:16,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:16,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 07:23:16,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:16,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:16,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:16,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 07:23:18,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 07:23:18,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1572426.6666666667, ans=0.2 2023-10-04 07:23:19,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 07:23:23,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:23:25,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:23:26,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 07:23:32,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:33,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:23:33,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:23:34,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:23:34,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 07:23:35,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:23:37,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:23:37,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:23:38,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:23:38,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:40,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 07:23:41,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 07:23:43,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:23:44,318 INFO [train.py:1046] (3/4) Epoch 45, batch 2150, loss[loss=0.1577, simple_loss=0.2385, pruned_loss=0.03849, over 24516.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2327, pruned_loss=0.03633, over 4721612.70 frames. ], batch size: 66, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:23:46,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:23:46,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:23:46,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:23:47,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:23:50,928 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.44 vs. limit=22.5 2023-10-04 07:23:53,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 07:23:53,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1572560.0, ans=0.0 2023-10-04 07:23:53,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1572560.0, ans=0.125 2023-10-04 07:23:55,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:23:56,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:23:57,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:23:57,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:23:58,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:23:59,601 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:24:01,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:24:01,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:24:01,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:24:03,421 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.53 vs. limit=15.0 2023-10-04 07:24:07,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:07,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 07:24:07,933 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.81 vs. limit=22.5 2023-10-04 07:24:11,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:24:11,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1572626.6666666667, ans=0.025 2023-10-04 07:24:12,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:24:13,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1572693.3333333333, ans=0.035 2023-10-04 07:24:14,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:14,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:24:16,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:16,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:24:16,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:24:16,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:24:17,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:24:18,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 07:24:20,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:24:20,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:24:21,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:24:23,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:24:26,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:24:27,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:24:27,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:24:29,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:24:29,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 07:24:29,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:24:31,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:24:33,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:33,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:24:35,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:24:35,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:36,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:36,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 07:24:36,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1572760.0, ans=0.2 2023-10-04 07:24:38,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 07:24:38,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:24:38,237 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 07:24:38,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:38,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:24:39,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 07:24:39,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:24:39,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 07:24:40,906 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 07:24:40,906 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 07:24:40,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 07:24:41,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1572760.0, ans=0.125 2023-10-04 07:24:42,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:43,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:24:43,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:24:43,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:45,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:24:47,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:24:47,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:24:50,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.82 vs. limit=22.5 2023-10-04 07:24:54,369 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 1.986e+02 2.257e+02 2.566e+02 4.341e+02, threshold=4.515e+02, percent-clipped=0.0 2023-10-04 07:24:55,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:24:55,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 07:24:58,662 INFO [train.py:1046] (3/4) Epoch 45, batch 2200, loss[loss=0.1657, simple_loss=0.238, pruned_loss=0.04666, over 23729.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2333, pruned_loss=0.03623, over 4732526.58 frames. ], batch size: 232, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:25:00,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:25:06,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:25:06,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:25:06,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:07,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:25:08,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1572893.3333333333, ans=0.2 2023-10-04 07:25:09,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:25:10,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:25:10,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 07:25:13,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1572960.0, ans=0.1 2023-10-04 07:25:14,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 07:25:18,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:25:21,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1572960.0, ans=0.125 2023-10-04 07:25:22,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 07:25:26,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:25:26,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:25:27,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:25:30,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:25:31,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 07:25:36,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:25:38,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:25:38,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 07:25:40,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:25:42,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:25:42,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1573093.3333333333, ans=0.1 2023-10-04 07:25:42,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1573093.3333333333, ans=0.025 2023-10-04 07:25:43,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:25:45,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:47,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 07:25:49,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:25:52,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 07:25:55,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:55,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:25:55,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:25:57,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:25:57,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:25:58,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:25:58,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:25:58,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:25:58,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:26:00,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:26:01,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1573160.0, ans=0.125 2023-10-04 07:26:02,428 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.46 vs. limit=22.5 2023-10-04 07:26:04,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 07:26:04,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:26:06,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:26:08,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn1.whiten.whitening_limit, batch_count=1573160.0, ans=22.5 2023-10-04 07:26:08,787 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 07:26:08,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:26:10,890 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 07:26:12,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:26:12,223 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 07:26:13,496 INFO [train.py:1046] (3/4) Epoch 45, batch 2250, loss[loss=0.1506, simple_loss=0.238, pruned_loss=0.03164, over 24484.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2336, pruned_loss=0.03618, over 4725534.27 frames. ], batch size: 63, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:26:13,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:26:14,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:26:16,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:26:17,749 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 07:26:17,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:26:21,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:26:26,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:26:28,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:26:30,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:26:32,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:26:32,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:26:38,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 07:26:38,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:26:39,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:26:42,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 07:26:42,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:26:44,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:26:45,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:26:48,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:26:50,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:26:50,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:26:51,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1573360.0, ans=0.0 2023-10-04 07:26:52,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 07:26:54,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:26:57,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:27:01,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:27:02,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:27:02,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1573426.6666666667, ans=0.125 2023-10-04 07:27:03,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:27:03,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:27:05,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:27:06,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1573426.6666666667, ans=0.0 2023-10-04 07:27:07,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:27:11,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1573426.6666666667, ans=0.0 2023-10-04 07:27:12,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:27:13,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:27:18,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1573493.3333333333, ans=0.1 2023-10-04 07:27:19,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:27:19,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:27:20,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:27:25,018 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 1.969e+02 2.340e+02 2.701e+02 4.212e+02, threshold=4.680e+02, percent-clipped=0.0 2023-10-04 07:27:25,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:27:26,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:27:26,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 07:27:26,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:27:28,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:27:29,735 INFO [train.py:1046] (3/4) Epoch 45, batch 2300, loss[loss=0.1737, simple_loss=0.2445, pruned_loss=0.05148, over 23725.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2342, pruned_loss=0.03636, over 4718651.76 frames. ], batch size: 164, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:27:29,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 07:27:33,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:27:34,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:27:39,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:27:39,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:27:40,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1573560.0, ans=0.125 2023-10-04 07:27:43,154 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 07:27:43,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:27:50,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1573626.6666666667, ans=0.125 2023-10-04 07:27:51,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:27:51,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:27:52,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:27:53,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:27:53,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 07:27:53,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:27:55,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:27:56,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:27:59,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:28:02,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:28:06,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:28:09,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:28:09,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:28:13,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:28:15,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:28:19,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:28:20,372 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.75 vs. limit=15.0 2023-10-04 07:28:21,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:28:21,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:28:21,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 07:28:25,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:28:25,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:28:25,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.27 vs. limit=15.0 2023-10-04 07:28:26,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:28:28,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:28:28,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:28:28,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 07:28:28,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:28:28,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 07:28:28,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1573826.6666666667, ans=0.1 2023-10-04 07:28:29,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:28:29,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:28:29,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 07:28:36,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:28:39,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1573826.6666666667, ans=0.125 2023-10-04 07:28:40,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:28:43,545 INFO [train.py:1046] (3/4) Epoch 45, batch 2350, loss[loss=0.1635, simple_loss=0.2463, pruned_loss=0.04035, over 23272.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2361, pruned_loss=0.03716, over 4708266.95 frames. ], batch size: 105, lr: 2.26e-03, grad_scale: 8.0 2023-10-04 07:28:43,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:28:43,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:28:45,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:28:46,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:28:46,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:28:47,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:28:47,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 07:28:53,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:28:53,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 07:28:53,960 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.23 vs. limit=12.0 2023-10-04 07:28:58,951 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.86 vs. limit=15.0 2023-10-04 07:29:00,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 07:29:02,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:29:04,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:29:05,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:29:05,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:29:06,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:29:06,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 07:29:09,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:29:13,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 07:29:14,571 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.63 vs. limit=22.5 2023-10-04 07:29:15,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:29:18,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:29:18,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:29:19,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:29:21,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 07:29:22,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:29:23,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:29:23,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:29:25,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:29:28,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:29:30,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 07:29:31,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:29:33,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:29:33,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:29:35,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1574093.3333333333, ans=0.125 2023-10-04 07:29:36,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 07:29:37,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:29:39,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 07:29:39,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:29:45,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 07:29:46,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1574160.0, ans=0.0 2023-10-04 07:29:49,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 07:29:49,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:29:49,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 07:29:50,605 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 07:29:50,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 07:29:52,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 07:29:53,385 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.034e+02 2.312e+02 2.597e+02 4.365e+02, threshold=4.623e+02, percent-clipped=0.0 2023-10-04 07:29:55,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:29:58,608 INFO [train.py:1046] (3/4) Epoch 45, batch 2400, loss[loss=0.1683, simple_loss=0.2622, pruned_loss=0.03718, over 24339.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2355, pruned_loss=0.03723, over 4690652.18 frames. ], batch size: 74, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:29:58,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:30:01,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:30:03,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:30:03,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1574226.6666666667, ans=0.125 2023-10-04 07:30:04,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 07:30:04,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 07:30:12,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 07:30:12,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:30:15,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 07:30:16,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:30:17,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:30:17,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 07:30:20,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1574293.3333333333, ans=0.0 2023-10-04 07:30:22,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:30:25,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 07:30:29,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:30:29,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1574360.0, ans=0.1 2023-10-04 07:30:30,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1574360.0, ans=0.0 2023-10-04 07:30:33,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 07:30:34,359 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.65 vs. limit=15.0 2023-10-04 07:30:34,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:30:37,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:30:42,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:30:42,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 07:30:42,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:30:49,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:30:52,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:30:55,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:30:55,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:30:55,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 07:30:55,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:30:55,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:30:57,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:30:57,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:31:00,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1574493.3333333333, ans=0.0 2023-10-04 07:31:01,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:31:03,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:31:03,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 07:31:03,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 07:31:05,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:31:05,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:31:07,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 07:31:07,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 07:31:08,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 07:31:08,646 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 07:31:10,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 07:31:11,742 INFO [train.py:1046] (3/4) Epoch 45, batch 2450, loss[loss=0.1447, simple_loss=0.2156, pruned_loss=0.03686, over 23429.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2335, pruned_loss=0.03687, over 4683271.24 frames. ], batch size: 285, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:31:11,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:31:11,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:31:11,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:31:12,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1574560.0, ans=0.125 2023-10-04 07:31:13,820 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 07:31:15,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:31:15,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:31:16,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1574560.0, ans=0.125 2023-10-04 07:31:19,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:31:19,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:31:22,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:22,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:31:22,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 07:31:26,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:31:28,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:31,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:31:32,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:31:32,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:31:33,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 07:31:36,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:37,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:31:39,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:31:44,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:31:44,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:31:45,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:31:47,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:31:48,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 07:31:49,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:31:56,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:31:56,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1574760.0, ans=0.2 2023-10-04 07:31:57,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:31:57,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:31:58,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:31:58,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:32:00,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:32:02,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 07:32:04,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:32:04,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:32:07,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:32:07,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:32:13,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:32:13,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 07:32:15,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:32:15,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:32:15,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 07:32:17,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:32:18,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:32:21,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:32:22,523 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.042e+02 2.331e+02 2.732e+02 3.935e+02, threshold=4.662e+02, percent-clipped=0.0 2023-10-04 07:32:23,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:32:24,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:32:24,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1574826.6666666667, ans=0.125 2023-10-04 07:32:26,610 INFO [train.py:1046] (3/4) Epoch 45, batch 2500, loss[loss=0.1496, simple_loss=0.2248, pruned_loss=0.03721, over 23551.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2332, pruned_loss=0.03654, over 4691341.25 frames. ], batch size: 256, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:32:27,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1574893.3333333333, ans=0.125 2023-10-04 07:32:28,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 07:32:28,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:32:35,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:32:44,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:32:44,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:32:46,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:32:46,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 07:32:46,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=1574960.0, ans=0.5 2023-10-04 07:32:52,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:32:52,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:32:53,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 07:32:53,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:32:54,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 07:32:54,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:32:56,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:32:56,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 07:32:56,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:32:58,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 07:32:58,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:00,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:33:02,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:33:05,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:33:05,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 07:33:07,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:33:08,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:33:11,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:16,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:20,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:33:24,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:33:26,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 07:33:26,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:33:26,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:33:28,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:33:28,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:33:28,550 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.06 vs. limit=22.5 2023-10-04 07:33:29,474 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 07:33:29,474 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 07:33:29,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 07:33:32,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:33:33,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 07:33:33,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 07:33:35,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:33:35,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 07:33:38,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 07:33:40,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:33:41,301 INFO [train.py:1046] (3/4) Epoch 45, batch 2550, loss[loss=0.1614, simple_loss=0.239, pruned_loss=0.0419, over 23825.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2335, pruned_loss=0.03669, over 4698440.58 frames. ], batch size: 212, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:33:43,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:33:44,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:33:47,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:33:48,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 07:33:48,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:33:53,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 07:33:53,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:33:56,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:33:58,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:33:58,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 07:33:59,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:33:59,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:33:59,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:33:59,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1575293.3333333333, ans=0.125 2023-10-04 07:34:00,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:34:02,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 07:34:02,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 07:34:02,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:02,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 07:34:16,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:34:21,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:34:21,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:21,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:34:21,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1575360.0, ans=0.125 2023-10-04 07:34:22,435 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.40 vs. limit=15.0 2023-10-04 07:34:23,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:34:25,359 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.90 vs. limit=15.0 2023-10-04 07:34:29,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:34:30,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:34:30,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:34:31,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:34:32,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:34:32,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:34:35,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:34:35,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:38,302 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.59 vs. limit=15.0 2023-10-04 07:34:39,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:34:40,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 07:34:40,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:34:40,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:34:42,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:34:42,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:34:43,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:34:50,977 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.015e+02 2.178e+02 2.479e+02 3.759e+02, threshold=4.356e+02, percent-clipped=0.0 2023-10-04 07:34:51,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:34:52,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:34:52,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1575493.3333333333, ans=0.125 2023-10-04 07:34:55,042 INFO [train.py:1046] (3/4) Epoch 45, batch 2600, loss[loss=0.1436, simple_loss=0.2214, pruned_loss=0.03292, over 23629.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2342, pruned_loss=0.03659, over 4705187.75 frames. ], batch size: 149, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:34:55,203 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 07:34:58,696 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 07:34:58,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:35:00,059 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 07:35:00,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 07:35:00,167 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 07:35:00,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1575560.0, ans=0.125 2023-10-04 07:35:02,149 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.84 vs. limit=22.5 2023-10-04 07:35:04,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:35:04,338 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 07:35:05,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 07:35:05,769 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 07:35:06,376 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.87 vs. limit=15.0 2023-10-04 07:35:08,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:35:09,326 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.78 vs. limit=15.0 2023-10-04 07:35:10,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1575626.6666666667, ans=0.125 2023-10-04 07:35:11,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 07:35:11,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 07:35:13,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:35:14,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 07:35:16,560 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 07:35:16,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 07:35:24,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:35:24,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:35:25,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:35:25,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 07:35:27,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:35:29,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1575693.3333333333, ans=0.125 2023-10-04 07:35:31,970 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 07:35:33,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1575693.3333333333, ans=0.1 2023-10-04 07:35:37,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:35:38,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:35:38,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 07:35:38,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:35:38,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:35:40,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 07:35:45,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:35:45,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:35:46,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:35:50,530 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 07:35:50,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:35:51,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:35:59,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:35:59,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:35:59,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 07:36:00,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:36:01,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:36:03,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:36:03,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1575826.6666666667, ans=0.0 2023-10-04 07:36:08,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 07:36:10,054 INFO [train.py:1046] (3/4) Epoch 45, batch 2650, loss[loss=0.1685, simple_loss=0.2415, pruned_loss=0.04778, over 22717.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2347, pruned_loss=0.03714, over 4701836.92 frames. ], batch size: 322, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:36:10,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:36:10,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=1575893.3333333333, ans=0.5 2023-10-04 07:36:11,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:36:11,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1575893.3333333333, ans=0.0 2023-10-04 07:36:16,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 07:36:16,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:36:18,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:36:20,236 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 07:36:20,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:36:23,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:36:24,062 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.19 vs. limit=15.0 2023-10-04 07:36:25,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 07:36:27,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:36:29,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:36:31,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 07:36:31,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:36:31,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:36:33,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 07:36:36,014 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 07:36:37,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:36:38,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1576026.6666666667, ans=0.125 2023-10-04 07:36:40,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 07:36:40,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:36:40,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 07:36:42,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.87 vs. limit=15.0 2023-10-04 07:36:43,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1576026.6666666667, ans=0.1 2023-10-04 07:36:45,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:36:45,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:36:45,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:36:45,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:36:47,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1576026.6666666667, ans=0.1 2023-10-04 07:36:49,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 07:36:49,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 07:36:53,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:36:57,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 07:36:57,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:36:58,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:36:58,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:36:59,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:36:59,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:37:01,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:37:03,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:37:04,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:37:05,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:37:06,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:37:07,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:07,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:37:07,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:08,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:37:09,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.18 vs. limit=15.0 2023-10-04 07:37:10,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:37:14,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:14,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:37:15,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:15,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 07:37:19,696 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.024e+02 2.233e+02 2.497e+02 3.556e+02, threshold=4.465e+02, percent-clipped=0.0 2023-10-04 07:37:19,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:37:19,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1576160.0, ans=0.125 2023-10-04 07:37:21,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:22,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:24,511 INFO [train.py:1046] (3/4) Epoch 45, batch 2700, loss[loss=0.1586, simple_loss=0.226, pruned_loss=0.04558, over 23419.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.235, pruned_loss=0.03731, over 4716604.28 frames. ], batch size: 285, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:37:24,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:24,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:37:26,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:26,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1576226.6666666667, ans=0.125 2023-10-04 07:37:29,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:37:29,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 07:37:31,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:37:34,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 07:37:35,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:37:35,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:35,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:37:38,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:37:38,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:37:38,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:37:38,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 07:37:38,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 07:37:40,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:37:41,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:37:43,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:37:43,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:37:45,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:37:47,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 07:37:47,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:37:47,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1576293.3333333333, ans=0.125 2023-10-04 07:37:50,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:37:50,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:37:58,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:37:58,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:37:58,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:37:58,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:38:01,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:38:04,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:38:04,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:38:04,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:38:09,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:38:09,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:38:16,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:38:16,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:38:18,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1576426.6666666667, ans=0.0 2023-10-04 07:38:19,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:38:19,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:22,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:38:23,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:38:23,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:38:25,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:28,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:38:28,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:38:30,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:38:31,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:38:31,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:38:35,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 07:38:35,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:39,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:38:39,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 07:38:40,774 INFO [train.py:1046] (3/4) Epoch 45, batch 2750, loss[loss=0.1364, simple_loss=0.2118, pruned_loss=0.03051, over 23658.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2349, pruned_loss=0.03746, over 4699798.26 frames. ], batch size: 149, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:38:40,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 07:38:40,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:42,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:38:43,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:38:45,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:45,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:38:45,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:49,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:38:49,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:38:49,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:38:49,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:49,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 07:38:49,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:38:49,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:38:55,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 07:38:57,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:38:59,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:38:59,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1576626.6666666667, ans=0.2 2023-10-04 07:39:00,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:39:01,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 07:39:02,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:39:04,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:39:04,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:39:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:39:08,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 07:39:08,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 07:39:09,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:39:10,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:39:11,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:39:18,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:39:20,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:39:20,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:39:23,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1576760.0, ans=0.125 2023-10-04 07:39:25,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:39:25,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:39:26,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:39:31,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1576760.0, ans=0.125 2023-10-04 07:39:32,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:39:33,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1576760.0, ans=10.0 2023-10-04 07:39:34,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:39:34,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 07:39:34,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1576760.0, ans=0.0 2023-10-04 07:39:35,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.77 vs. limit=12.0 2023-10-04 07:39:38,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:39:41,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 07:39:46,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 07:39:48,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:39:48,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 07:39:50,134 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.025e+02 2.213e+02 2.495e+02 4.523e+02, threshold=4.427e+02, percent-clipped=1.0 2023-10-04 07:39:50,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:39:51,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:39:51,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 07:39:51,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1576826.6666666667, ans=0.2 2023-10-04 07:39:53,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:39:53,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1576893.3333333333, ans=0.125 2023-10-04 07:39:54,456 INFO [train.py:1046] (3/4) Epoch 45, batch 2800, loss[loss=0.1582, simple_loss=0.2355, pruned_loss=0.04044, over 23773.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2346, pruned_loss=0.03699, over 4707645.65 frames. ], batch size: 195, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 07:39:54,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 07:39:54,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1576893.3333333333, ans=0.125 2023-10-04 07:39:55,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:39:55,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:39:55,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 07:39:55,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:39:57,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:39:59,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:39:59,330 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 07:39:59,331 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 07:40:02,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1576893.3333333333, ans=0.125 2023-10-04 07:40:03,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:40:05,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:40:05,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:40:08,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:40:09,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 07:40:12,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 07:40:14,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 07:40:15,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:40:15,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:40:15,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:40:19,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:40:19,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:40:19,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:40:19,998 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:40:21,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:40:25,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1577026.6666666667, ans=0.05 2023-10-04 07:40:28,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:40:30,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:40:30,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1577026.6666666667, ans=0.04949747468305833 2023-10-04 07:40:32,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:40:34,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:40:34,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:40:39,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:40:39,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 07:40:39,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:40:40,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:40:40,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:40:45,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:40:45,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1577093.3333333333, ans=0.125 2023-10-04 07:40:46,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:40:50,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:40:52,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:40:52,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:40:52,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:40:52,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 07:40:52,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:40:53,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:40:55,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 07:40:55,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:40:56,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:40:56,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:40:59,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 07:40:59,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1577160.0, ans=0.125 2023-10-04 07:41:00,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:00,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:41:00,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:41:00,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1577160.0, ans=0.125 2023-10-04 07:41:01,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 07:41:08,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:41:08,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:41:08,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:41:09,408 INFO [train.py:1046] (3/4) Epoch 45, batch 2850, loss[loss=0.144, simple_loss=0.2216, pruned_loss=0.03322, over 23532.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2335, pruned_loss=0.03673, over 4702315.79 frames. ], batch size: 134, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 07:41:11,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:41:14,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:41:15,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:41:15,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:41:17,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:18,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:41:19,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:41:19,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 07:41:26,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 07:41:26,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:41:28,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1577293.3333333333, ans=0.125 2023-10-04 07:41:29,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 07:41:31,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:41:33,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 07:41:33,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 07:41:34,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:41:46,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:48,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:41:48,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:41:48,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 07:41:49,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:41:49,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:41:51,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:41:52,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 07:41:54,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:41:54,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:41:55,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:41:55,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:41:58,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:00,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:00,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:01,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:42:01,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:42:03,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:42:05,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:07,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:42:07,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1577426.6666666667, ans=15.0 2023-10-04 07:42:08,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1577493.3333333333, ans=0.0 2023-10-04 07:42:10,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:42:12,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 07:42:12,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 07:42:15,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 07:42:15,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:42:15,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 07:42:16,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:42:17,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:42:17,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:42:17,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:42:17,998 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 07:42:19,175 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.756e+02 2.081e+02 2.377e+02 2.931e+02 5.661e+02, threshold=4.754e+02, percent-clipped=4.0 2023-10-04 07:42:19,291 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 07:42:19,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:42:19,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:23,480 INFO [train.py:1046] (3/4) Epoch 45, batch 2900, loss[loss=0.1424, simple_loss=0.2161, pruned_loss=0.0344, over 24447.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2332, pruned_loss=0.03659, over 4686482.07 frames. ], batch size: 58, lr: 2.26e-03, grad_scale: 32.0 2023-10-04 07:42:23,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:42:23,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:42:24,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:42:25,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 07:42:29,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:42:29,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 07:42:29,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 07:42:30,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:42:32,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:42:33,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:33,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=1577560.0, ans=0.5 2023-10-04 07:42:34,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:42:39,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:42:39,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:42:42,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:42:42,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 07:42:42,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:42:45,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:42:46,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 07:42:46,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 07:42:50,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:42:50,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 07:42:50,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:42:53,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:42:53,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 07:42:54,251 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.36 vs. limit=15.0 2023-10-04 07:42:56,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:42:57,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:43:02,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:43:05,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:43:08,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 07:43:09,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 07:43:09,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:43:13,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:43:14,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 07:43:14,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:43:21,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:43:30,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:43:30,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 07:43:32,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 07:43:33,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:43:33,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 07:43:34,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:43:34,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:43:34,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1577826.6666666667, ans=0.1 2023-10-04 07:43:38,096 INFO [train.py:1046] (3/4) Epoch 45, batch 2950, loss[loss=0.1599, simple_loss=0.2335, pruned_loss=0.04314, over 23750.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2345, pruned_loss=0.03706, over 4695974.40 frames. ], batch size: 179, lr: 2.26e-03, grad_scale: 16.0 2023-10-04 07:43:39,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1577893.3333333333, ans=0.125 2023-10-04 07:43:42,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:43:42,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 07:43:44,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:43:44,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:43:44,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1577893.3333333333, ans=0.1 2023-10-04 07:43:47,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:43:47,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:43:49,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 07:43:49,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 07:43:50,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:43:50,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:43:57,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:43:58,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:44:00,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:44:01,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:44:03,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:44:03,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:44:06,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:44:06,928 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.42 vs. limit=15.0 2023-10-04 07:44:07,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:44:07,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:44:11,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 07:44:17,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 07:44:17,066 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 07:44:17,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:44:20,318 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 07:44:21,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 07:44:21,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:44:21,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1578093.3333333333, ans=0.125 2023-10-04 07:44:23,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:44:23,015 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 07:44:23,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 07:44:24,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 07:44:25,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:44:25,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 07:44:29,506 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.42 vs. limit=22.5 2023-10-04 07:44:30,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:44:31,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:44:31,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:44:32,773 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 07:44:32,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:44:32,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 07:44:38,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:44:40,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:44:40,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 07:44:41,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:44:43,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 07:44:46,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:44:48,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:44:48,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:44:49,415 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.943e+02 2.173e+02 2.663e+02 4.538e+02, threshold=4.346e+02, percent-clipped=0.0 2023-10-04 07:44:49,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:44:49,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:44:50,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:44:50,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:44:51,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:44:52,365 INFO [train.py:1046] (3/4) Epoch 45, batch 3000, loss[loss=0.1673, simple_loss=0.2352, pruned_loss=0.04974, over 23544.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2354, pruned_loss=0.03712, over 4704777.80 frames. ], batch size: 256, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:44:52,366 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 07:45:04,920 INFO [train.py:1078] (3/4) Epoch 45, validation: loss=0.3664, simple_loss=0.2817, pruned_loss=0.2256, over 1125622.00 frames. 2023-10-04 07:45:04,921 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21134MB 2023-10-04 07:45:04,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:45:05,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:45:07,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:45:08,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:45:09,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 07:45:11,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:45:14,019 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.88 vs. limit=22.5 2023-10-04 07:45:15,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:45:15,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:45:16,531 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 07:45:17,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 07:45:19,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:45:19,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:45:20,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 07:45:20,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:45:25,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:45:27,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1578293.3333333333, ans=0.1 2023-10-04 07:45:32,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.whiten.whitening_limit, batch_count=1578293.3333333333, ans=12.0 2023-10-04 07:45:35,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:45:40,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 07:45:40,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:45:43,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1578360.0, ans=0.2 2023-10-04 07:45:44,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:45:44,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:45:44,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:45:47,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:45:47,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 07:45:48,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 07:45:49,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:45:50,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:45:53,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:45:53,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:45:54,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:45:54,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:45:57,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:45:57,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:45:57,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:45:58,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:46:00,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 07:46:01,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:46:03,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:03,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:46:07,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:46:07,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:46:11,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 07:46:11,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 07:46:11,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:46:11,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 07:46:12,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:46:15,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 07:46:18,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:46:18,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 07:46:19,806 INFO [train.py:1046] (3/4) Epoch 45, batch 3050, loss[loss=0.1433, simple_loss=0.224, pruned_loss=0.03135, over 24450.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2362, pruned_loss=0.03774, over 4696820.72 frames. ], batch size: 58, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:46:19,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 07:46:19,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 07:46:19,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:46:21,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:46:21,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:46:21,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:46:21,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:22,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:46:25,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 07:46:28,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:46:28,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1578560.0, ans=0.0 2023-10-04 07:46:31,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:46:31,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:46:34,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:36,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 07:46:42,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 07:46:43,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 07:46:43,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:46:44,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1578626.6666666667, ans=0.125 2023-10-04 07:46:47,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:46:52,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:52,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:46:52,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:46:54,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:46:54,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:46:56,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:46:56,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:46:56,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:46:57,289 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.98 vs. limit=15.0 2023-10-04 07:46:58,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:46:58,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.76 vs. limit=12.0 2023-10-04 07:46:59,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:02,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:47:02,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 07:47:02,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:47:02,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 07:47:02,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1578760.0, ans=0.0 2023-10-04 07:47:06,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:47:06,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 07:47:08,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:47:08,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:14,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:47:14,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:20,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:20,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:47:20,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:47:21,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:47:21,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 07:47:23,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:47:23,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 07:47:24,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:47:24,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:24,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1578826.6666666667, ans=0.2 2023-10-04 07:47:26,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 07:47:27,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:30,496 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 2.029e+02 2.330e+02 2.767e+02 4.279e+02, threshold=4.661e+02, percent-clipped=0.0 2023-10-04 07:47:32,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:47:33,800 INFO [train.py:1046] (3/4) Epoch 45, batch 3100, loss[loss=0.1442, simple_loss=0.2304, pruned_loss=0.02899, over 24487.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2364, pruned_loss=0.0378, over 4699980.94 frames. ], batch size: 63, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:47:33,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:47:35,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 07:47:38,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 07:47:41,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 07:47:42,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 07:47:45,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:47:48,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:47:48,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:47:50,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1578960.0, ans=0.125 2023-10-04 07:47:50,891 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.56 vs. limit=15.0 2023-10-04 07:47:51,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:47:55,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:48:01,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 07:48:06,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 07:48:06,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:06,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:48:06,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:48:06,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 07:48:06,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1579026.6666666667, ans=0.0 2023-10-04 07:48:08,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:48:08,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 07:48:08,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:48:10,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:48:11,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 07:48:13,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:48:16,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:48:16,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 07:48:18,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 07:48:18,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:19,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.20 vs. limit=15.0 2023-10-04 07:48:19,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:48:21,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:48:21,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:21,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:48:22,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:48:22,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:48:25,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:48:25,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:48:25,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:25,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 07:48:30,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:48:31,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 07:48:34,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:48:34,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 07:48:34,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:48:36,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:36,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 07:48:45,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 07:48:45,825 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.97 vs. limit=15.0 2023-10-04 07:48:48,176 INFO [train.py:1046] (3/4) Epoch 45, batch 3150, loss[loss=0.1445, simple_loss=0.2077, pruned_loss=0.04069, over 22689.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2353, pruned_loss=0.03733, over 4710882.55 frames. ], batch size: 322, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:48:48,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:48:48,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:48:51,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:48:51,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:48:52,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 07:48:52,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:48:52,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 07:48:53,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 07:48:55,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:48:56,967 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 07:48:59,299 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.91 vs. limit=22.5 2023-10-04 07:49:01,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 07:49:01,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:49:03,338 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 07:49:03,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 07:49:06,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 07:49:06,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 07:49:06,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 07:49:06,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:49:06,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:49:07,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:49:10,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 07:49:11,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:49:11,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:49:13,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:49:14,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 07:49:20,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 07:49:20,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:49:22,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 07:49:23,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:49:23,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 07:49:26,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 07:49:27,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:49:27,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 07:49:27,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 07:49:29,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:49:29,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:49:31,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 07:49:31,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 07:49:33,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 07:49:33,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 07:49:33,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:34,037 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.10 vs. limit=15.0 2023-10-04 07:49:35,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:49:35,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:49:35,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 07:49:35,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:49:36,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.44 vs. limit=15.0 2023-10-04 07:49:37,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1579426.6666666667, ans=0.0 2023-10-04 07:49:38,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 07:49:38,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:40,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 07:49:40,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 07:49:41,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:49:41,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:49:41,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 07:49:44,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 07:49:44,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:49:47,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:49:49,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:50,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:49:53,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:49:55,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:49:56,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1579493.3333333333, ans=0.1 2023-10-04 07:49:57,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 07:49:59,213 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.953e+02 2.189e+02 2.513e+02 3.532e+02, threshold=4.378e+02, percent-clipped=0.0 2023-10-04 07:50:02,612 INFO [train.py:1046] (3/4) Epoch 45, batch 3200, loss[loss=0.1653, simple_loss=0.2378, pruned_loss=0.04635, over 23831.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2343, pruned_loss=0.03699, over 4700546.53 frames. ], batch size: 179, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:50:04,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:50:04,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 07:50:08,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:50:10,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:50:10,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 07:50:10,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:50:14,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:50:17,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:50:25,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:50:34,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1579693.3333333333, ans=0.0 2023-10-04 07:50:35,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 07:50:35,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:50:39,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 07:50:40,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:50:44,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:50:44,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:50:44,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:50:48,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 07:50:50,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 07:50:52,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 07:50:54,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 07:50:55,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:51:01,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:51:01,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 07:51:01,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:51:02,514 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 07:51:02,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 07:51:07,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:51:08,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 07:51:08,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1579826.6666666667, ans=0.1 2023-10-04 07:51:09,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 07:51:10,493 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.18 vs. limit=15.0 2023-10-04 07:51:11,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 07:51:12,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 07:51:15,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:51:17,266 INFO [train.py:1046] (3/4) Epoch 45, batch 3250, loss[loss=0.1623, simple_loss=0.2467, pruned_loss=0.03898, over 24513.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2345, pruned_loss=0.03677, over 4706381.34 frames. ], batch size: 66, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:51:17,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:51:17,320 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 07:51:17,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:51:17,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:18,718 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 07:51:22,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 07:51:24,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:51:28,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1579893.3333333333, ans=0.0 2023-10-04 07:51:32,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1579960.0, ans=0.0 2023-10-04 07:51:33,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:51:33,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 07:51:35,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:51:36,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:51:36,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:51:36,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:51:38,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 07:51:41,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:41,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 07:51:41,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:51:41,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:41,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:42,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:51:44,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:51:45,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 07:51:45,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1580026.6666666667, ans=0.0 2023-10-04 07:51:48,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:51:48,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:51:50,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:51:51,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:51:51,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:51:56,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 07:51:56,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:51:57,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:51:59,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:51:59,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 07:52:05,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:52:12,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:52:12,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:12,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 07:52:12,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:52:12,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:52:14,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:17,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 07:52:17,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 07:52:18,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:52:19,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:52:21,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:52:22,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 07:52:22,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:52:25,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:52:27,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:52:28,382 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.929e+02 2.141e+02 2.387e+02 3.299e+02, threshold=4.283e+02, percent-clipped=0.0 2023-10-04 07:52:28,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 07:52:28,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:52:31,157 INFO [train.py:1046] (3/4) Epoch 45, batch 3300, loss[loss=0.1489, simple_loss=0.2296, pruned_loss=0.03407, over 23547.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.235, pruned_loss=0.03669, over 4710497.94 frames. ], batch size: 134, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:52:31,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 07:52:31,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 07:52:31,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1580226.6666666667, ans=0.2 2023-10-04 07:52:34,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:52:34,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 07:52:37,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 07:52:38,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 07:52:38,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:52:41,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:52:41,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:52:41,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:43,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1580226.6666666667, ans=0.1 2023-10-04 07:52:45,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 07:52:45,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 07:52:46,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:52:47,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:52:51,373 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.15 vs. limit=15.0 2023-10-04 07:52:52,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 07:52:52,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:52:52,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:52:53,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:52:53,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1580293.3333333333, ans=0.0 2023-10-04 07:52:54,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.81 vs. limit=6.0 2023-10-04 07:52:55,437 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 07:52:55,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:52:55,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:52:55,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 07:52:55,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:52:56,938 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 07:52:58,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:52:59,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 07:53:01,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:01,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 07:53:03,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 07:53:03,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:04,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:53:07,245 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 07:53:08,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 07:53:08,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:53:11,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 07:53:13,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:53:16,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 07:53:17,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:53:19,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:53:20,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:53:20,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:53:20,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:53:21,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:53:21,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:23,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:53:23,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1580426.6666666667, ans=0.0 2023-10-04 07:53:25,095 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 07:53:26,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 07:53:27,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 07:53:30,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:53:30,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:53:32,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:53:32,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:53:32,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:53:32,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:53:34,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 07:53:34,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:53:36,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 07:53:39,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 07:53:39,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:53:39,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:53:41,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 07:53:42,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:53:42,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:53:45,548 INFO [train.py:1046] (3/4) Epoch 45, batch 3350, loss[loss=0.1574, simple_loss=0.2338, pruned_loss=0.04056, over 23555.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2361, pruned_loss=0.03721, over 4708988.07 frames. ], batch size: 256, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:53:45,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:53:45,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:53:48,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:53:49,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:53:51,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:53:51,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=1580560.0, ans=15.0 2023-10-04 07:53:54,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:53:55,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:53:57,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:53:58,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:54:00,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 07:54:02,132 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 07:54:02,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:54:06,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 07:54:06,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 07:54:08,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 07:54:08,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:54:08,597 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 07:54:09,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:10,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 07:54:10,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:11,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:54:15,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:16,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:17,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:17,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:54:21,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:23,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:23,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:27,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.05 vs. limit=15.0 2023-10-04 07:54:27,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:54:28,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:54:30,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:30,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:32,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:34,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 07:54:35,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 07:54:35,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 07:54:35,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:54:37,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 07:54:38,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:40,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:54:40,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1580760.0, ans=0.125 2023-10-04 07:54:45,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:54:46,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 07:54:46,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:54:47,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 07:54:49,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:54:54,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:54:55,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 07:54:57,148 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.935e+02 2.206e+02 2.412e+02 3.759e+02, threshold=4.413e+02, percent-clipped=0.0 2023-10-04 07:54:57,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 07:54:57,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:54:59,080 INFO [train.py:1046] (3/4) Epoch 45, batch 3400, loss[loss=0.2, simple_loss=0.2727, pruned_loss=0.06361, over 19619.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.237, pruned_loss=0.03722, over 4722294.87 frames. ], batch size: 389, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:54:59,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:54:59,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 07:55:00,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:55:00,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 07:55:01,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:55:02,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:55:03,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 07:55:05,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:55:05,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 07:55:05,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1580893.3333333333, ans=0.125 2023-10-04 07:55:09,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 07:55:09,788 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 07:55:09,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:55:13,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:55:13,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 07:55:14,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:55:16,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 07:55:17,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1580960.0, ans=0.0 2023-10-04 07:55:20,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:55:21,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 07:55:25,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:55:28,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:55:28,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:55:30,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 07:55:35,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:55:39,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 07:55:43,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:55:45,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:55:45,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 07:55:46,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:55:46,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:55:47,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:55:48,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:55:52,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:55:55,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 07:55:55,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:56:00,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:56:02,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 07:56:10,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:56:13,666 INFO [train.py:1046] (3/4) Epoch 45, batch 3450, loss[loss=0.1351, simple_loss=0.2224, pruned_loss=0.02386, over 24663.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2369, pruned_loss=0.03711, over 4737023.33 frames. ], batch size: 65, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:56:15,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 07:56:17,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 07:56:17,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:56:18,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1581226.6666666667, ans=0.125 2023-10-04 07:56:19,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 07:56:19,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 07:56:20,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:56:25,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 07:56:28,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:56:28,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:56:28,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1581293.3333333333, ans=15.0 2023-10-04 07:56:29,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 07:56:29,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:56:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:56:37,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 07:56:43,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 07:56:44,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 07:56:44,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 07:56:46,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:56:49,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 07:56:50,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:56:55,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:56:56,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:56:57,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 07:56:57,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 07:56:59,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 07:56:59,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:57:01,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:57:05,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:57:07,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 07:57:07,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1581426.6666666667, ans=0.05 2023-10-04 07:57:12,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 07:57:14,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 07:57:16,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:19,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:57:23,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:23,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 07:57:24,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:57:24,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:57:26,408 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 1.981e+02 2.112e+02 2.378e+02 3.937e+02, threshold=4.224e+02, percent-clipped=0.0 2023-10-04 07:57:27,804 INFO [train.py:1046] (3/4) Epoch 45, batch 3500, loss[loss=0.1516, simple_loss=0.2371, pruned_loss=0.03311, over 24505.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2354, pruned_loss=0.0369, over 4732396.08 frames. ], batch size: 66, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:57:29,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:57:32,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:57:33,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 07:57:35,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 07:57:37,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1581560.0, ans=0.025 2023-10-04 07:57:38,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 07:57:41,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 07:57:41,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 07:57:48,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 07:57:48,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:57:48,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 07:57:50,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:57:50,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 07:57:50,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:51,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:57:51,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 07:57:52,044 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.72 vs. limit=10.0 2023-10-04 07:57:54,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:54,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 07:57:55,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:57:58,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:57:58,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 07:58:00,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 07:58:03,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:58:05,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 07:58:07,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:09,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 07:58:11,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:58:12,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 07:58:13,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 07:58:13,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 07:58:14,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 07:58:16,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:16,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:58:16,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 07:58:19,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1581760.0, ans=0.2 2023-10-04 07:58:20,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 07:58:20,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 07:58:25,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:58:25,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 07:58:25,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 07:58:25,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:58:26,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:58:27,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:58:29,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:29,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1581826.6666666667, ans=0.125 2023-10-04 07:58:30,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 07:58:32,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 07:58:32,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 07:58:34,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 07:58:35,217 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.02 vs. limit=15.0 2023-10-04 07:58:38,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 07:58:40,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:58:42,164 INFO [train.py:1046] (3/4) Epoch 45, batch 3550, loss[loss=0.1616, simple_loss=0.237, pruned_loss=0.04309, over 23823.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2333, pruned_loss=0.03654, over 4720316.72 frames. ], batch size: 179, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 07:58:42,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:58:42,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:58:42,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:58:45,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 07:58:48,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1581893.3333333333, ans=0.125 2023-10-04 07:58:52,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:58:53,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 07:58:57,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:58:57,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 07:58:59,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:00,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:59:00,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 07:59:04,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:59:04,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 07:59:05,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:59:05,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 07:59:05,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 07:59:06,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=1581960.0, ans=22.5 2023-10-04 07:59:07,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1581960.0, ans=0.05 2023-10-04 07:59:10,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 07:59:10,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 07:59:12,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:59:12,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 07:59:13,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 07:59:13,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 07:59:13,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:15,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:16,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 07:59:20,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:59:21,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 07:59:23,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 07:59:23,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1582026.6666666667, ans=0.0 2023-10-04 07:59:24,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 07:59:24,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 07:59:28,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 07:59:28,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 07:59:28,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1582093.3333333333, ans=0.1 2023-10-04 07:59:29,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 07:59:31,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 07:59:34,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 07:59:35,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:59:42,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 07:59:42,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 07:59:43,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:59:48,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 07:59:49,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 07:59:54,812 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.938e+02 2.121e+02 2.490e+02 3.705e+02, threshold=4.241e+02, percent-clipped=0.0 2023-10-04 07:59:54,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 07:59:54,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 07:59:55,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1582226.6666666667, ans=0.1 2023-10-04 07:59:56,216 INFO [train.py:1046] (3/4) Epoch 45, batch 3600, loss[loss=0.1579, simple_loss=0.2353, pruned_loss=0.04029, over 23858.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2332, pruned_loss=0.03636, over 4713417.32 frames. ], batch size: 195, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 07:59:56,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 07:59:57,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1582226.6666666667, ans=0.125 2023-10-04 07:59:58,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 07:59:59,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:00:00,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:00:04,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:00:08,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:09,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:00:09,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:00:11,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:11,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 08:00:15,574 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.34 vs. limit=15.0 2023-10-04 08:00:16,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:00:17,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:20,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:00:23,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:00:23,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:00:23,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:00:23,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 08:00:24,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:00:26,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:00:27,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:00:30,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:00:31,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:00:32,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:00:34,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 08:00:42,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:00:42,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1582426.6666666667, ans=0.2 2023-10-04 08:00:43,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:00:45,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 08:00:48,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:00:51,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:00:55,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:01:00,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:01:00,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:01:00,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 08:01:02,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 08:01:04,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 08:01:06,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:01:06,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:01:07,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 08:01:07,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:01:07,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:01:07,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:01:10,165 INFO [train.py:1046] (3/4) Epoch 45, batch 3650, loss[loss=0.2028, simple_loss=0.2675, pruned_loss=0.06901, over 19658.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.234, pruned_loss=0.03649, over 4723357.30 frames. ], batch size: 388, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 08:01:10,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 08:01:10,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 08:01:12,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1582560.0, ans=0.0 2023-10-04 08:01:15,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:01:15,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 08:01:20,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 08:01:21,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:01:24,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 08:01:25,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 08:01:30,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:01:30,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:01:31,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:01:34,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 08:01:34,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:01:35,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 08:01:35,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:01:35,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:01:35,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 08:01:37,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:01:37,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:01:37,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:01:41,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:01:42,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 08:01:44,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 08:01:45,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:01:47,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 08:01:49,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:01:49,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:01:53,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:01:54,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:01:54,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:01:56,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:01:57,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:02:00,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:02:01,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1582760.0, ans=0.125 2023-10-04 08:02:04,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:02:05,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:05,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:02:07,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:02:07,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:02:07,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1582760.0, ans=0.125 2023-10-04 08:02:08,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:02:12,858 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.05 vs. limit=10.0 2023-10-04 08:02:15,054 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 08:02:20,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:02:20,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:02:21,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:02:21,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:02:21,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:02:21,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1582826.6666666667, ans=0.1 2023-10-04 08:02:24,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:25,455 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.054e+02 2.374e+02 2.935e+02 4.345e+02, threshold=4.749e+02, percent-clipped=1.0 2023-10-04 08:02:25,483 INFO [train.py:1046] (3/4) Epoch 45, batch 3700, loss[loss=0.1399, simple_loss=0.2146, pruned_loss=0.03263, over 23688.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2348, pruned_loss=0.03682, over 4721466.81 frames. ], batch size: 232, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:02:26,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 08:02:26,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:02:29,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:02:31,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:02:31,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:02:33,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:33,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 08:02:33,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:02:35,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:02:35,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:02:36,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:02:40,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:02:40,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:02:41,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:02:41,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:02:41,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:02:42,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1582960.0, ans=0.04949747468305833 2023-10-04 08:02:44,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:02:46,345 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 08:02:52,897 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:02:55,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:02:55,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:02:56,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:02:56,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 08:02:56,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:02:57,429 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.46 vs. limit=15.0 2023-10-04 08:03:00,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:02,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 08:03:02,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:03,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:03:06,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:06,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:03:09,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:03:14,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:03:14,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 08:03:14,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:03:14,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 08:03:19,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1583093.3333333333, ans=0.125 2023-10-04 08:03:20,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:03:21,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:03:24,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:03:24,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 08:03:26,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:03:26,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 08:03:26,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:03:27,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:03:29,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:03:29,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 08:03:29,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1583160.0, ans=0.125 2023-10-04 08:03:30,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 08:03:31,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:03:31,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:03:33,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:03:33,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1583160.0, ans=0.125 2023-10-04 08:03:35,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:03:38,466 INFO [train.py:1046] (3/4) Epoch 45, batch 3750, loss[loss=0.1529, simple_loss=0.2337, pruned_loss=0.036, over 24567.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2356, pruned_loss=0.03712, over 4729998.77 frames. ], batch size: 60, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:03:38,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:03:40,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:03:41,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:03:42,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=3.76 vs. limit=10.0 2023-10-04 08:03:43,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 08:03:44,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 08:03:47,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:03:47,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 08:03:49,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:03:51,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:03:54,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:03:54,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:03:58,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:04:00,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:04:01,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:04:02,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:04:03,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1583293.3333333333, ans=0.0 2023-10-04 08:04:04,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1583293.3333333333, ans=0.125 2023-10-04 08:04:05,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:04:06,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 08:04:08,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:04:09,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:04:09,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:04:15,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 08:04:17,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 08:04:19,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:04:19,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:04:21,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:04:25,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:04:26,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 08:04:28,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 08:04:28,990 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.39 vs. limit=22.5 2023-10-04 08:04:32,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:04:33,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1583426.6666666667, ans=0.5 2023-10-04 08:04:34,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:04:36,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:04:39,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:04:39,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1583493.3333333333, ans=0.125 2023-10-04 08:04:43,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:04:45,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:04:47,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:04:49,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:04:51,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:04:52,545 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.048e+02 2.267e+02 2.690e+02 4.764e+02, threshold=4.534e+02, percent-clipped=1.0 2023-10-04 08:04:52,571 INFO [train.py:1046] (3/4) Epoch 45, batch 3800, loss[loss=0.161, simple_loss=0.2439, pruned_loss=0.03908, over 24462.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2363, pruned_loss=0.03653, over 4742269.47 frames. ], batch size: 66, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:04:56,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1583560.0, ans=0.125 2023-10-04 08:04:58,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:05:02,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:02,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 08:05:04,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 08:05:05,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:05:07,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:05:08,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 08:05:10,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 08:05:10,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:11,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:05:13,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:05:13,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:05:13,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:05:14,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 08:05:16,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1583626.6666666667, ans=0.125 2023-10-04 08:05:18,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 08:05:18,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:05:18,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1583626.6666666667, ans=0.125 2023-10-04 08:05:20,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1583693.3333333333, ans=0.125 2023-10-04 08:05:21,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:05:24,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:05:25,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:05:26,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 08:05:26,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:05:26,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1583693.3333333333, ans=0.1 2023-10-04 08:05:28,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:29,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:05:32,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 08:05:33,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 08:05:35,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:05:42,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:05:42,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1583760.0, ans=0.1 2023-10-04 08:05:46,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:05:48,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 08:05:50,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 08:05:51,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:05:53,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:05:53,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:05:56,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 08:05:59,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 08:05:59,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 08:05:59,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:06:00,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:06:03,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1583826.6666666667, ans=0.125 2023-10-04 08:06:06,071 INFO [train.py:1046] (3/4) Epoch 45, batch 3850, loss[loss=0.1638, simple_loss=0.2362, pruned_loss=0.04573, over 23823.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2351, pruned_loss=0.03649, over 4717728.65 frames. ], batch size: 179, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:06:06,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:06:07,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:06:10,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1583893.3333333333, ans=0.2 2023-10-04 08:06:11,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:06:11,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 08:06:13,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:06:15,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:06:19,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:06:21,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:06:24,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 08:06:24,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 08:06:31,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:32,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:06:33,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:06:35,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:06:35,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1584026.6666666667, ans=0.2 2023-10-04 08:06:38,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:38,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:06:38,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:06:38,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:06:39,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:06:42,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:06:42,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1584026.6666666667, ans=0.1 2023-10-04 08:06:45,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:45,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:06:45,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 08:06:45,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 08:06:46,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:06:46,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:47,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1584026.6666666667, ans=0.125 2023-10-04 08:06:48,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:06:50,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:06:50,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 08:06:53,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 08:06:53,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:06:55,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 08:06:57,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 08:07:03,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:07:04,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:07:08,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:07:08,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 08:07:12,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 08:07:12,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:13,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:16,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:07:16,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:07:17,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.26 vs. limit=12.0 2023-10-04 08:07:17,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:18,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1584160.0, ans=0.125 2023-10-04 08:07:19,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:19,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:07:19,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 08:07:21,120 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.982e+02 2.141e+02 2.479e+02 3.654e+02, threshold=4.281e+02, percent-clipped=0.0 2023-10-04 08:07:21,147 INFO [train.py:1046] (3/4) Epoch 45, batch 3900, loss[loss=0.1591, simple_loss=0.2407, pruned_loss=0.03872, over 23196.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2345, pruned_loss=0.03644, over 4717470.14 frames. ], batch size: 105, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:07:21,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:07:21,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 08:07:21,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:21,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:24,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:07:24,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:26,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:07:26,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:07:26,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:07:27,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:07:27,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 08:07:29,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:32,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:07:33,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:07:33,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:07:33,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:07:37,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:07:37,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:38,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:07:40,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 08:07:40,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:07:41,270 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=15.49 vs. limit=15.0 2023-10-04 08:07:42,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 08:07:43,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:07:44,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 08:07:46,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 08:07:49,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:07:51,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:07:51,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:07:52,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:07:56,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:07:58,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:08:01,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:08:01,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:08:02,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:08:06,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:08:06,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:08:15,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:08:17,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:08:23,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:08:26,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:08:28,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 08:08:28,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 08:08:28,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:08:29,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 08:08:31,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:08:31,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 08:08:35,211 INFO [train.py:1046] (3/4) Epoch 45, batch 3950, loss[loss=0.1467, simple_loss=0.2331, pruned_loss=0.03012, over 24611.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2333, pruned_loss=0.0362, over 4713851.72 frames. ], batch size: 68, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:08:36,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:08:38,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 08:08:38,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:08:40,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:08:42,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:08:47,053 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 08:08:47,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:08:48,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 08:08:48,512 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 08:08:50,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:08:51,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1584626.6666666667, ans=0.09899494936611666 2023-10-04 08:08:52,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1584626.6666666667, ans=0.125 2023-10-04 08:08:55,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:08:55,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:08:55,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:08:57,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 08:09:01,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:09:02,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:09:02,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:09:02,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:09:02,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:09:05,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1584693.3333333333, ans=0.0 2023-10-04 08:09:12,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:09:12,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:09:17,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 08:09:23,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 08:09:23,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 08:09:23,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:09:24,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:09:26,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1584760.0, ans=0.0 2023-10-04 08:09:32,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:09:32,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:09:32,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:09:33,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:09:33,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 08:09:38,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:09:39,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:09:44,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 08:09:48,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1584893.3333333333, ans=0.0 2023-10-04 08:09:49,586 INFO [train.py:1046] (3/4) Epoch 45, batch 4000, loss[loss=0.1624, simple_loss=0.2498, pruned_loss=0.03752, over 24086.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2349, pruned_loss=0.03669, over 4712919.50 frames. ], batch size: 80, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:09:51,381 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.026e+02 2.265e+02 2.595e+02 5.973e+02, threshold=4.529e+02, percent-clipped=1.0 2023-10-04 08:09:54,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:10:00,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:10:06,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:10:06,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:10:06,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1584960.0, ans=0.0 2023-10-04 08:10:08,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:10:08,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 08:10:09,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:10:09,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 08:10:09,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:10:09,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 08:10:12,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:10:13,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:10:15,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:10:15,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:10:15,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:10:15,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:10:15,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1584960.0, ans=0.0 2023-10-04 08:10:16,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:10:19,726 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 08:10:19,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:10:19,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:10:23,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1585026.6666666667, ans=0.1 2023-10-04 08:10:24,334 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 08:10:24,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:10:24,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:10:29,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 08:10:31,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:10:34,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:10:35,857 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 08:10:37,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:10:37,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 08:10:37,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:10:38,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:10:40,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:10:40,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:10:41,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:10:41,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:10:42,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 08:10:42,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:10:44,396 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 08:10:49,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:10:51,436 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.28 vs. limit=15.0 2023-10-04 08:10:51,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 08:10:55,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:10:55,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:10:55,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:10:56,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:10:59,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:10:59,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1585160.0, ans=0.125 2023-10-04 08:11:03,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1585226.6666666667, ans=0.0 2023-10-04 08:11:03,978 INFO [train.py:1046] (3/4) Epoch 45, batch 4050, loss[loss=0.1594, simple_loss=0.2342, pruned_loss=0.04231, over 23824.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2356, pruned_loss=0.03705, over 4708567.01 frames. ], batch size: 164, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:11:04,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 08:11:04,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 08:11:06,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:11:06,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:11:07,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:11:08,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:11:10,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:11:12,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:11:15,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:11:17,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 08:11:18,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:11:18,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:11:22,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:11:23,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:11:26,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1585293.3333333333, ans=0.125 2023-10-04 08:11:27,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 08:11:29,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 08:11:29,308 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 08:11:30,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1585293.3333333333, ans=0.0 2023-10-04 08:11:32,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:11:38,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 08:11:38,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:11:41,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1585360.0, ans=0.125 2023-10-04 08:11:42,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:11:45,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:11:45,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:11:45,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:11:47,354 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.26 vs. limit=6.0 2023-10-04 08:11:50,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:11:51,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 08:11:53,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:11:54,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:11:56,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 08:11:58,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:11:59,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1585426.6666666667, ans=0.125 2023-10-04 08:12:08,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 08:12:08,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:12:08,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:12:09,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 08:12:09,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 08:12:09,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:12:12,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:12:12,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:12,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1585493.3333333333, ans=0.125 2023-10-04 08:12:13,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:12:15,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1585493.3333333333, ans=0.1 2023-10-04 08:12:17,831 INFO [train.py:1046] (3/4) Epoch 45, batch 4100, loss[loss=0.1519, simple_loss=0.2359, pruned_loss=0.03394, over 23342.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2363, pruned_loss=0.03761, over 4706306.99 frames. ], batch size: 119, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:12:20,952 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 1.978e+02 2.170e+02 2.458e+02 4.039e+02, threshold=4.339e+02, percent-clipped=0.0 2023-10-04 08:12:21,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 08:12:22,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 08:12:24,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 08:12:25,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 08:12:25,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:12:25,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:27,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:27,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:12:27,419 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 08:12:31,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:12:32,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:12:32,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:12:32,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:12:38,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:12:39,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:12:40,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:12:40,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 08:12:40,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:12:40,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:12:42,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:12:42,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:12:42,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 08:12:45,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:12:45,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 08:12:47,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:12:49,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:12:49,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 08:12:51,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:12:52,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:12:52,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:12:56,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 08:12:58,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:12:59,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:13:00,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 08:13:00,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:13:00,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:13:05,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:13:11,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:13:11,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1585760.0, ans=0.125 2023-10-04 08:13:14,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:13:15,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:13:21,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:13:21,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:13:26,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:13:26,833 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:13:27,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:13:30,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:13:30,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:13:32,187 INFO [train.py:1046] (3/4) Epoch 45, batch 4150, loss[loss=0.1559, simple_loss=0.2245, pruned_loss=0.04371, over 19552.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2361, pruned_loss=0.03722, over 4718230.23 frames. ], batch size: 388, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:13:32,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:13:32,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:13:33,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 08:13:35,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:13:35,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 08:13:36,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 08:13:36,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 08:13:38,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:13:43,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:13:43,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:13:47,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:13:48,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:13:48,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:13:51,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:13:51,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:13:53,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:13:57,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:14:00,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:14:02,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 08:14:04,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 08:14:04,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:14:07,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 08:14:07,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:14:07,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:14:11,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:12,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:14:15,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 08:14:18,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:14:18,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1586093.3333333333, ans=0.0 2023-10-04 08:14:20,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:14:21,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 08:14:22,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:14:22,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 08:14:24,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:14:26,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:14:26,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:26,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 08:14:26,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:14:26,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 08:14:29,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:14:31,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 08:14:31,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:31,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:14:31,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:14:33,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 08:14:34,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:14:34,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 08:14:34,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:14:35,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:14:37,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 08:14:37,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:14:38,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1586160.0, ans=0.2 2023-10-04 08:14:41,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:14:44,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 08:14:46,524 INFO [train.py:1046] (3/4) Epoch 45, batch 4200, loss[loss=0.1643, simple_loss=0.2601, pruned_loss=0.03426, over 24549.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2349, pruned_loss=0.03714, over 4720422.37 frames. ], batch size: 71, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:14:46,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:14:49,037 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.036e+02 2.349e+02 2.806e+02 3.824e+02, threshold=4.697e+02, percent-clipped=0.0 2023-10-04 08:14:49,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:14:49,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:14:50,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:14:50,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:14:53,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 08:14:55,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1586226.6666666667, ans=0.2 2023-10-04 08:14:57,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 08:14:58,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:00,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:15:00,643 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.73 vs. limit=10.0 2023-10-04 08:15:02,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:15:05,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 08:15:08,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:15:08,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:10,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 08:15:10,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:15:11,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:11,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:15:13,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:15:13,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:15:13,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1586293.3333333333, ans=0.0 2023-10-04 08:15:14,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 08:15:14,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:15:14,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1586360.0, ans=0.0 2023-10-04 08:15:19,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 08:15:20,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:15:22,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:15:23,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:15:28,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:15:28,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 08:15:28,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:15:29,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:15:35,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:15:35,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:15:39,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:15:43,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 08:15:45,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:15:49,041 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:15:50,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:15:51,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:15:54,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 08:15:59,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:16:01,106 INFO [train.py:1046] (3/4) Epoch 45, batch 4250, loss[loss=0.1709, simple_loss=0.2564, pruned_loss=0.04267, over 24010.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2337, pruned_loss=0.03666, over 4717171.14 frames. ], batch size: 86, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:16:02,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1586560.0, ans=0.0 2023-10-04 08:16:04,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:16:04,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:16:04,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1586560.0, ans=0.125 2023-10-04 08:16:05,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:16:05,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1586560.0, ans=0.0 2023-10-04 08:16:10,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:16:10,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 08:16:11,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:16:12,726 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.90 vs. limit=10.0 2023-10-04 08:16:14,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:16:18,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:16:19,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1586626.6666666667, ans=0.0 2023-10-04 08:16:20,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1586626.6666666667, ans=0.1 2023-10-04 08:16:23,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:23,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:25,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:16:25,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:16:27,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:28,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:30,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:33,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:16:34,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:16:36,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 08:16:39,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 08:16:39,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:40,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:16:40,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:16:43,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:16:43,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:16:43,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:16:46,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:16:48,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:16:48,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1586760.0, ans=0.1 2023-10-04 08:16:50,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:16:53,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:16:53,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 08:16:53,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:16:55,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 08:16:55,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:16:55,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1586760.0, ans=0.035 2023-10-04 08:16:56,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:17:00,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:17:00,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:17:01,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 08:17:03,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:17:04,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:17:08,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:17:11,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:17:11,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:17:13,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:17:15,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:17:16,424 INFO [train.py:1046] (3/4) Epoch 45, batch 4300, loss[loss=0.1512, simple_loss=0.2429, pruned_loss=0.02969, over 24299.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2331, pruned_loss=0.03654, over 4716570.63 frames. ], batch size: 74, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:17:16,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:17:16,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:17:16,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 08:17:18,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:17:19,313 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.964e+02 2.175e+02 2.393e+02 4.014e+02, threshold=4.349e+02, percent-clipped=0.0 2023-10-04 08:17:22,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:17:22,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:17:28,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:17:34,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:17:34,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 08:17:34,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:17:37,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:17:37,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:17:37,643 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 08:17:39,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1586960.0, ans=0.0 2023-10-04 08:17:40,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.52 vs. limit=22.5 2023-10-04 08:17:41,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:17:43,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:17:43,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1586960.0, ans=0.125 2023-10-04 08:17:46,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 08:17:46,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:17:46,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 08:17:49,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 08:17:50,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:17:52,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:17:52,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:17:53,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:17:54,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:17:55,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:17:56,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 08:17:57,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 08:17:59,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:18:02,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:02,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:18:02,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:04,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:18:04,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 08:18:04,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 08:18:04,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 08:18:05,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:18:05,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 08:18:06,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 08:18:08,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:18:10,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 08:18:10,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1587093.3333333333, ans=0.125 2023-10-04 08:18:12,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:18:13,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:13,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:18:14,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 08:18:15,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1587160.0, ans=0.0 2023-10-04 08:18:16,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:18:16,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:16,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:18:17,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:18:17,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:18:20,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:18:23,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:24,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:18:24,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:18:27,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 08:18:29,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:18:30,995 INFO [train.py:1046] (3/4) Epoch 45, batch 4350, loss[loss=0.1616, simple_loss=0.2516, pruned_loss=0.03576, over 24001.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2338, pruned_loss=0.03631, over 4730208.57 frames. ], batch size: 80, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:18:33,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:18:35,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:38,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:18:38,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:18:43,740 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.00 vs. limit=15.0 2023-10-04 08:18:44,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:18:46,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1587293.3333333333, ans=0.0 2023-10-04 08:18:48,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:18:51,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:18:51,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:18:54,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:18:56,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:18:58,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:19:02,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 08:19:04,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:19:05,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:11,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:13,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 08:19:15,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:19:17,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:19:21,177 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 08:19:23,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:19:23,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:19:25,216 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 08:19:25,280 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 08:19:25,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:19:25,533 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:19:26,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:19:26,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:19:27,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:19:29,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:19:29,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:19:32,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 08:19:32,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:32,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:19:32,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:33,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 08:19:35,939 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 08:19:35,943 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 08:19:35,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 08:19:38,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:19:38,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:19:40,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:19:40,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:19:40,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1587493.3333333333, ans=0.95 2023-10-04 08:19:42,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 08:19:44,207 INFO [train.py:1046] (3/4) Epoch 45, batch 4400, loss[loss=0.1621, simple_loss=0.2345, pruned_loss=0.04483, over 22874.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2345, pruned_loss=0.03673, over 4728487.63 frames. ], batch size: 322, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:19:45,648 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 08:19:45,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:46,926 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.000e+02 2.178e+02 2.504e+02 3.543e+02, threshold=4.357e+02, percent-clipped=0.0 2023-10-04 08:19:49,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:19:49,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:19:50,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:19:50,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1587560.0, ans=0.2 2023-10-04 08:19:52,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 08:19:52,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 08:19:53,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 08:19:53,499 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 08:19:54,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:19:54,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:19:56,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 08:19:58,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:20:00,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:00,300 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 08:20:03,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:20:03,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 08:20:05,101 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 08:20:08,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 08:20:08,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 08:20:08,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 08:20:09,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:09,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:20:09,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:20:11,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:20:13,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 08:20:13,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 08:20:15,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:20:16,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:20:16,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:20:18,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:18,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:20:18,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 08:20:20,215 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 08:20:23,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:20:29,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:20:31,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 08:20:35,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:20:39,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:20:39,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1587760.0, ans=0.125 2023-10-04 08:20:42,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:20:42,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 08:20:43,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:20:43,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:20:43,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:20:44,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:20:47,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 08:20:51,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 08:20:53,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 08:20:53,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:20:53,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 08:20:53,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1587826.6666666667, ans=0.5 2023-10-04 08:20:54,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:20:57,389 INFO [train.py:1046] (3/4) Epoch 45, batch 4450, loss[loss=0.1579, simple_loss=0.2443, pruned_loss=0.0358, over 24653.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2346, pruned_loss=0.03633, over 4730009.28 frames. ], batch size: 68, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:20:57,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:21:00,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 08:21:02,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:21:04,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:04,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:21:12,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:21:12,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:21:15,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:18,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:21:21,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:21:21,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:21:22,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 08:21:22,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:21:24,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:24,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:21:24,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:21:24,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1587960.0, ans=0.125 2023-10-04 08:21:25,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 08:21:31,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:31,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:32,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:21:34,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:21:34,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:21:38,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 08:21:40,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 08:21:40,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 08:21:40,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:21:43,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:21:43,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 08:21:47,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:21:50,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:50,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 08:21:50,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:21:50,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:21:50,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:21:50,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:21:53,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:21:56,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:21:58,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 08:21:59,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:22:01,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:22:02,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:22:03,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:22:03,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 08:22:06,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:22:10,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 08:22:11,163 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.10 vs. limit=15.0 2023-10-04 08:22:11,844 INFO [train.py:1046] (3/4) Epoch 45, batch 4500, loss[loss=0.1528, simple_loss=0.2397, pruned_loss=0.03302, over 24500.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.235, pruned_loss=0.03657, over 4736320.56 frames. ], batch size: 66, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:22:13,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:22:15,290 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.063e+02 2.420e+02 3.061e+02 5.300e+02, threshold=4.841e+02, percent-clipped=1.0 2023-10-04 08:22:16,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:22:18,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 08:22:18,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 08:22:20,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:22:23,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:22:25,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:22:27,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:22:27,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1588293.3333333333, ans=0.0 2023-10-04 08:22:28,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:22:28,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:22:28,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:22:40,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:22:40,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:22:43,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:22:44,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:22:44,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:22:44,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff2.min_abs, batch_count=1588360.0, ans=0.1 2023-10-04 08:22:47,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.97 vs. limit=15.0 2023-10-04 08:22:51,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:22:56,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:22:58,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:23:00,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:23:00,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 08:23:02,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:02,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:23:06,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:23:06,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:23:08,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:23:08,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 08:23:08,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:23:08,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:08,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1588426.6666666667, ans=0.125 2023-10-04 08:23:12,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:23:12,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:23:16,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:18,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:23:20,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:23:20,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 08:23:23,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 08:23:23,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 08:23:25,639 INFO [train.py:1046] (3/4) Epoch 45, batch 4550, loss[loss=0.1531, simple_loss=0.2057, pruned_loss=0.05025, over 19587.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2344, pruned_loss=0.03634, over 4736816.69 frames. ], batch size: 389, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:23:26,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 08:23:29,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 08:23:30,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:23:33,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:23:33,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:23:35,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:23:41,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:23:42,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:23:44,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:23:44,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:23:44,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:23:46,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:23:46,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:23:50,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:23:52,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 08:23:54,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 08:23:54,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:23:56,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 08:23:56,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1588693.3333333333, ans=0.07 2023-10-04 08:23:56,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1588693.3333333333, ans=0.0 2023-10-04 08:24:00,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 08:24:00,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1588693.3333333333, ans=0.2 2023-10-04 08:24:01,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:24:03,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 08:24:04,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:24:07,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:07,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:07,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:24:10,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 08:24:13,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:24:16,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:16,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:24:16,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1588760.0, ans=0.035 2023-10-04 08:24:17,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:24:18,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 08:24:19,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 08:24:19,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:24:20,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 08:24:21,691 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.64 vs. limit=15.0 2023-10-04 08:24:22,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 08:24:24,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:24:25,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:24:26,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:24:27,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:27,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:24:28,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:24:30,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 08:24:31,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:24:31,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 08:24:33,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 08:24:33,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:24:33,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 08:24:35,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:24:36,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:24:38,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:24:38,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:24:40,217 INFO [train.py:1046] (3/4) Epoch 45, batch 4600, loss[loss=0.1628, simple_loss=0.2493, pruned_loss=0.03809, over 24632.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2338, pruned_loss=0.0365, over 4730059.96 frames. ], batch size: 68, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:24:40,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:24:40,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:24:42,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:24:43,542 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.936e+02 2.256e+02 2.645e+02 3.814e+02, threshold=4.511e+02, percent-clipped=0.0 2023-10-04 08:24:46,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:24:46,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:24:49,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:24:49,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:24:50,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:24:52,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 08:24:52,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:24:52,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.58 vs. limit=15.0 2023-10-04 08:24:56,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:24:56,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:24:59,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:06,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 08:25:06,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:09,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:13,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:25:13,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:25:16,803 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.87 vs. limit=15.0 2023-10-04 08:25:17,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 08:25:17,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:25:17,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:25:23,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:24,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:25:26,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:25:29,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 08:25:29,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:25:33,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=1589093.3333333333, ans=15.0 2023-10-04 08:25:34,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:35,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:25:38,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:38,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 08:25:38,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:25:39,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 08:25:39,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:39,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:25:41,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:25:41,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:25:43,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:25:43,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 08:25:44,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 08:25:44,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 08:25:44,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:25:47,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:25:47,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:25:49,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:25:54,417 INFO [train.py:1046] (3/4) Epoch 45, batch 4650, loss[loss=0.1579, simple_loss=0.2387, pruned_loss=0.03857, over 23258.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2333, pruned_loss=0.03626, over 4728458.59 frames. ], batch size: 119, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:26:00,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:26:03,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:26:03,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1589226.6666666667, ans=0.125 2023-10-04 08:26:04,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:26:04,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:26:04,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:26:04,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:26:06,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:26:07,933 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:26:09,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 08:26:12,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:26:15,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 08:26:15,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:26:15,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 08:26:17,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:26:18,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 08:26:18,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 08:26:18,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:18,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:26:21,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:26:21,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:26:22,982 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 08:26:25,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:26:27,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 08:26:30,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:30,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:26:32,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 08:26:33,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:26:36,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:26:37,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:26:43,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:45,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:26:47,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:26:47,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:26:49,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 08:26:49,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 08:26:50,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1589426.6666666667, ans=0.2 2023-10-04 08:26:51,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 08:26:51,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 08:26:54,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:27:00,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:27:00,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:27:00,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 08:27:00,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:02,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:27:02,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:27:02,913 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.26 vs. limit=6.0 2023-10-04 08:27:04,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:27:04,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1589493.3333333333, ans=0.125 2023-10-04 08:27:06,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:27:06,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:27:07,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:27:08,964 INFO [train.py:1046] (3/4) Epoch 45, batch 4700, loss[loss=0.1599, simple_loss=0.2427, pruned_loss=0.03859, over 24387.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2338, pruned_loss=0.03622, over 4730876.41 frames. ], batch size: 77, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:27:10,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:27:11,761 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.716e+02 2.046e+02 2.404e+02 2.908e+02 6.182e+02, threshold=4.807e+02, percent-clipped=8.0 2023-10-04 08:27:11,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:27:11,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:27:13,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 08:27:15,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:27:16,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 08:27:22,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:23,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:27:23,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:27:24,156 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:27:25,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:27:26,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:27:32,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 08:27:32,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 08:27:34,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:34,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1589626.6666666667, ans=0.0 2023-10-04 08:27:35,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:27:35,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:27:38,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:27:44,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:27:45,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 08:27:48,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:27:55,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 08:27:56,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:27:59,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:01,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 08:28:02,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:28:07,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:28:07,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 08:28:09,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:11,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:28:15,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:28:15,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:28:15,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 08:28:15,969 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 08:28:17,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:28:18,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1589826.6666666667, ans=0.125 2023-10-04 08:28:20,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:20,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:20,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 08:28:21,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:28:23,203 INFO [train.py:1046] (3/4) Epoch 45, batch 4750, loss[loss=0.1631, simple_loss=0.237, pruned_loss=0.04462, over 23797.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.0366, over 4721402.87 frames. ], batch size: 212, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:28:26,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 08:28:27,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:28:28,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:28:31,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:28:31,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:28:33,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 08:28:34,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:28:38,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 08:28:39,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:28:39,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:28:40,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:28:47,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 08:28:47,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1589960.0, ans=0.125 2023-10-04 08:28:52,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:28:54,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 08:28:55,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:28:57,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:28:57,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:28:58,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:29:00,232 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 08:29:00,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 08:29:05,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 08:29:07,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:29:08,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=3.91 vs. limit=12.0 2023-10-04 08:29:09,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:12,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:29:12,304 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 08:29:12,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:29:15,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:29:16,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:29:18,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 08:29:18,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 08:29:18,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:29:19,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:29:19,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:29:22,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:29:22,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 08:29:23,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 08:29:27,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:29:30,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:29:30,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 08:29:30,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:29:31,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:29:33,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:29:33,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:34,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:29:37,048 INFO [train.py:1046] (3/4) Epoch 45, batch 4800, loss[loss=0.1473, simple_loss=0.2344, pruned_loss=0.03007, over 24659.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2345, pruned_loss=0.03665, over 4726003.41 frames. ], batch size: 68, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 08:29:38,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:29:38,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 08:29:39,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 08:29:40,328 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.033e+02 2.288e+02 2.597e+02 3.954e+02, threshold=4.576e+02, percent-clipped=0.0 2023-10-04 08:29:41,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 08:29:43,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:29:44,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:29:44,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 08:29:49,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:50,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:29:50,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1590293.3333333333, ans=0.125 2023-10-04 08:29:55,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:29:56,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:29:56,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:29:56,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 08:29:58,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:29:59,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:30:01,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:30:03,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:05,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:05,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:30:08,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:08,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 08:30:08,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:30:09,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:30:12,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:13,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1590360.0, ans=0.125 2023-10-04 08:30:14,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:30:16,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:30:17,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:30:18,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:30:20,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:30:21,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 08:30:21,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 08:30:21,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:30:21,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:30:21,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:30:21,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:30:21,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:30:23,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:30:25,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:30:29,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:30:29,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:30,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:30:31,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1590426.6666666667, ans=0.1 2023-10-04 08:30:35,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 08:30:35,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:30:36,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:36,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:30:36,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1590493.3333333333, ans=0.1 2023-10-04 08:30:37,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:30:42,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:30:42,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:30:42,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:44,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:30:44,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:30:44,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:30:48,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:30:48,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:50,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:30:51,426 INFO [train.py:1046] (3/4) Epoch 45, batch 4850, loss[loss=0.1566, simple_loss=0.252, pruned_loss=0.03056, over 24448.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2352, pruned_loss=0.03715, over 4712489.36 frames. ], batch size: 69, lr: 2.25e-03, grad_scale: 32.0 2023-10-04 08:30:51,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 08:30:52,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 08:30:54,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:54,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:30:54,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:30:54,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:30:57,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:31:02,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1590560.0, ans=0.125 2023-10-04 08:31:03,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 08:31:04,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:31:09,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:31:09,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:31:09,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:31:14,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:31:15,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:31:17,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:31:17,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 08:31:20,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:31:20,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1590693.3333333333, ans=0.125 2023-10-04 08:31:23,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:31:23,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:31:24,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:31:24,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 08:31:27,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:31:27,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:31:31,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:31:31,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 08:31:31,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 08:31:32,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:31:39,071 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.64 vs. limit=15.0 2023-10-04 08:31:39,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:31:40,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 08:31:41,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:31:41,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:31:45,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:31:46,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 08:31:46,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:31:48,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 08:31:48,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:31:49,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:31:49,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 08:31:52,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1590826.6666666667, ans=0.125 2023-10-04 08:31:58,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:32:03,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:32:03,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:32:05,194 INFO [train.py:1046] (3/4) Epoch 45, batch 4900, loss[loss=0.16, simple_loss=0.2521, pruned_loss=0.03396, over 23750.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.03682, over 4719387.84 frames. ], batch size: 85, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:32:08,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 08:32:08,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:32:09,334 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.637e+02 2.000e+02 2.208e+02 2.639e+02 4.240e+02, threshold=4.416e+02, percent-clipped=0.0 2023-10-04 08:32:12,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:32:12,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:32:13,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:32:16,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 08:32:22,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 08:32:25,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 08:32:26,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 08:32:27,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:32:27,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:32:27,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:32:27,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:32:27,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:32:28,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 08:32:33,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 08:32:33,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:32:34,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:32:36,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:32:37,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:32:39,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:32:40,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:32:40,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 08:32:42,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:32:43,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:32:43,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 08:32:43,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 08:32:46,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-10-04 08:32:47,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 08:32:47,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1591026.6666666667, ans=0.125 2023-10-04 08:32:47,798 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=9.49 vs. limit=15.0 2023-10-04 08:32:50,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:32:51,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:32:51,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:32:51,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:32:51,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 08:32:51,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:32:53,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 08:32:56,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:32:57,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:32:59,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:33:00,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 08:33:02,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:33:02,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 08:33:03,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 08:33:09,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:33:10,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:33:13,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 08:33:13,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 08:33:13,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:33:15,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:33:17,344 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.61 vs. limit=15.0 2023-10-04 08:33:19,774 INFO [train.py:1046] (3/4) Epoch 45, batch 4950, loss[loss=0.1629, simple_loss=0.2571, pruned_loss=0.03436, over 24654.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2337, pruned_loss=0.0367, over 4709969.75 frames. ], batch size: 73, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:33:19,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:33:19,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:33:19,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:33:21,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 08:33:21,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:33:23,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1591226.6666666667, ans=15.0 2023-10-04 08:33:24,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:33:25,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 08:33:27,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 08:33:27,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 08:33:27,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:33:28,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 08:33:28,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:28,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:33:28,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:33:28,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:33:30,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:33:31,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:33:33,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:33:34,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:33:37,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:37,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:33:40,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:33:46,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:47,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:33:49,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:33:51,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:33:52,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:33:52,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1591360.0, ans=0.2 2023-10-04 08:33:54,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 08:33:54,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 08:33:58,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:33:59,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:33:59,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:34:01,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:34:01,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:34:02,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:34:05,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:34:06,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:34:09,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:34:10,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:34:10,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:34:12,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 08:34:12,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:34:14,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:34:18,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:34:18,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:34:19,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:34:20,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:34:21,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:34:21,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:34:25,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:34:25,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:34:25,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:34:26,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 08:34:27,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1591493.3333333333, ans=0.2 2023-10-04 08:34:29,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:34:33,782 INFO [train.py:1046] (3/4) Epoch 45, batch 5000, loss[loss=0.1385, simple_loss=0.1934, pruned_loss=0.04177, over 19430.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2334, pruned_loss=0.03628, over 4721095.62 frames. ], batch size: 388, lr: 2.25e-03, grad_scale: 16.0 2023-10-04 08:34:35,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 08:34:35,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 08:34:39,136 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.771e+02 2.112e+02 2.442e+02 2.961e+02 4.557e+02, threshold=4.884e+02, percent-clipped=1.0 2023-10-04 08:34:43,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:34:43,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:34:43,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 08:34:43,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1591560.0, ans=0.1 2023-10-04 08:34:44,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 08:34:45,497 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.72 vs. limit=15.0 2023-10-04 08:34:47,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:34:48,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 08:34:49,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:34:49,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:34:49,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 08:34:49,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1591626.6666666667, ans=0.0 2023-10-04 08:34:50,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:34:50,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:34:52,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 08:34:52,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:34:52,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:34:54,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 08:34:56,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 08:34:56,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:34:57,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 08:34:57,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:34:57,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:34:58,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:34:58,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 08:34:59,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 08:35:00,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 08:35:00,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:35:02,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:35:03,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 08:35:03,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:35:04,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:35:06,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:35:07,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 08:35:09,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 08:35:10,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:35:11,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:35:15,958 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 08:35:17,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:35:19,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:35:19,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:23,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 08:35:24,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:35:24,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:35:24,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1591760.0, ans=0.1 2023-10-04 08:35:25,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:35:27,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 08:35:29,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:35:30,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:35:33,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:35:34,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1591826.6666666667, ans=0.0 2023-10-04 08:35:37,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 08:35:41,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:47,240 INFO [train.py:1046] (3/4) Epoch 45, batch 5050, loss[loss=0.1566, simple_loss=0.2495, pruned_loss=0.03182, over 24326.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2329, pruned_loss=0.03603, over 4726614.11 frames. ], batch size: 74, lr: 2.25e-03, grad_scale: 8.0 2023-10-04 08:35:50,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:35:50,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:51,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:35:51,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:35:53,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:35:53,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:35:53,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:55,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1591893.3333333333, ans=0.0 2023-10-04 08:35:58,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:35:58,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 08:35:59,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:36:01,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:36:02,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:36:04,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 08:36:04,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:36:05,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:36:07,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:36:08,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:36:08,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:36:16,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 08:36:17,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:36:18,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:36:18,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 08:36:18,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:36:21,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:36:21,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:36:23,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:36:23,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 08:36:24,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 08:36:26,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:36:27,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:36:30,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:36:30,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1592093.3333333333, ans=0.0 2023-10-04 08:36:32,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 08:36:33,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:36:35,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 08:36:36,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:36:36,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:36:38,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:36:38,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:36:40,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:36:42,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:36:43,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:43,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:36:43,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:36:44,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 08:36:44,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:36:46,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:36:49,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:36:49,252 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 08:36:49,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:36:50,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:36:52,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:52,372 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 08:36:53,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1592160.0, ans=0.125 2023-10-04 08:36:55,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:36:55,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 08:36:55,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:57,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:36:58,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:36:58,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 08:36:58,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1592160.0, ans=0.2 2023-10-04 08:37:00,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 08:37:01,988 INFO [train.py:1046] (3/4) Epoch 45, batch 5100, loss[loss=0.1378, simple_loss=0.2119, pruned_loss=0.03191, over 23708.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2332, pruned_loss=0.03629, over 4724046.06 frames. ], batch size: 149, lr: 2.24e-03, grad_scale: 8.0 2023-10-04 08:37:03,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:37:03,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:37:03,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:37:03,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1592226.6666666667, ans=0.07 2023-10-04 08:37:05,974 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 08:37:07,264 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.979e+02 2.119e+02 2.358e+02 3.619e+02, threshold=4.239e+02, percent-clipped=0.0 2023-10-04 08:37:08,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:37:10,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 08:37:10,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 08:37:10,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:37:11,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1592226.6666666667, ans=0.2 2023-10-04 08:37:11,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1592226.6666666667, ans=0.0 2023-10-04 08:37:12,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:37:13,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:37:14,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 08:37:14,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 08:37:20,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:37:20,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:37:20,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1592293.3333333333, ans=0.125 2023-10-04 08:37:21,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1592293.3333333333, ans=0.0 2023-10-04 08:37:24,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:37:28,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 08:37:28,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:37:29,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:37:29,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 08:37:32,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:32,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:32,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 08:37:35,733 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 08:37:37,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:37,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 08:37:37,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 08:37:37,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1592360.0, ans=0.2 2023-10-04 08:37:39,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:37:48,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:37:49,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 08:37:51,318 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 08:37:51,326 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 08:37:52,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 08:37:52,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:37:54,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1592426.6666666667, ans=0.0 2023-10-04 08:37:56,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 08:38:01,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 08:38:01,503 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:38:03,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 08:38:06,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:38:10,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 08:38:12,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:38:12,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 08:38:15,177 INFO [train.py:1046] (3/4) Epoch 45, batch 5150, loss[loss=0.1411, simple_loss=0.2186, pruned_loss=0.03177, over 18924.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2344, pruned_loss=0.0366, over 4724767.28 frames. ], batch size: 41, lr: 2.24e-03, grad_scale: 8.0 2023-10-04 08:38:18,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:38:18,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:38:18,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:38:18,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:38:18,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:38:19,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:38:20,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 08:38:20,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 08:38:22,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 08:38:22,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:38:22,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 08:38:23,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:38:25,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 08:38:26,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:38:26,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:38:33,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1592626.6666666667, ans=0.125 2023-10-04 08:38:34,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:38:34,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 08:38:36,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:38:36,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:38:37,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 08:38:37,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:38:37,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:38:39,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:38:39,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:38:39,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 08:38:40,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:38:40,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:38:43,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:38:44,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 08:38:46,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:38:49,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1592693.3333333333, ans=0.0 2023-10-04 08:38:50,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:38:53,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 08:38:54,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:38:58,462 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.96 vs. limit=15.0 2023-10-04 08:39:01,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:39:02,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:39:05,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:39:05,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:39:08,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 08:39:11,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:39:12,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:39:12,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:39:15,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:39:16,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:39:18,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 08:39:22,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:39:23,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:39:27,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:39:27,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:39:28,937 INFO [train.py:1046] (3/4) Epoch 45, batch 5200, loss[loss=0.1485, simple_loss=0.2317, pruned_loss=0.03266, over 24440.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2354, pruned_loss=0.03688, over 4730984.54 frames. ], batch size: 63, lr: 2.24e-03, grad_scale: 16.0 2023-10-04 08:39:28,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:39:29,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:39:29,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:39:29,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:39:32,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:39:33,168 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.17 vs. limit=15.0 2023-10-04 08:39:34,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:39:35,578 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.121e+02 2.351e+02 2.872e+02 5.392e+02, threshold=4.702e+02, percent-clipped=2.0 2023-10-04 08:39:38,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:39:39,066 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.17 vs. limit=22.5 2023-10-04 08:39:41,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 08:39:41,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:39:41,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:39:44,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:39:44,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1592960.0, ans=0.2 2023-10-04 08:39:45,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:39:45,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:39:47,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 08:39:49,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:39:49,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:39:52,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 08:39:55,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:39:57,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:39:57,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 08:39:57,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 08:40:01,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 08:40:01,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:40:01,993 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 08:40:02,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:40:03,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:03,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:40:05,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 08:40:05,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:40:06,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:40:09,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 08:40:09,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 08:40:11,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 08:40:14,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1593093.3333333333, ans=0.125 2023-10-04 08:40:15,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 08:40:16,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:40:20,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:40:22,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:40:22,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 08:40:24,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:40:24,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 08:40:24,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:25,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:40:28,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:40:30,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:40:33,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:40:33,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:40:33,273 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:39,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:40:39,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 08:40:39,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1593160.0, ans=0.125 2023-10-04 08:40:41,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:40:41,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:40:42,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:40:42,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:40:43,936 INFO [train.py:1046] (3/4) Epoch 45, batch 5250, loss[loss=0.1423, simple_loss=0.2216, pruned_loss=0.03146, over 19320.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2345, pruned_loss=0.03651, over 4728181.49 frames. ], batch size: 42, lr: 2.24e-03, grad_scale: 16.0 2023-10-04 08:40:44,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:40:46,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:40:48,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:40:48,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:40:50,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:40:57,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:40:57,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:41:00,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:41:01,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:41:03,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 08:41:03,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:41:06,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:41:33,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1593426.6666666667, ans=0.1 2023-10-04 08:41:43,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1593493.3333333333, ans=0.125 2023-10-04 08:41:52,489 INFO [train.py:1046] (3/4) Epoch 45, batch 5300, loss[loss=0.1412, simple_loss=0.2062, pruned_loss=0.03813, over 23661.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2333, pruned_loss=0.0365, over 4722027.11 frames. ], batch size: 232, lr: 2.24e-03, grad_scale: 16.0 2023-10-04 08:41:54,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1593560.0, ans=0.125 2023-10-04 08:41:58,017 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 2.084e+02 2.270e+02 2.444e+02 3.408e+02, threshold=4.540e+02, percent-clipped=0.0 2023-10-04 08:42:05,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1593626.6666666667, ans=0.0 2023-10-04 08:42:06,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:42:06,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 08:42:06,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 08:42:06,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:42:07,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:07,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:07,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:07,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:42:07,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:07,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:42:07,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:42:07,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:42:07,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 08:42:07,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 08:42:07,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 08:42:08,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 08:42:08,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 08:42:08,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 08:42:08,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:08,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:42:08,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:42:08,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:42:09,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:42:09,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:42:09,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:42:09,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:09,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:42:09,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:42:09,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:42:09,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:09,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:42:10,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 08:42:10,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:42:10,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:42:10,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 08:42:10,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 08:42:10,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:42:10,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:42:10,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 08:42:10,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 08:42:10,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:42:11,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:42:11,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:42:11,878 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 08:42:11,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 08:42:11,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:42:12,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:42:12,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 08:42:12,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 08:42:12,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 08:42:12,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:42:14,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1593640.0, ans=0.0 2023-10-04 08:42:14,724 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.58 vs. limit=15.0 2023-10-04 08:42:16,535 INFO [train.py:1046] (3/4) Epoch 46, batch 0, loss[loss=0.16, simple_loss=0.2316, pruned_loss=0.04415, over 23754.00 frames. ], tot_loss[loss=0.16, simple_loss=0.2316, pruned_loss=0.04415, over 23754.00 frames. ], batch size: 164, lr: 2.22e-03, grad_scale: 32.0 2023-10-04 08:42:16,536 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 08:42:27,682 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([2.5059, 2.2830, 1.9013, 2.3043, 2.0204, 2.1989, 2.2574, 2.1587], device='cuda:3') 2023-10-04 08:42:28,889 INFO [train.py:1078] (3/4) Epoch 46, validation: loss=0.3372, simple_loss=0.2742, pruned_loss=0.2001, over 1125622.00 frames. 2023-10-04 08:42:28,890 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 08:42:28,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 08:42:29,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:42:32,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:42:36,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:42:36,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:42:36,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:36,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 08:42:39,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 08:42:40,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:42,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:43,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1593706.6666666667, ans=0.1 2023-10-04 08:42:43,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1593706.6666666667, ans=0.125 2023-10-04 08:42:46,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:42:47,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:42:47,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:42:47,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:42:49,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 08:42:49,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1593706.6666666667, ans=0.125 2023-10-04 08:42:51,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:42:59,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:42:59,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:43:01,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 08:43:02,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:43:02,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:43:06,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:43:10,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:43:13,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:43:18,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 08:43:22,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 08:43:24,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:43:24,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:43:24,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:43:25,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:43:27,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 08:43:29,022 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:43:30,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:43:30,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1593906.6666666667, ans=0.07 2023-10-04 08:43:32,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:43:34,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:43:40,194 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 08:43:41,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:43:42,897 INFO [train.py:1046] (3/4) Epoch 46, batch 50, loss[loss=0.1424, simple_loss=0.223, pruned_loss=0.03087, over 23257.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2351, pruned_loss=0.03697, over 1074529.23 frames. ], batch size: 119, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:43:44,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:43:46,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:43:46,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 08:43:47,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 08:43:47,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:43:48,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:43:51,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:43:52,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:43:55,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 08:43:57,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:44:01,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:44:03,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 08:44:06,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 08:44:06,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:44:08,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:44:08,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:44:10,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:44:11,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:44:12,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 08:44:12,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:44:13,578 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=15.0 2023-10-04 08:44:21,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:44:21,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:44:21,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:44:22,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 08:44:24,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:44:24,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.48 vs. limit=22.5 2023-10-04 08:44:25,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:44:25,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 08:44:26,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:44:28,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 08:44:37,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:44:37,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:44:39,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:44:40,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:44:40,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:44:43,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 08:44:44,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 08:44:45,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:44:47,275 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.970e+02 2.347e+02 2.916e+02 8.307e+02, threshold=4.693e+02, percent-clipped=7.0 2023-10-04 08:44:47,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:44:48,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:44:48,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:44:48,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 08:44:50,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 08:44:51,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 08:44:51,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:44:51,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:44:52,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 08:44:52,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 08:44:53,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:44:54,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:44:55,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:44:55,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:44:57,054 INFO [train.py:1046] (3/4) Epoch 46, batch 100, loss[loss=0.1483, simple_loss=0.224, pruned_loss=0.03632, over 23405.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2359, pruned_loss=0.03728, over 1887985.79 frames. ], batch size: 285, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:44:57,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:45:01,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:45:04,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:45:07,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 08:45:07,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:45:10,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:45:10,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:45:10,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:45:10,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:45:10,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:45:13,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 08:45:15,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:45:16,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:45:16,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:45:16,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:45:18,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.88 vs. limit=12.0 2023-10-04 08:45:20,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 08:45:21,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:45:22,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:45:23,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:45:24,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:45:28,760 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 08:45:28,783 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 08:45:30,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:45:30,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:45:34,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:45:34,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1594440.0, ans=0.125 2023-10-04 08:45:37,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:45:39,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:45:45,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:45:47,194 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 08:45:49,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 08:45:52,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:45:53,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:45:53,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1594506.6666666667, ans=0.125 2023-10-04 08:45:56,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:45:58,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:00,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1594573.3333333333, ans=0.0 2023-10-04 08:46:01,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:46:02,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:46:05,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:46:05,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:07,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:07,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:46:07,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:46:07,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 08:46:07,305 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:46:08,972 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 08:46:08,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:09,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:46:10,309 INFO [train.py:1046] (3/4) Epoch 46, batch 150, loss[loss=0.1523, simple_loss=0.2416, pruned_loss=0.03149, over 24663.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2356, pruned_loss=0.03738, over 2516446.14 frames. ], batch size: 68, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:46:10,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:10,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:46:10,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 08:46:10,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:46:11,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:46:11,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:12,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:13,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:46:14,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:46:14,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:46:18,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:46:21,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:46:21,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:46:23,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:25,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:46:25,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:28,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:46:28,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:32,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 08:46:32,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 08:46:32,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 08:46:35,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:46:35,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:46:36,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:46:36,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:46:36,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:38,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:38,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:46:40,051 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 08:46:41,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:46:46,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1594773.3333333333, ans=0.0 2023-10-04 08:46:47,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:46:52,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 08:46:53,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 08:46:57,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:46:57,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:46:57,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:46:59,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:47:00,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:47:00,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:47:00,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:02,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 08:47:04,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:06,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:06,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:47:06,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:47:07,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:09,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 08:47:12,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 08:47:13,937 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.745e+02 2.011e+02 2.305e+02 2.782e+02 3.592e+02, threshold=4.611e+02, percent-clipped=0.0 2023-10-04 08:47:14,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:47:15,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:47:17,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:47:17,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 08:47:17,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:47:17,349 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 08:47:20,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:47:21,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1594906.6666666667, ans=0.125 2023-10-04 08:47:22,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1594906.6666666667, ans=0.125 2023-10-04 08:47:24,670 INFO [train.py:1046] (3/4) Epoch 46, batch 200, loss[loss=0.1626, simple_loss=0.238, pruned_loss=0.04362, over 22711.00 frames. ], tot_loss[loss=0.1564, simple_loss=0.2367, pruned_loss=0.03804, over 3005528.22 frames. ], batch size: 322, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:47:24,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:47:26,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:47:28,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 08:47:28,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:47:30,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:31,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 08:47:33,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 08:47:35,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:35,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:47:36,675 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.74 vs. limit=6.0 2023-10-04 08:47:39,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:47:39,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:47:39,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:47:46,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1595040.0, ans=0.1 2023-10-04 08:47:58,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:47:58,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:47:59,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 08:48:01,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:48:01,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 08:48:01,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:48:02,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:04,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:48:04,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:48:04,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:48:04,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.50 vs. limit=15.0 2023-10-04 08:48:05,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 08:48:05,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 08:48:05,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:48:07,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.83 vs. limit=15.0 2023-10-04 08:48:08,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1595173.3333333333, ans=0.125 2023-10-04 08:48:10,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:48:16,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:48:25,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:25,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:48:32,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:33,213 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.91 vs. limit=22.5 2023-10-04 08:48:35,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 08:48:35,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:48:35,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:48:35,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:48:35,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1595240.0, ans=0.1 2023-10-04 08:48:36,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 08:48:37,885 INFO [train.py:1046] (3/4) Epoch 46, batch 250, loss[loss=0.1553, simple_loss=0.2443, pruned_loss=0.03319, over 24613.00 frames. ], tot_loss[loss=0.1558, simple_loss=0.2357, pruned_loss=0.0379, over 3387785.14 frames. ], batch size: 68, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:48:37,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 08:48:38,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:48:39,396 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 08:48:40,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:44,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:48:44,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:44,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1595306.6666666667, ans=0.0 2023-10-04 08:48:46,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:48:47,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:48:47,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:48:48,202 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.93 vs. limit=15.0 2023-10-04 08:48:49,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:48:50,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.74 vs. limit=6.0 2023-10-04 08:48:51,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:48:53,343 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:49:02,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:49:05,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:49:05,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1595373.3333333333, ans=0.125 2023-10-04 08:49:06,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:49:11,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 08:49:12,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 08:49:13,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:49:15,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:49:15,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:49:15,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:49:17,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:49:18,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:49:21,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 08:49:21,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:49:21,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:49:22,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:49:22,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:49:23,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:49:23,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1595506.6666666667, ans=0.1 2023-10-04 08:49:26,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:49:26,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:49:27,625 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:49:30,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:49:30,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:49:33,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:49:39,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:49:42,116 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.009e+02 2.186e+02 2.461e+02 3.268e+02, threshold=4.371e+02, percent-clipped=0.0 2023-10-04 08:49:42,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:49:47,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:49:48,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:49:52,408 INFO [train.py:1046] (3/4) Epoch 46, batch 300, loss[loss=0.1649, simple_loss=0.2508, pruned_loss=0.03946, over 24553.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2336, pruned_loss=0.03738, over 3689586.47 frames. ], batch size: 71, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:49:52,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 08:49:53,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:49:53,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 08:49:55,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 08:49:55,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 08:49:56,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:49:56,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 08:49:57,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1595640.0, ans=0.125 2023-10-04 08:49:58,894 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.66 vs. limit=12.0 2023-10-04 08:50:01,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:50:01,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:50:04,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:50:04,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 08:50:06,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:50:07,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 08:50:09,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 08:50:09,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:50:13,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:50:17,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:50:17,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 08:50:21,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 08:50:21,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:24,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:50:24,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1595773.3333333333, ans=0.125 2023-10-04 08:50:25,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:25,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 08:50:25,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 08:50:28,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:50:29,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:50:31,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:50:33,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1595773.3333333333, ans=0.125 2023-10-04 08:50:35,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 08:50:35,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 08:50:35,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:50:37,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:39,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 08:50:41,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:50:43,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.38 vs. limit=15.0 2023-10-04 08:50:45,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:50:46,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:50:46,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 08:50:49,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:49,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 08:50:52,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:55,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:50:56,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 08:50:56,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:50:57,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:50:59,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 08:50:59,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:50:59,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:01,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:51:02,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:02,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:07,710 INFO [train.py:1046] (3/4) Epoch 46, batch 350, loss[loss=0.1487, simple_loss=0.2257, pruned_loss=0.03588, over 23607.00 frames. ], tot_loss[loss=0.153, simple_loss=0.232, pruned_loss=0.03698, over 3903798.64 frames. ], batch size: 120, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:51:07,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:51:07,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 08:51:11,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:11,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1595973.3333333333, ans=0.0 2023-10-04 08:51:14,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1595973.3333333333, ans=0.0 2023-10-04 08:51:16,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:51:19,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1595973.3333333333, ans=0.0 2023-10-04 08:51:20,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:20,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:20,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1595973.3333333333, ans=0.125 2023-10-04 08:51:23,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 08:51:23,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:51:24,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 08:51:27,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:27,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 08:51:27,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:51:30,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 08:51:33,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:51:33,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:51:35,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:51:37,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:51:39,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:51:39,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:51:39,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:39,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:51:41,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:51:41,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:44,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1596106.6666666667, ans=0.2 2023-10-04 08:51:49,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:51:49,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 08:51:51,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:51:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:52,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1596173.3333333333, ans=0.125 2023-10-04 08:51:56,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 08:51:56,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:51:59,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:51:59,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:52:00,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:52:01,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 08:52:02,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:04,211 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 08:52:06,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 08:52:06,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:09,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 08:52:09,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 08:52:11,458 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.680e+02 2.124e+02 2.443e+02 2.962e+02 4.613e+02, threshold=4.885e+02, percent-clipped=1.0 2023-10-04 08:52:13,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:15,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 08:52:17,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:18,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:18,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:52:19,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:52:22,596 INFO [train.py:1046] (3/4) Epoch 46, batch 400, loss[loss=0.1496, simple_loss=0.2309, pruned_loss=0.03414, over 24688.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2314, pruned_loss=0.03666, over 4075756.59 frames. ], batch size: 65, lr: 2.22e-03, grad_scale: 32.0 2023-10-04 08:52:22,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:52:25,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 08:52:26,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 08:52:27,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:27,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:52:28,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:52:28,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:31,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:31,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:32,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 08:52:35,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 08:52:35,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:52:37,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 08:52:38,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:41,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:52:41,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:52:41,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 08:52:43,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:52:43,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:52:43,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:52:43,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:52:47,188 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 08:52:47,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 08:52:51,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:52:52,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:52:52,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 08:52:55,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 08:52:58,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:53:00,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:53:00,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1596440.0, ans=0.125 2023-10-04 08:53:04,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1596440.0, ans=0.05 2023-10-04 08:53:06,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 08:53:07,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1596506.6666666667, ans=0.07 2023-10-04 08:53:09,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 08:53:10,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 08:53:14,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:53:14,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:53:16,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 08:53:16,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1596506.6666666667, ans=0.0 2023-10-04 08:53:16,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1596506.6666666667, ans=0.125 2023-10-04 08:53:19,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:53:22,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 08:53:23,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:53:26,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:53:26,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 08:53:29,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 08:53:29,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 08:53:30,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 08:53:30,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:53:33,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 08:53:36,736 INFO [train.py:1046] (3/4) Epoch 46, batch 450, loss[loss=0.1425, simple_loss=0.2203, pruned_loss=0.03231, over 23645.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2327, pruned_loss=0.03674, over 4211574.19 frames. ], batch size: 149, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:53:36,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:53:36,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:53:36,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 08:53:38,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 08:53:38,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:53:39,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:53:39,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:53:39,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 08:53:41,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:53:42,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 08:53:43,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 08:53:53,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:53:53,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:53:55,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 08:53:56,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 08:54:00,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:54:03,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:54:04,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:54:08,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:54:08,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:54:10,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 08:54:12,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 08:54:12,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 08:54:12,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:54:13,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:54:15,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 08:54:17,248 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 08:54:17,257 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 08:54:17,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:54:17,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1596773.3333333333, ans=0.5 2023-10-04 08:54:18,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:54:20,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 08:54:24,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:54:24,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 08:54:25,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 08:54:26,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 08:54:28,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:54:30,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 08:54:31,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 08:54:32,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 08:54:35,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 08:54:35,426 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:54:36,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 08:54:36,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 08:54:38,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 08:54:41,394 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.934e+02 2.148e+02 2.455e+02 3.795e+02, threshold=4.297e+02, percent-clipped=0.0 2023-10-04 08:54:44,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:54:46,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:54:47,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:54:47,696 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 08:54:50,907 INFO [train.py:1046] (3/4) Epoch 46, batch 500, loss[loss=0.1674, simple_loss=0.2393, pruned_loss=0.0477, over 23433.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2343, pruned_loss=0.03725, over 4322998.83 frames. ], batch size: 285, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 08:54:52,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:54:53,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 08:54:53,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:54:53,810 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 08:54:55,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 08:54:55,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:54:58,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 08:55:02,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 08:55:02,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 08:55:05,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:55:05,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:55:05,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:16,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:55:17,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 08:55:17,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 08:55:17,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:55:19,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 08:55:19,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 08:55:22,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:55:23,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 08:55:23,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 08:55:23,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:55:23,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 08:55:26,752 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 08:55:28,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:55:30,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:32,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:32,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:33,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 08:55:33,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 08:55:37,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 08:55:39,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:55:43,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:55:46,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:55:51,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:55:55,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 08:55:55,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:55:56,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:55:59,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 08:55:59,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 08:56:00,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:56:03,402 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.07 vs. limit=15.0 2023-10-04 08:56:04,262 INFO [train.py:1046] (3/4) Epoch 46, batch 550, loss[loss=0.159, simple_loss=0.2336, pruned_loss=0.04221, over 23464.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2355, pruned_loss=0.03786, over 4401494.42 frames. ], batch size: 285, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:56:05,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 08:56:06,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 08:56:08,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:56:08,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 08:56:08,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:56:08,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:56:09,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:11,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:11,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 08:56:13,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:56:14,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:56:14,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 08:56:15,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:56:20,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:56:20,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:23,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:56:25,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:28,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 08:56:29,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 08:56:30,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:56:33,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1597440.0, ans=0.125 2023-10-04 08:56:35,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1597440.0, ans=0.1 2023-10-04 08:56:36,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:56:36,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:56:37,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 08:56:40,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:56:40,566 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 08:56:40,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1597440.0, ans=0.1 2023-10-04 08:56:42,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:56:45,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 08:56:46,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 08:56:47,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 08:56:47,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 08:56:49,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:56:49,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 08:56:49,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 08:56:51,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:56:51,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:56:52,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:56:52,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:56:56,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:56:56,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1597506.6666666667, ans=0.125 2023-10-04 08:56:57,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 08:56:59,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.48 vs. limit=15.0 2023-10-04 08:57:00,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:57:00,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:00,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 08:57:01,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 08:57:03,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:57:03,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:57:04,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:04,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 08:57:05,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 08:57:10,265 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.017e+02 2.263e+02 2.749e+02 3.801e+02, threshold=4.526e+02, percent-clipped=0.0 2023-10-04 08:57:11,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 08:57:14,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 08:57:16,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:57:16,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:57:16,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:57:17,313 INFO [train.py:1046] (3/4) Epoch 46, batch 600, loss[loss=0.1646, simple_loss=0.2516, pruned_loss=0.03877, over 24409.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2352, pruned_loss=0.03771, over 4459496.86 frames. ], batch size: 77, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:57:22,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1597640.0, ans=0.125 2023-10-04 08:57:23,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:57:24,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1597640.0, ans=0.125 2023-10-04 08:57:27,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 08:57:28,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 08:57:31,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 08:57:31,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1597706.6666666667, ans=0.125 2023-10-04 08:57:33,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:57:35,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:38,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 08:57:38,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:57:44,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 08:57:44,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1597706.6666666667, ans=0.125 2023-10-04 08:57:47,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1597773.3333333333, ans=0.0 2023-10-04 08:57:48,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:57:48,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:57:48,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 08:57:53,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1597773.3333333333, ans=0.125 2023-10-04 08:57:54,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:57:54,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:57:54,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:58:02,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:58:06,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:58:06,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 08:58:06,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:58:11,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1597840.0, ans=0.2 2023-10-04 08:58:12,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 08:58:19,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 08:58:19,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:58:20,048 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.75 vs. limit=15.0 2023-10-04 08:58:21,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 08:58:22,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 08:58:24,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 08:58:24,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 08:58:25,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 08:58:30,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 08:58:31,715 INFO [train.py:1046] (3/4) Epoch 46, batch 650, loss[loss=0.1541, simple_loss=0.2425, pruned_loss=0.03286, over 24679.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2347, pruned_loss=0.03741, over 4523320.33 frames. ], batch size: 65, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:58:31,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 08:58:34,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:58:35,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 08:58:37,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:58:40,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 08:58:41,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:58:43,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1597973.3333333333, ans=0.125 2023-10-04 08:58:44,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 08:58:44,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:58:49,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:58:49,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1598040.0, ans=0.125 2023-10-04 08:58:52,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 08:58:54,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1598040.0, ans=0.09899494936611666 2023-10-04 08:58:55,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:58:55,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:58:58,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:58:58,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 08:59:01,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:59:02,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:02,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 08:59:04,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:05,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 08:59:08,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 08:59:08,448 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 08:59:08,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:59:08,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:59:12,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:12,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1598106.6666666667, ans=0.05 2023-10-04 08:59:13,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:59:15,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:59:15,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 08:59:17,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 08:59:18,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 08:59:18,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 08:59:18,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1598173.3333333333, ans=0.0 2023-10-04 08:59:19,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 08:59:19,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 08:59:20,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 08:59:23,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 08:59:23,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1598173.3333333333, ans=0.125 2023-10-04 08:59:24,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 08:59:24,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:24,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 08:59:26,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 08:59:26,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 08:59:27,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 08:59:32,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:32,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 08:59:34,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 08:59:35,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:59:36,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 08:59:37,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 08:59:38,310 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.015e+02 2.292e+02 2.659e+02 4.120e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-04 08:59:41,437 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 08:59:43,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 08:59:43,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:59:43,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 08:59:43,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 08:59:44,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1598306.6666666667, ans=10.0 2023-10-04 08:59:45,589 INFO [train.py:1046] (3/4) Epoch 46, batch 700, loss[loss=0.1608, simple_loss=0.249, pruned_loss=0.0363, over 23983.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2332, pruned_loss=0.03723, over 4559514.83 frames. ], batch size: 80, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 08:59:49,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 08:59:49,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 08:59:53,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 08:59:54,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 08:59:55,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 08:59:57,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 09:00:01,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:00:03,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:00:05,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:00:07,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:00:07,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:00:10,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:00:12,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 09:00:12,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:00:14,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 09:00:15,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1598440.0, ans=0.125 2023-10-04 09:00:17,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 09:00:20,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:00:21,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:00:23,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:00:26,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:00:26,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 09:00:31,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:00:31,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:00:31,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 09:00:34,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1598506.6666666667, ans=0.125 2023-10-04 09:00:35,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:00:36,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:00:38,809 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.10 vs. limit=15.0 2023-10-04 09:00:39,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:00:44,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:00:44,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 09:00:47,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 09:00:47,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 09:00:47,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1598573.3333333333, ans=0.1 2023-10-04 09:00:49,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:00:51,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:00:53,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:00:55,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:00:55,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 09:00:59,920 INFO [train.py:1046] (3/4) Epoch 46, batch 750, loss[loss=0.1649, simple_loss=0.2418, pruned_loss=0.04404, over 24002.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2334, pruned_loss=0.0371, over 4601368.77 frames. ], batch size: 196, lr: 2.22e-03, grad_scale: 8.0 2023-10-04 09:01:01,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 09:01:01,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 09:01:01,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 09:01:03,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 09:01:03,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 09:01:03,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:01:06,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 09:01:06,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:01:07,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:01:10,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:01:11,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:01:11,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:01:11,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:01:14,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:01:16,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:01:17,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:01:20,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:01:20,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:01:22,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 09:01:23,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:01:25,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:01:27,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:01:28,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:01:30,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 09:01:30,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:01:32,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 09:01:32,854 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 09:01:34,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 09:01:34,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:01:34,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:01:36,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:01:43,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:01:43,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:01:43,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:01:45,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:01:45,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:01:47,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 09:01:47,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:01:48,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 09:01:50,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:01:53,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:01:53,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 09:01:54,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:02:00,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:00,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1598906.6666666667, ans=0.1 2023-10-04 09:02:01,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:02:01,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:04,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:02:06,748 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.037e+02 2.258e+02 2.551e+02 3.884e+02, threshold=4.516e+02, percent-clipped=0.0 2023-10-04 09:02:08,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 09:02:08,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:02:09,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1598906.6666666667, ans=0.1 2023-10-04 09:02:10,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:02:12,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:02:12,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:02:14,216 INFO [train.py:1046] (3/4) Epoch 46, batch 800, loss[loss=0.1516, simple_loss=0.2341, pruned_loss=0.03452, over 23407.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2336, pruned_loss=0.03691, over 4633899.30 frames. ], batch size: 106, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 09:02:15,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:02:15,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:02:23,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:02:23,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:24,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:02:25,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:02:27,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:27,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:27,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1599040.0, ans=0.125 2023-10-04 09:02:27,734 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.22 vs. limit=10.0 2023-10-04 09:02:31,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:31,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1599040.0, ans=0.125 2023-10-04 09:02:35,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:35,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:02:36,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1599040.0, ans=0.1 2023-10-04 09:02:38,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 09:02:39,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:39,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:02:41,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:02:41,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:02:41,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 09:02:41,877 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.63 vs. limit=15.0 2023-10-04 09:02:42,659 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:42,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 09:02:43,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1599106.6666666667, ans=0.0 2023-10-04 09:02:45,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:02:46,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:02:49,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:02:49,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:02:51,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:51,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:02:53,494 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.54 vs. limit=15.0 2023-10-04 09:02:54,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:02:55,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:02:56,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 09:02:56,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.09 vs. limit=15.0 2023-10-04 09:02:58,040 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 09:02:59,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 09:02:59,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:02:59,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:02,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:03:02,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:03:06,421 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 09:03:06,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 09:03:07,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:03:08,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1599173.3333333333, ans=0.1 2023-10-04 09:03:09,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:03:12,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:03:17,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:03:17,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 09:03:19,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:03:22,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 09:03:22,505 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:03:26,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1599306.6666666667, ans=0.0 2023-10-04 09:03:28,273 INFO [train.py:1046] (3/4) Epoch 46, batch 850, loss[loss=0.1521, simple_loss=0.2323, pruned_loss=0.03596, over 23690.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2341, pruned_loss=0.037, over 4657129.04 frames. ], batch size: 149, lr: 2.22e-03, grad_scale: 16.0 2023-10-04 09:03:28,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:03:30,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1599306.6666666667, ans=0.125 2023-10-04 09:03:31,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:03:31,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 09:03:33,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:03:33,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:03:34,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1599306.6666666667, ans=0.125 2023-10-04 09:03:35,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 09:03:35,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:35,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:03:36,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1599306.6666666667, ans=0.1 2023-10-04 09:03:37,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:03:39,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:03:40,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1599306.6666666667, ans=0.125 2023-10-04 09:03:41,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:03:42,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 09:03:42,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 09:03:42,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 09:03:44,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1599373.3333333333, ans=0.0 2023-10-04 09:03:47,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:03:47,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:03:47,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:03:48,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:03:48,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:03:50,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1599373.3333333333, ans=0.2 2023-10-04 09:03:51,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.03 vs. limit=15.0 2023-10-04 09:03:53,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:53,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:03:54,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 09:03:55,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 09:03:57,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:03:58,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 09:03:59,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1599440.0, ans=0.125 2023-10-04 09:04:01,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1599440.0, ans=0.1 2023-10-04 09:04:04,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 09:04:05,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 09:04:08,587 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 09:04:08,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:04:08,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:04:08,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 09:04:11,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:04:11,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:04:11,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 09:04:14,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:04:14,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:04:15,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:04:16,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:04:17,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:04:20,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 09:04:21,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 09:04:22,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1599506.6666666667, ans=0.125 2023-10-04 09:04:24,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:04:24,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:04:24,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:04:24,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:04:25,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:04:29,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:04:31,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:04:32,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:04:32,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:04:34,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:04:35,621 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.663e+02 2.012e+02 2.248e+02 2.524e+02 3.712e+02, threshold=4.497e+02, percent-clipped=0.0 2023-10-04 09:04:42,570 INFO [train.py:1046] (3/4) Epoch 46, batch 900, loss[loss=0.1575, simple_loss=0.2438, pruned_loss=0.03563, over 24436.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2345, pruned_loss=0.03745, over 4669322.11 frames. ], batch size: 69, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:04:42,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:04:42,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:04:42,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 09:04:44,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:04:44,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:04:47,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 09:04:51,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:04:55,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:04:55,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 09:04:55,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1599706.6666666667, ans=0.1 2023-10-04 09:04:58,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:04:58,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 09:04:59,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 09:05:00,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1599706.6666666667, ans=0.0 2023-10-04 09:05:01,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:05:01,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:05:01,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:05:01,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:05:10,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:05:10,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:05:11,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:05:13,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:05:18,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 09:05:19,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:05:23,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:05:24,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:05:24,955 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 09:05:26,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 09:05:32,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:05:32,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:05:32,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:05:33,730 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.40 vs. limit=22.5 2023-10-04 09:05:37,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:05:37,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:05:40,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 09:05:41,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:05:43,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 09:05:44,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:05:46,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:05:46,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1599906.6666666667, ans=0.2 2023-10-04 09:05:47,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:05:47,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:05:47,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1599906.6666666667, ans=0.125 2023-10-04 09:05:51,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 09:05:51,863 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 09:05:53,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 09:05:53,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 09:05:55,831 INFO [train.py:1046] (3/4) Epoch 46, batch 950, loss[loss=0.1652, simple_loss=0.244, pruned_loss=0.04325, over 23273.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2349, pruned_loss=0.03732, over 4689818.57 frames. ], batch size: 105, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:05:57,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:05:59,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1599973.3333333333, ans=0.125 2023-10-04 09:06:04,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 09:06:07,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:06:08,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1599973.3333333333, ans=0.0 2023-10-04 09:06:10,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:10,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:10,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:06:13,492 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 09:06:18,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:18,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:06:18,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:06:19,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:06:19,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 09:06:19,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:06:21,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:22,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 09:06:23,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:06:25,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1600040.0, ans=0.0 2023-10-04 09:06:27,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:27,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:06:27,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1600106.6666666667, ans=0.1 2023-10-04 09:06:28,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:06:29,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 09:06:32,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 09:06:34,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:06:36,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:06:39,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:06:39,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:06:42,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 09:06:44,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 09:06:44,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:06:45,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:06:46,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:46,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:06:51,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 09:06:51,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:06:54,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:06:54,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:06:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 09:06:54,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:06:54,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:06:55,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 09:06:59,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:07:02,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:07:05,367 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.058e+02 2.335e+02 2.951e+02 5.020e+02, threshold=4.671e+02, percent-clipped=3.0 2023-10-04 09:07:06,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:07:08,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 09:07:08,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 09:07:12,946 INFO [train.py:1046] (3/4) Epoch 46, batch 1000, loss[loss=0.1472, simple_loss=0.2262, pruned_loss=0.03415, over 19857.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2339, pruned_loss=0.03724, over 4687558.62 frames. ], batch size: 43, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:07:13,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:07:17,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 09:07:18,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:07:21,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:07:21,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 09:07:21,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 09:07:28,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:07:28,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:07:28,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1600373.3333333333, ans=0.2 2023-10-04 09:07:30,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:07:33,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 09:07:36,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 09:07:36,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1600373.3333333333, ans=0.07 2023-10-04 09:07:38,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 09:07:38,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:07:38,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 09:07:39,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 09:07:39,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 09:07:41,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:07:42,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:07:47,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1600440.0, ans=0.05 2023-10-04 09:07:48,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1600440.0, ans=0.0 2023-10-04 09:07:51,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:07:51,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:07:53,836 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.56 vs. limit=15.0 2023-10-04 09:07:54,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:07:54,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:07:54,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 09:07:54,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:07:54,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1600440.0, ans=0.0 2023-10-04 09:07:55,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:07:57,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:07:57,256 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 09:08:00,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 09:08:02,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 09:08:02,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 09:08:04,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:08:05,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1600506.6666666667, ans=0.1 2023-10-04 09:08:05,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1600506.6666666667, ans=0.0 2023-10-04 09:08:11,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:11,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:08:11,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:14,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:08:15,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 09:08:16,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:08:17,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 09:08:17,470 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.46 vs. limit=6.0 2023-10-04 09:08:18,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 09:08:19,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:08:19,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:08:22,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:08:24,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:08:24,991 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.57 vs. limit=15.0 2023-10-04 09:08:25,906 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:08:27,229 INFO [train.py:1046] (3/4) Epoch 46, batch 1050, loss[loss=0.1626, simple_loss=0.2577, pruned_loss=0.03379, over 24652.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2333, pruned_loss=0.03687, over 4685173.38 frames. ], batch size: 73, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:08:28,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:08:28,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:08:31,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 09:08:31,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:32,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1600640.0, ans=0.1 2023-10-04 09:08:33,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:08:34,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:08:36,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:08:39,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:08:39,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:08:40,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:08:40,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:08:42,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 09:08:43,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:08:43,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 09:08:47,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:08:47,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 09:08:47,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:08:51,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:08:53,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:08:53,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:08:56,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 09:08:56,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 09:08:56,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:08:58,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1600773.3333333333, ans=0.125 2023-10-04 09:08:58,697 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.35 vs. limit=15.0 2023-10-04 09:09:01,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 09:09:02,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 09:09:04,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:09:07,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 09:09:10,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:09:10,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:09:12,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:09:16,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:09:18,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 09:09:21,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 09:09:21,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 09:09:21,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:09:23,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:09:24,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 09:09:27,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:09:27,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1600906.6666666667, ans=0.2 2023-10-04 09:09:29,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:09:29,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:09:30,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:09:30,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:09:35,600 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 1.988e+02 2.157e+02 2.390e+02 3.023e+02, threshold=4.315e+02, percent-clipped=0.0 2023-10-04 09:09:35,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:09:37,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 09:09:38,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:09:38,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 09:09:39,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 09:09:39,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:09:41,062 INFO [train.py:1046] (3/4) Epoch 46, batch 1100, loss[loss=0.1582, simple_loss=0.2461, pruned_loss=0.03519, over 23972.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2327, pruned_loss=0.03704, over 4681078.58 frames. ], batch size: 80, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:09:43,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:09:48,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:09:50,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1600973.3333333333, ans=0.125 2023-10-04 09:09:54,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:09:55,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:09:55,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:09:55,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 09:09:57,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:10:00,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 09:10:02,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:10:06,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:10:06,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 09:10:08,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:10:08,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:10:08,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:10:09,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1601106.6666666667, ans=0.0 2023-10-04 09:10:10,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:10:12,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:10:12,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1601106.6666666667, ans=0.0 2023-10-04 09:10:18,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:10:19,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 09:10:21,197 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 09:10:21,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:22,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:24,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:10:24,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:10:26,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 09:10:26,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:10:26,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:10:26,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:10:28,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:28,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 09:10:34,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:10:34,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 09:10:36,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:10:40,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:10:42,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 09:10:42,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:10:43,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:10:48,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:10:48,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:10:49,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 09:10:50,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:10:50,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:10:51,745 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.31 vs. limit=6.0 2023-10-04 09:10:52,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 09:10:52,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:10:53,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 09:10:53,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1601306.6666666667, ans=0.125 2023-10-04 09:10:54,822 INFO [train.py:1046] (3/4) Epoch 46, batch 1150, loss[loss=0.1629, simple_loss=0.2406, pruned_loss=0.04256, over 23799.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2335, pruned_loss=0.03702, over 4689811.44 frames. ], batch size: 164, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:10:54,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:10:54,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:10:55,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:10:59,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:11:01,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:11:04,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:11:04,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:11:04,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 09:11:06,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:11:06,724 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.79 vs. limit=15.0 2023-10-04 09:11:08,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 09:11:09,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1601373.3333333333, ans=0.0 2023-10-04 09:11:10,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:11:10,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1601373.3333333333, ans=0.125 2023-10-04 09:11:11,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:11:15,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 09:11:17,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:11:21,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:11:21,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:11:22,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 09:11:22,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:11:22,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:11:27,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 09:11:27,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:11:29,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:11:36,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1601440.0, ans=0.125 2023-10-04 09:11:38,576 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=15.0 2023-10-04 09:11:40,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:11:40,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1601506.6666666667, ans=0.0 2023-10-04 09:11:45,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:11:45,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 09:11:46,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:11:47,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:11:50,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1601506.6666666667, ans=0.2 2023-10-04 09:11:50,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1601506.6666666667, ans=0.0 2023-10-04 09:11:53,495 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 09:11:54,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:12:00,504 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 09:12:01,744 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.018e+02 2.194e+02 2.403e+02 3.349e+02, threshold=4.388e+02, percent-clipped=0.0 2023-10-04 09:12:05,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:06,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:12:06,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:12:06,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:12:08,531 INFO [train.py:1046] (3/4) Epoch 46, batch 1200, loss[loss=0.1556, simple_loss=0.2466, pruned_loss=0.03234, over 24547.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2348, pruned_loss=0.03765, over 4680163.33 frames. ], batch size: 71, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:12:11,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:12:15,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:12:15,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:12:16,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:12:16,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:16,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:12:19,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:12:21,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:12:21,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:12:22,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:12:24,447 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 09:12:27,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 09:12:28,887 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:12:30,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:12:32,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:12:34,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:12:37,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:12:37,846 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 09:12:37,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:42,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1601773.3333333333, ans=0.0 2023-10-04 09:12:45,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:12:45,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:12:45,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 09:12:45,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:12:48,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 09:12:49,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1601773.3333333333, ans=0.1 2023-10-04 09:12:52,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 09:12:52,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:12:53,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:12:54,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:12:55,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:12:55,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:12:55,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:12:57,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:12:58,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 09:12:58,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:12:58,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:12:58,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:13:01,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:13:01,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:13:04,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:13:07,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:13:09,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 09:13:13,800 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 09:13:15,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:13:16,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:13:17,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:13:19,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:13:21,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 09:13:21,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1601973.3333333333, ans=0.125 2023-10-04 09:13:22,539 INFO [train.py:1046] (3/4) Epoch 46, batch 1250, loss[loss=0.1468, simple_loss=0.2276, pruned_loss=0.03303, over 23594.00 frames. ], tot_loss[loss=0.156, simple_loss=0.2357, pruned_loss=0.03818, over 4670350.08 frames. ], batch size: 232, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:13:25,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:13:26,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1601973.3333333333, ans=0.0 2023-10-04 09:13:27,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:13:27,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 09:13:29,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:13:31,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:13:35,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 09:13:35,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:13:37,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:13:37,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:13:40,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:13:44,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 09:13:44,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:13:44,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:13:46,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:13:47,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:13:51,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:13:52,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:13:56,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 09:13:56,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1602106.6666666667, ans=0.125 2023-10-04 09:13:57,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:13:59,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:14:00,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 09:14:00,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:14:00,989 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 09:14:02,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:02,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:06,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:14:09,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:14:11,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:14:13,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 09:14:13,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 09:14:13,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 09:14:15,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:14:15,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1602173.3333333333, ans=0.0 2023-10-04 09:14:18,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 09:14:18,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:21,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 09:14:21,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:14:22,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 09:14:22,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 09:14:23,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:14:25,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:14:25,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:14:26,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 09:14:28,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:14:29,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:14:31,292 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 2.091e+02 2.329e+02 2.674e+02 3.922e+02, threshold=4.659e+02, percent-clipped=0.0 2023-10-04 09:14:31,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:14:34,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:14:35,890 INFO [train.py:1046] (3/4) Epoch 46, batch 1300, loss[loss=0.1574, simple_loss=0.2439, pruned_loss=0.03547, over 23462.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2358, pruned_loss=0.03762, over 4688552.20 frames. ], batch size: 93, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:14:36,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:14:36,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 09:14:41,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:14:43,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:14:45,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:14:46,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:14:46,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:14:46,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1602306.6666666667, ans=0.0 2023-10-04 09:14:48,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 09:14:54,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:14:55,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:14:56,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1602373.3333333333, ans=0.125 2023-10-04 09:14:58,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 09:14:59,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:15:02,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1602373.3333333333, ans=0.125 2023-10-04 09:15:03,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:15:04,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:15:05,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:15:07,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:15:08,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:15:09,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:15:10,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 09:15:12,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1602440.0, ans=0.0 2023-10-04 09:15:16,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:15:16,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:15:18,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 09:15:18,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:15:19,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:15:21,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:15:22,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 09:15:23,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:15:23,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 09:15:25,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:15:25,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1602506.6666666667, ans=0.125 2023-10-04 09:15:29,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:15:29,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:15:31,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1602506.6666666667, ans=0.0 2023-10-04 09:15:32,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 09:15:32,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.59 vs. limit=15.0 2023-10-04 09:15:33,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 09:15:34,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1602573.3333333333, ans=0.125 2023-10-04 09:15:35,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 09:15:39,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:15:43,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 09:15:44,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:15:44,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1602573.3333333333, ans=0.2 2023-10-04 09:15:49,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1602640.0, ans=0.0 2023-10-04 09:15:50,506 INFO [train.py:1046] (3/4) Epoch 46, batch 1350, loss[loss=0.1476, simple_loss=0.2083, pruned_loss=0.04347, over 19520.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2354, pruned_loss=0.03723, over 4689466.39 frames. ], batch size: 389, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:15:52,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 09:15:54,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:15:58,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:00,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:16:00,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:16:02,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:16:02,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:16:05,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:16:05,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 09:16:05,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1602706.6666666667, ans=0.1 2023-10-04 09:16:08,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:16:10,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:16:13,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 09:16:14,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:16:14,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:16:14,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 09:16:15,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 09:16:17,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 09:16:19,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:19,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 09:16:32,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:40,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:16:40,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:16:42,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 09:16:43,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:16:43,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1602840.0, ans=0.0 2023-10-04 09:16:46,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 09:16:46,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:16:46,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:16:47,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:16:51,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 09:16:53,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:16:59,674 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 2.063e+02 2.245e+02 2.755e+02 4.029e+02, threshold=4.491e+02, percent-clipped=0.0 2023-10-04 09:16:59,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 09:17:01,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 09:17:02,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1602973.3333333333, ans=0.2 2023-10-04 09:17:03,915 INFO [train.py:1046] (3/4) Epoch 46, batch 1400, loss[loss=0.1454, simple_loss=0.2304, pruned_loss=0.03025, over 24320.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2339, pruned_loss=0.03682, over 4687965.75 frames. ], batch size: 61, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:17:05,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 09:17:08,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:17:11,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:17:11,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:17:17,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 09:17:17,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 09:17:26,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:17:29,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:17:29,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:17:31,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:17:34,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:17:35,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 09:17:44,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:17:44,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:17:49,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 09:17:50,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:17:52,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:17:52,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:17:52,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:17:52,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1603173.3333333333, ans=0.125 2023-10-04 09:17:53,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:17:53,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:17:53,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:17:55,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 09:17:55,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:17:58,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:01,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:18:10,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 09:18:11,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 09:18:12,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:18:14,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 09:18:15,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:18:17,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:18:19,020 INFO [train.py:1046] (3/4) Epoch 46, batch 1450, loss[loss=0.1612, simple_loss=0.2406, pruned_loss=0.0409, over 23225.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2331, pruned_loss=0.03676, over 4671133.31 frames. ], batch size: 119, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:18:19,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:18:20,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:18:20,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:22,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 09:18:26,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:18:26,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:18:28,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:18:29,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 09:18:31,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:18:32,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 09:18:32,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:32,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:32,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 09:18:33,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.00 vs. limit=6.0 2023-10-04 09:18:34,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:18:34,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:18:35,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 09:18:35,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:37,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:18:38,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:40,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:43,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:18:43,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:18:44,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:18:45,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:48,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:18:48,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:18:49,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:18:49,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:18:50,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1603440.0, ans=0.0 2023-10-04 09:18:53,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1603440.0, ans=0.5 2023-10-04 09:18:54,242 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.43 vs. limit=15.0 2023-10-04 09:18:54,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 09:18:57,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:19:00,812 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 09:19:00,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:19:02,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:19:02,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:19:03,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 09:19:08,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:19:08,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1603506.6666666667, ans=0.0 2023-10-04 09:19:09,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 09:19:09,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1603506.6666666667, ans=0.0 2023-10-04 09:19:11,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 09:19:11,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1603506.6666666667, ans=0.09899494936611666 2023-10-04 09:19:14,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:19:18,963 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.36 vs. limit=22.5 2023-10-04 09:19:19,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:19:19,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:19:22,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 09:19:25,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 09:19:25,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 09:19:27,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:19:27,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1603573.3333333333, ans=0.125 2023-10-04 09:19:28,537 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.040e+02 2.275e+02 2.758e+02 4.535e+02, threshold=4.550e+02, percent-clipped=1.0 2023-10-04 09:19:28,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:19:33,288 INFO [train.py:1046] (3/4) Epoch 46, batch 1500, loss[loss=0.1516, simple_loss=0.2287, pruned_loss=0.03728, over 23360.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2337, pruned_loss=0.03667, over 4689820.78 frames. ], batch size: 119, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:19:36,608 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.80 vs. limit=15.0 2023-10-04 09:19:37,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 09:19:38,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:19:38,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:19:40,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:19:40,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:19:41,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:19:42,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 09:19:43,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:19:43,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:19:43,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:19:44,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:19:46,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:19:48,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:19:50,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=1603706.6666666667, ans=10.0 2023-10-04 09:19:50,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1603706.6666666667, ans=0.125 2023-10-04 09:19:51,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:19:51,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 09:19:52,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:19:52,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:19:52,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:19:57,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 09:20:01,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 09:20:03,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:20:03,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 09:20:06,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:20:09,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:20:09,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:20:09,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:20:09,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=1603773.3333333333, ans=0.07 2023-10-04 09:20:12,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 09:20:12,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:20:12,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:20:13,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 09:20:13,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:20:18,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:20:18,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 09:20:23,247 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.84 vs. limit=15.0 2023-10-04 09:20:23,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:20:25,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:20:27,440 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=12.0 2023-10-04 09:20:29,862 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 09:20:29,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:29,911 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 09:20:30,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:20:31,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:20:32,837 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 09:20:34,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:20:36,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 09:20:37,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:40,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:20:40,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:40,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:20:42,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:20:42,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:20:44,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 09:20:44,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 09:20:45,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:20:46,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 09:20:46,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 09:20:48,080 INFO [train.py:1046] (3/4) Epoch 46, batch 1550, loss[loss=0.1483, simple_loss=0.2283, pruned_loss=0.03413, over 23702.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2344, pruned_loss=0.03704, over 4698859.09 frames. ], batch size: 232, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:20:49,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:20:50,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:20:50,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:20:52,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:20:53,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:20:53,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:20:57,113 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 09:20:57,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:20:57,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:20:58,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:21:00,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:21:00,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 09:21:02,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:21:02,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 09:21:04,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 09:21:04,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 09:21:04,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:05,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:21:09,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:21:11,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 09:21:11,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 09:21:20,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:21:24,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:21:24,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:21:25,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:21:26,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 09:21:31,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:21:32,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:35,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:21:38,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:21:38,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:21:38,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 09:21:38,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:21:41,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:21:42,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:43,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 09:21:43,402 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 09:21:43,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1604173.3333333333, ans=0.0 2023-10-04 09:21:44,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:21:50,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 09:21:51,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.91 vs. limit=12.0 2023-10-04 09:21:56,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:21:57,460 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.703e+02 2.083e+02 2.296e+02 2.597e+02 3.892e+02, threshold=4.592e+02, percent-clipped=0.0 2023-10-04 09:21:57,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:21:57,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 09:21:59,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:21:59,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1604240.0, ans=0.0 2023-10-04 09:22:00,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:22:00,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:22:00,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:22:00,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:22:02,100 INFO [train.py:1046] (3/4) Epoch 46, batch 1600, loss[loss=0.1405, simple_loss=0.217, pruned_loss=0.03201, over 23621.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2346, pruned_loss=0.03721, over 4704093.08 frames. ], batch size: 149, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:22:04,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:04,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 09:22:06,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 09:22:06,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 09:22:09,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:22:11,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 09:22:12,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:22:15,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:22:18,352 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.21 vs. limit=15.0 2023-10-04 09:22:18,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:22:20,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1604373.3333333333, ans=0.125 2023-10-04 09:22:21,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 09:22:25,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:22:25,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 09:22:25,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:27,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 09:22:31,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 09:22:37,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:22:39,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 09:22:40,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:22:41,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:22:41,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:22:44,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 09:22:45,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1604506.6666666667, ans=0.2 2023-10-04 09:22:48,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 09:22:51,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:22:51,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:52,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:22:52,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:22:55,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:22:55,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:22:58,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:23:03,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:23:03,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:23:07,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 09:23:07,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:23:07,124 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 09:23:12,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:23:14,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:23:15,855 INFO [train.py:1046] (3/4) Epoch 46, batch 1650, loss[loss=0.1667, simple_loss=0.2479, pruned_loss=0.04273, over 23948.00 frames. ], tot_loss[loss=0.1559, simple_loss=0.2359, pruned_loss=0.03796, over 4695833.90 frames. ], batch size: 80, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:23:15,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:23:15,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 09:23:15,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 09:23:15,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 09:23:17,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 09:23:22,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:23:22,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:23:22,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:23:22,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:23:23,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:23:24,208 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.78 vs. limit=15.0 2023-10-04 09:23:26,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 09:23:29,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:23:29,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:23:29,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:23:29,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:23:29,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 09:23:30,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 09:23:33,990 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=8.17 vs. limit=12.0 2023-10-04 09:23:34,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:23:38,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:23:42,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1604706.6666666667, ans=0.125 2023-10-04 09:23:44,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 09:23:46,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:23:48,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 09:23:51,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:23:54,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:23:54,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:23:54,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:23:55,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:23:55,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:23:59,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:24:00,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:24:00,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:24:00,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1604840.0, ans=0.2 2023-10-04 09:24:01,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:24:01,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:24:01,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:24:04,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:24:07,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 09:24:08,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:24:09,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 09:24:09,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1604840.0, ans=0.015 2023-10-04 09:24:11,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 09:24:11,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 09:24:11,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:24:13,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:24:13,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:24:13,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:24:13,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 09:24:14,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1604906.6666666667, ans=0.125 2023-10-04 09:24:18,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:24:19,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:24:19,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:24:22,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 09:24:23,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.92 vs. limit=15.0 2023-10-04 09:24:25,065 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.077e+02 2.252e+02 2.648e+02 5.011e+02, threshold=4.504e+02, percent-clipped=3.0 2023-10-04 09:24:25,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1604906.6666666667, ans=0.125 2023-10-04 09:24:26,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:24:26,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:24:26,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 09:24:26,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:24:26,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:24:26,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:24:29,299 INFO [train.py:1046] (3/4) Epoch 46, batch 1700, loss[loss=0.1628, simple_loss=0.254, pruned_loss=0.03578, over 24302.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2356, pruned_loss=0.03766, over 4698040.22 frames. ], batch size: 74, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:24:29,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:24:29,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:24:29,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 09:24:29,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1604973.3333333333, ans=0.0 2023-10-04 09:24:30,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:24:38,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:24:41,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:24:43,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1605040.0, ans=0.125 2023-10-04 09:24:47,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1605040.0, ans=0.2 2023-10-04 09:24:48,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:24:48,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:24:49,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:24:49,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:24:51,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=1605040.0, ans=6.0 2023-10-04 09:24:52,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 09:24:53,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:24:54,304 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.00 vs. limit=22.5 2023-10-04 09:24:55,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:24:56,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:24:57,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:24:59,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 09:25:00,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 09:25:02,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1605106.6666666667, ans=0.1 2023-10-04 09:25:03,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:04,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 09:25:06,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:25:14,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:25:16,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:16,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:25:19,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 09:25:19,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 09:25:20,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:25:22,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:22,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 09:25:22,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:25:22,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:25:22,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:22,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:25:25,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:25:25,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:25:27,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:28,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:25:28,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:25:32,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:25:33,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 09:25:35,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:25:36,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:25:39,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 09:25:42,512 INFO [train.py:1046] (3/4) Epoch 46, batch 1750, loss[loss=0.1465, simple_loss=0.2168, pruned_loss=0.03811, over 23433.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2346, pruned_loss=0.03695, over 4717384.80 frames. ], batch size: 285, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:25:43,270 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.92 vs. limit=22.5 2023-10-04 09:25:47,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:50,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:25:50,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:25:52,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 09:25:52,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:25:54,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:25:54,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:25:57,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 09:25:59,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:26:01,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 09:26:01,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:26:02,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:26:05,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 09:26:08,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 09:26:09,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:26:09,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 09:26:17,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:26:22,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:26:22,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:26:24,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:26:24,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:26:26,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:26:27,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:26:30,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:26:31,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:26:32,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 09:26:33,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1605506.6666666667, ans=0.125 2023-10-04 09:26:34,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:26:36,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 09:26:37,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:26:37,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1605506.6666666667, ans=0.0 2023-10-04 09:26:39,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:26:40,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:26:43,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:26:44,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 09:26:45,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:26:45,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1605573.3333333333, ans=0.2 2023-10-04 09:26:47,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:26:53,138 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.053e+02 2.343e+02 2.960e+02 5.357e+02, threshold=4.686e+02, percent-clipped=4.0 2023-10-04 09:26:53,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:26:55,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:26:56,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:26:57,940 INFO [train.py:1046] (3/4) Epoch 46, batch 1800, loss[loss=0.1582, simple_loss=0.2463, pruned_loss=0.03507, over 24664.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2339, pruned_loss=0.03652, over 4706930.70 frames. ], batch size: 65, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:26:58,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 09:26:58,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:26:58,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:26:58,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:26:58,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:26:59,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:26:59,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:27:01,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1605640.0, ans=0.125 2023-10-04 09:27:02,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:27:02,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:27:04,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:27:06,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:27:09,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:27:09,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:27:12,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:27:15,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:27:15,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:27:17,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:27:20,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:27:20,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 09:27:20,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:27:21,777 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=3.014e-03 2023-10-04 09:27:23,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:27:28,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 09:27:30,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 09:27:30,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 09:27:31,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:27:31,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:27:31,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:27:32,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:27:38,094 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 09:27:39,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:27:41,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:27:42,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 09:27:43,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 09:27:44,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:27:45,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:27:45,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:27:51,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 09:27:58,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:27:58,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 09:27:59,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:27:59,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:28:00,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:28:00,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 09:28:05,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:28:05,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:28:06,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 09:28:06,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:28:09,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:28:10,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:28:10,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:28:11,945 INFO [train.py:1046] (3/4) Epoch 46, batch 1850, loss[loss=0.1709, simple_loss=0.2431, pruned_loss=0.04932, over 23731.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2344, pruned_loss=0.03668, over 4712887.08 frames. ], batch size: 232, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:28:12,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:28:12,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:28:14,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:28:14,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:28:14,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1605973.3333333333, ans=10.0 2023-10-04 09:28:15,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:28:16,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:28:23,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:28:23,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 09:28:27,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 09:28:30,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 09:28:32,497 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.32 vs. limit=10.0 2023-10-04 09:28:33,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:28:33,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 09:28:33,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 09:28:37,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1606040.0, ans=0.125 2023-10-04 09:28:38,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1606040.0, ans=0.0 2023-10-04 09:28:43,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:28:46,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 09:28:50,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:28:50,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:28:53,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1606106.6666666667, ans=0.125 2023-10-04 09:28:55,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 09:28:55,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:28:55,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:28:58,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:29:00,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:29:01,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:29:04,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:29:05,242 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.50 vs. limit=10.0 2023-10-04 09:29:06,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:06,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 09:29:06,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:29:07,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:29:08,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:29:09,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1606173.3333333333, ans=0.125 2023-10-04 09:29:11,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 09:29:13,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:29:14,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:29:16,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:29:16,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 09:29:16,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 09:29:19,098 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 09:29:19,173 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 09:29:20,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:29:20,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:29:20,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:29:20,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:20,744 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 09:29:21,808 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 1.971e+02 2.198e+02 2.495e+02 3.601e+02, threshold=4.397e+02, percent-clipped=0.0 2023-10-04 09:29:21,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:29:21,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:22,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1606240.0, ans=0.1 2023-10-04 09:29:23,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:29:23,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:29:26,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:29:26,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 09:29:27,246 INFO [train.py:1046] (3/4) Epoch 46, batch 1900, loss[loss=0.1747, simple_loss=0.2476, pruned_loss=0.05087, over 23796.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2354, pruned_loss=0.03738, over 4718685.07 frames. ], batch size: 164, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:29:28,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:29:28,714 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 09:29:28,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:29:30,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:29:34,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:29:36,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:29:36,266 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 09:29:37,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 09:29:38,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:29:39,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:29:39,053 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 09:29:39,079 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 09:29:39,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1606306.6666666667, ans=0.125 2023-10-04 09:29:42,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 09:29:44,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:29:49,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 09:29:52,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 09:29:55,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1606440.0, ans=0.1 2023-10-04 09:30:03,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 09:30:04,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 09:30:04,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:06,059 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 09:30:06,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 09:30:06,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 09:30:07,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 09:30:07,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:30:11,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 09:30:13,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:30:16,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:30:16,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 09:30:17,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:30:19,826 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.60 vs. limit=15.0 2023-10-04 09:30:23,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 09:30:23,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:30:29,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:30:29,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:30:29,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:30:29,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:30:31,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:30:31,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1606573.3333333333, ans=0.1 2023-10-04 09:30:32,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:30:32,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:30:35,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:30:35,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:30:37,314 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 09:30:38,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:30:38,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:30:39,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:30:41,085 INFO [train.py:1046] (3/4) Epoch 46, batch 1950, loss[loss=0.1594, simple_loss=0.2424, pruned_loss=0.03817, over 23690.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2362, pruned_loss=0.03748, over 4722736.58 frames. ], batch size: 85, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:30:41,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:30:43,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:30:45,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:30:45,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:46,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:30:48,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 09:30:50,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 09:30:51,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:51,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:30:54,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:30:56,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:30:56,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:30:57,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:31:00,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:31:00,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:31:00,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:31:01,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:05,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:08,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:31:08,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:08,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:31:08,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 09:31:08,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:31:09,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:31:09,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:31:13,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:16,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:31:18,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:31:21,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:31:21,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:31:22,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 09:31:23,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:31:27,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:31:28,051 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=12.0 2023-10-04 09:31:28,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:31:30,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:31:30,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1606840.0, ans=0.1 2023-10-04 09:31:38,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:38,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1606840.0, ans=0.125 2023-10-04 09:31:39,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:42,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:31:43,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:31:44,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1606906.6666666667, ans=0.125 2023-10-04 09:31:46,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:31:46,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:31:46,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 09:31:46,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:31:47,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:31:49,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 09:31:50,519 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.065e+02 2.286e+02 2.732e+02 4.457e+02, threshold=4.573e+02, percent-clipped=1.0 2023-10-04 09:31:50,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:31:55,472 INFO [train.py:1046] (3/4) Epoch 46, batch 2000, loss[loss=0.1511, simple_loss=0.2193, pruned_loss=0.04141, over 23491.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2358, pruned_loss=0.0373, over 4727618.03 frames. ], batch size: 256, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:31:55,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:31:55,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:31:56,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:31:58,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:31:59,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:02,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 09:32:02,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:32:06,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:32:07,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 09:32:08,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:32:10,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:32:11,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:32:12,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 09:32:14,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:15,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1607040.0, ans=0.1 2023-10-04 09:32:17,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:17,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:17,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 09:32:18,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:32:19,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 09:32:19,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:32:23,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:32:24,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1607106.6666666667, ans=0.125 2023-10-04 09:32:25,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 09:32:25,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:25,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:32:26,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:32:26,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1607106.6666666667, ans=0.125 2023-10-04 09:32:27,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 09:32:30,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 09:32:30,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:32:30,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:32:34,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1607106.6666666667, ans=0.07 2023-10-04 09:32:38,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:39,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:32:39,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:32:41,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:32:42,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:32:42,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:42,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:32:42,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:32:44,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:32:48,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:32:48,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 09:32:52,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:32:53,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:32:59,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:32:59,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:33:02,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:04,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:33:04,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:04,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:33:04,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1607240.0, ans=0.07 2023-10-04 09:33:05,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:33:08,411 INFO [train.py:1046] (3/4) Epoch 46, batch 2050, loss[loss=0.1342, simple_loss=0.2037, pruned_loss=0.03229, over 23722.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2343, pruned_loss=0.03683, over 4731944.56 frames. ], batch size: 232, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:33:08,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:33:10,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:11,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1607306.6666666667, ans=0.125 2023-10-04 09:33:12,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:33:14,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:14,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.45 vs. limit=15.0 2023-10-04 09:33:18,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:33:20,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:33:20,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:33:21,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:33:22,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 09:33:22,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:33:24,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:33:25,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:33:27,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1607373.3333333333, ans=0.125 2023-10-04 09:33:33,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.72 vs. limit=15.0 2023-10-04 09:33:37,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:33:37,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:33:39,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 09:33:41,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:33:43,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 09:33:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:33:44,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:33:46,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:33:47,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:33:47,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:33:50,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:33:52,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:33:52,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:33:54,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:33:55,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:33:59,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:33:59,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:34:03,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:34:08,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:34:09,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 09:34:15,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:34:16,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:34:18,172 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.010e+02 2.178e+02 2.574e+02 3.822e+02, threshold=4.355e+02, percent-clipped=0.0 2023-10-04 09:34:18,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:34:19,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 09:34:22,381 INFO [train.py:1046] (3/4) Epoch 46, batch 2100, loss[loss=0.1357, simple_loss=0.1872, pruned_loss=0.04205, over 18836.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2336, pruned_loss=0.03669, over 4726936.65 frames. ], batch size: 389, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:34:23,828 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 09:34:23,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:34:25,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:34:25,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:34:26,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:34:26,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 09:34:26,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 09:34:29,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:34:31,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:34:33,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:34:35,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:34:36,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:34:36,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 09:34:37,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:34:37,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 09:34:37,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 09:34:40,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:34:40,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:34:40,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 09:34:42,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 09:34:46,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 09:34:46,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:34:46,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1607706.6666666667, ans=0.2 2023-10-04 09:34:49,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:34:50,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:34:53,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:34:54,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 09:34:54,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:34:54,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 09:34:58,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 09:34:58,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:34:58,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 09:34:58,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 09:34:58,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 09:35:00,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:35:02,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:35:05,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:35:05,995 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.77 vs. limit=6.0 2023-10-04 09:35:06,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:35:07,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:10,591 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:35:10,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 09:35:10,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:35:10,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:35:12,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:12,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 09:35:13,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 09:35:13,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 09:35:18,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:35:20,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:35:20,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 09:35:22,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1607906.6666666667, ans=0.125 2023-10-04 09:35:26,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:35:29,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:35:29,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:35:29,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:35:29,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 09:35:31,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:35:32,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:35:32,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:35:33,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:35:33,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:35,172 INFO [train.py:1046] (3/4) Epoch 46, batch 2150, loss[loss=0.1512, simple_loss=0.2302, pruned_loss=0.03607, over 23421.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.233, pruned_loss=0.03629, over 4717584.26 frames. ], batch size: 285, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:35:35,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 09:35:36,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 09:35:36,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:35:39,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:35:39,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:35:41,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:35:41,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:35:42,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1607973.3333333333, ans=0.0 2023-10-04 09:35:45,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 09:35:48,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:35:49,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:50,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1608040.0, ans=0.125 2023-10-04 09:35:51,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:35:51,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:35:51,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:35:52,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1608040.0, ans=0.125 2023-10-04 09:35:54,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:35:54,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:35:54,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:35:58,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:35:58,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 09:36:03,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:36:06,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:36:06,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1608106.6666666667, ans=0.0 2023-10-04 09:36:07,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:07,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:36:08,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:08,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:36:08,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:36:08,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:36:10,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:36:10,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 09:36:13,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:36:14,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:36:14,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:36:15,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:36:17,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:36:19,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:36:19,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:36:19,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1608173.3333333333, ans=0.125 2023-10-04 09:36:21,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:36:21,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 09:36:21,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 09:36:23,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:36:25,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:26,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:36:26,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:36:27,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:29,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:29,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 09:36:31,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 09:36:31,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:36:33,271 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 09:36:33,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:34,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:36:34,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 09:36:34,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:36:34,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 09:36:36,198 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 09:36:36,198 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 09:36:36,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 09:36:37,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:38,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:36:38,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:36:39,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:40,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:36:40,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:36:40,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:48,295 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.972e+02 2.248e+02 2.510e+02 3.914e+02, threshold=4.495e+02, percent-clipped=0.0 2023-10-04 09:36:48,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:36:48,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 09:36:49,744 INFO [train.py:1046] (3/4) Epoch 46, batch 2200, loss[loss=0.1395, simple_loss=0.2191, pruned_loss=0.02994, over 24444.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2328, pruned_loss=0.03595, over 4728462.54 frames. ], batch size: 58, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:36:52,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:36:55,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:36:56,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:36:56,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:36:56,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1608306.6666666667, ans=0.2 2023-10-04 09:36:58,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:37:00,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:37:02,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:37:02,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 09:37:06,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 09:37:08,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:37:15,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 09:37:18,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:37:19,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:37:19,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:37:25,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:37:25,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 09:37:28,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1608440.0, ans=0.125 2023-10-04 09:37:29,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:37:30,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:37:31,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 09:37:34,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:37:34,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1608506.6666666667, ans=0.1 2023-10-04 09:37:35,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:37:36,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:37:39,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:37:40,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 09:37:40,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:37:42,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 09:37:45,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:37:45,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 09:37:45,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:37:47,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:37:47,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:37:47,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:37:47,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:37:47,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1608573.3333333333, ans=0.1 2023-10-04 09:37:48,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:37:50,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:37:51,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:37:54,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 09:37:54,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:37:55,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:37:57,350 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 09:38:00,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:38:00,104 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 09:38:02,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:38:02,581 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 09:38:03,652 INFO [train.py:1046] (3/4) Epoch 46, batch 2250, loss[loss=0.161, simple_loss=0.2326, pruned_loss=0.04465, over 23654.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2336, pruned_loss=0.0361, over 4735809.44 frames. ], batch size: 256, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:38:03,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:38:05,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:38:06,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:38:07,973 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 09:38:09,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1608640.0, ans=0.1 2023-10-04 09:38:10,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:38:13,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:38:18,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:38:19,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:38:23,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:38:23,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:38:25,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:38:26,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 09:38:26,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:38:26,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:38:28,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 09:38:29,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:38:29,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:38:29,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1608706.6666666667, ans=0.0 2023-10-04 09:38:32,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:38:37,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:38:38,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 09:38:39,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:38:40,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 09:38:41,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:38:43,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:38:43,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1608773.3333333333, ans=0.125 2023-10-04 09:38:49,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:38:50,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:38:52,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:38:52,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:38:53,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:38:55,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:38:56,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:38:59,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 09:39:03,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1608906.6666666667, ans=0.125 2023-10-04 09:39:05,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:39:05,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:39:05,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:39:12,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 09:39:15,079 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.142e+02 2.478e+02 2.835e+02 4.262e+02, threshold=4.957e+02, percent-clipped=0.0 2023-10-04 09:39:15,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:39:15,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 09:39:15,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:39:15,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:39:17,283 INFO [train.py:1046] (3/4) Epoch 46, batch 2300, loss[loss=0.1749, simple_loss=0.2503, pruned_loss=0.04975, over 23620.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2345, pruned_loss=0.03613, over 4738416.62 frames. ], batch size: 256, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:39:18,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 09:39:21,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:39:21,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:39:24,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1608973.3333333333, ans=0.0 2023-10-04 09:39:27,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:39:27,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:39:28,699 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 09:39:30,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:39:37,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:39:37,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:39:39,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:39:39,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:39:39,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 09:39:39,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1609040.0, ans=0.125 2023-10-04 09:39:40,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:39:43,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:39:43,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1609040.0, ans=0.1 2023-10-04 09:39:44,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:39:48,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:39:52,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:39:54,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:39:58,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:39:59,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:40:01,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:40:04,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:40:07,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:40:08,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:40:08,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:40:09,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 09:40:09,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1609173.3333333333, ans=0.125 2023-10-04 09:40:12,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 09:40:12,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:40:12,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:40:12,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:40:13,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:40:13,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 09:40:13,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 09:40:15,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 09:40:15,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:40:15,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:40:15,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 09:40:23,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:40:27,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:40:27,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1609240.0, ans=0.1 2023-10-04 09:40:28,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1609240.0, ans=0.125 2023-10-04 09:40:31,426 INFO [train.py:1046] (3/4) Epoch 46, batch 2350, loss[loss=0.1467, simple_loss=0.2267, pruned_loss=0.03334, over 24334.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2348, pruned_loss=0.03597, over 4739861.07 frames. ], batch size: 61, lr: 2.21e-03, grad_scale: 8.0 2023-10-04 09:40:31,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:40:31,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:40:31,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:40:33,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:40:33,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:40:34,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:40:35,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 09:40:36,685 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.00 vs. limit=15.0 2023-10-04 09:40:40,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1609306.6666666667, ans=0.125 2023-10-04 09:40:42,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:40:42,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 09:40:46,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 09:40:48,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:40:50,499 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.66 vs. limit=15.0 2023-10-04 09:40:51,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:40:51,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:40:51,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:40:51,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:40:52,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 09:40:56,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:41:02,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 09:41:02,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:41:07,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:41:07,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:41:09,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:41:10,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 09:41:11,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:41:12,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:41:12,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:41:12,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:41:17,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:41:17,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1609506.6666666667, ans=0.025 2023-10-04 09:41:20,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 09:41:20,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:41:21,651 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.11 vs. limit=15.0 2023-10-04 09:41:23,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:41:23,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:41:24,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 09:41:26,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:41:27,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 09:41:27,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:41:32,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 09:41:35,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 09:41:35,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:41:35,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 09:41:35,366 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 09:41:35,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 09:41:39,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 09:41:39,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1609573.3333333333, ans=0.1 2023-10-04 09:41:42,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:41:43,487 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.219e+02 2.483e+02 2.970e+02 4.725e+02, threshold=4.966e+02, percent-clipped=0.0 2023-10-04 09:41:44,913 INFO [train.py:1046] (3/4) Epoch 46, batch 2400, loss[loss=0.1569, simple_loss=0.245, pruned_loss=0.03443, over 24338.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.235, pruned_loss=0.03616, over 4726853.96 frames. ], batch size: 74, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:41:45,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1609640.0, ans=0.125 2023-10-04 09:41:46,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:41:49,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:41:50,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:41:50,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 09:41:50,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 09:41:57,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 09:41:57,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:41:59,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 09:41:59,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:42:00,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:00,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 09:42:06,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:08,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 09:42:14,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:42:20,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 09:42:22,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:42:22,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:24,441 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.21 vs. limit=15.0 2023-10-04 09:42:26,242 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.89 vs. limit=15.0 2023-10-04 09:42:26,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:42:27,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 09:42:27,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 09:42:32,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1609840.0, ans=0.0 2023-10-04 09:42:33,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:42:36,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:42:38,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:42:39,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:42:39,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 09:42:39,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:42:39,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:42:39,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:42:40,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 09:42:43,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:42:44,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:42:44,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 09:42:46,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 09:42:47,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:42:49,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:42:49,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 09:42:51,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 09:42:51,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 09:42:51,749 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 09:42:53,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 09:42:54,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:42:55,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:55,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:42:56,053 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 09:42:57,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:42:57,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 09:43:00,358 INFO [train.py:1046] (3/4) Epoch 46, batch 2450, loss[loss=0.1637, simple_loss=0.2525, pruned_loss=0.03745, over 24657.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2338, pruned_loss=0.03576, over 4728998.18 frames. ], batch size: 73, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:43:00,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:43:00,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:43:00,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1609973.3333333333, ans=0.1 2023-10-04 09:43:02,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1609973.3333333333, ans=0.125 2023-10-04 09:43:05,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:05,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:43:05,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 09:43:08,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1609973.3333333333, ans=0.125 2023-10-04 09:43:09,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:43:09,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:12,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:43:12,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:43:12,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:43:14,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 09:43:15,302 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.77 vs. limit=15.0 2023-10-04 09:43:18,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:18,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1610040.0, ans=0.125 2023-10-04 09:43:21,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:43:21,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:43:24,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:43:24,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1610040.0, ans=0.125 2023-10-04 09:43:25,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:43:25,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:43:27,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:43:28,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 09:43:29,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:43:37,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:43:37,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:43:37,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:43:37,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:43:38,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:43:39,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:43:40,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 09:43:44,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:43:44,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:43:48,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:43:48,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:43:54,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:43:54,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 09:43:55,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:43:55,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:43:55,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 09:43:57,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:43:57,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:44:01,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:44:02,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:44:04,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:44:07,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 09:44:08,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 09:44:13,037 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.042e+02 2.322e+02 2.681e+02 4.445e+02, threshold=4.643e+02, percent-clipped=0.0 2023-10-04 09:44:14,565 INFO [train.py:1046] (3/4) Epoch 46, batch 2500, loss[loss=0.158, simple_loss=0.2306, pruned_loss=0.04271, over 23753.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2324, pruned_loss=0.03592, over 4713489.22 frames. ], batch size: 232, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:44:14,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:44:19,863 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.37 vs. limit=12.0 2023-10-04 09:44:25,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:44:26,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:44:27,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:44:27,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 09:44:29,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1610373.3333333333, ans=0.1 2023-10-04 09:44:29,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1610373.3333333333, ans=0.125 2023-10-04 09:44:34,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:44:35,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:44:37,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 09:44:37,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 09:44:38,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 09:44:39,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:44:39,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:44:39,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 09:44:39,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1610373.3333333333, ans=0.1 2023-10-04 09:44:40,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:44:40,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 09:44:40,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:44:46,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:44:48,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:44:50,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:44:50,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 09:44:52,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:44:54,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:44:56,871 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.09 vs. limit=15.0 2023-10-04 09:44:58,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:01,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:04,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:45:07,663 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.18 vs. limit=15.0 2023-10-04 09:45:10,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:45:10,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1610506.6666666667, ans=0.1 2023-10-04 09:45:11,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.46 vs. limit=22.5 2023-10-04 09:45:11,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 09:45:13,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:45:13,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:45:14,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:45:14,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:45:16,020 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 09:45:16,020 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 09:45:16,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 09:45:19,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:45:20,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 09:45:20,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 09:45:22,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:45:23,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 09:45:25,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1610573.3333333333, ans=0.125 2023-10-04 09:45:26,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 09:45:28,212 INFO [train.py:1046] (3/4) Epoch 46, batch 2550, loss[loss=0.1374, simple_loss=0.2192, pruned_loss=0.02786, over 24601.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2331, pruned_loss=0.03595, over 4721549.21 frames. ], batch size: 60, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:45:28,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:45:29,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:45:29,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:45:31,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:45:32,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 09:45:32,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:45:36,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 09:45:37,314 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.83 vs. limit=15.0 2023-10-04 09:45:38,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:45:40,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:43,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:45:43,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 09:45:43,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1610706.6666666667, ans=0.125 2023-10-04 09:45:44,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:45:44,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:45:45,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:45:47,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:45:47,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 09:45:48,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 09:45:48,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:45:48,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 09:45:56,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1610773.3333333333, ans=0.0 2023-10-04 09:45:58,687 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.04 vs. limit=12.0 2023-10-04 09:45:59,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:46:04,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:46:06,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:46:06,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:46:06,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 09:46:12,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:46:15,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 09:46:15,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:46:15,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:46:15,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 09:46:15,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:46:18,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:46:18,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:46:24,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:46:25,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 09:46:25,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:46:27,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:46:29,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 09:46:30,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:46:31,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:46:39,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:46:40,242 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 1.961e+02 2.101e+02 2.394e+02 3.747e+02, threshold=4.203e+02, percent-clipped=0.0 2023-10-04 09:46:40,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:46:41,749 INFO [train.py:1046] (3/4) Epoch 46, batch 2600, loss[loss=0.1595, simple_loss=0.2483, pruned_loss=0.03542, over 24608.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2345, pruned_loss=0.03643, over 4716337.87 frames. ], batch size: 68, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:46:43,236 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 09:46:43,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1610973.3333333333, ans=0.0 2023-10-04 09:46:47,836 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 09:46:47,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:46:47,904 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 09:46:48,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 09:46:48,616 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 09:46:51,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:46:51,381 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 09:46:51,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 09:46:52,790 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 09:46:54,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1610973.3333333333, ans=0.125 2023-10-04 09:46:55,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:46:56,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 09:46:59,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 09:47:01,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 09:47:01,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 09:47:04,143 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 09:47:04,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 09:47:04,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1611040.0, ans=0.125 2023-10-04 09:47:12,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:12,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:47:12,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:47:12,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 09:47:14,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 09:47:17,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1611106.6666666667, ans=0.0 2023-10-04 09:47:19,707 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 09:47:23,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:47:24,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.82 vs. limit=15.0 2023-10-04 09:47:24,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:24,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 09:47:25,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:47:25,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:47:27,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 09:47:30,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:47:30,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:47:32,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:47:35,464 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 09:47:36,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:47:36,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:47:42,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:47:42,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 09:47:42,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 09:47:43,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:47:45,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:47:47,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:47:47,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1611240.0, ans=0.1 2023-10-04 09:47:51,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 09:47:52,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:52,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1611240.0, ans=0.125 2023-10-04 09:47:54,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 09:47:56,007 INFO [train.py:1046] (3/4) Epoch 46, batch 2650, loss[loss=0.1678, simple_loss=0.2446, pruned_loss=0.04553, over 23613.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2349, pruned_loss=0.03635, over 4722947.59 frames. ], batch size: 256, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:47:56,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1611306.6666666667, ans=0.2 2023-10-04 09:47:57,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 09:47:57,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:47:59,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 09:48:00,665 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 09:48:00,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:48:02,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1611306.6666666667, ans=0.0 2023-10-04 09:48:02,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1611306.6666666667, ans=0.2 2023-10-04 09:48:04,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:48:06,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 09:48:07,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:48:09,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:48:10,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 09:48:10,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:48:12,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:48:14,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 09:48:16,782 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 09:48:19,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:48:21,734 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.74 vs. limit=15.0 2023-10-04 09:48:23,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 09:48:23,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:23,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 09:48:24,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1611440.0, ans=0.0 2023-10-04 09:48:28,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:48:28,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:48:28,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:48:29,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:48:34,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 09:48:34,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 09:48:35,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:48:40,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 09:48:42,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:48:42,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:48:43,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:48:43,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:48:43,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:48:44,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:48:46,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:48:48,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:48:48,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:48:49,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:48:50,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:52,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:48:52,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:53,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:48:53,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1611573.3333333333, ans=0.0 2023-10-04 09:48:54,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 09:48:58,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:48:58,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:48:59,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:48:59,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 09:49:01,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:49:03,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:49:03,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:49:03,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:04,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 09:49:04,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:07,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:49:07,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 09:49:09,084 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 2.031e+02 2.301e+02 2.644e+02 3.771e+02, threshold=4.602e+02, percent-clipped=0.0 2023-10-04 09:49:10,574 INFO [train.py:1046] (3/4) Epoch 46, batch 2700, loss[loss=0.1508, simple_loss=0.235, pruned_loss=0.03333, over 24376.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2357, pruned_loss=0.03666, over 4715994.35 frames. ], batch size: 61, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:49:10,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:49:12,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 09:49:14,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:49:14,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:14,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:15,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:49:15,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:49:16,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:49:16,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 09:49:16,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 09:49:18,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:49:20,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:49:22,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:49:22,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:49:25,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:49:27,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 09:49:28,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:49:33,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:49:33,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:49:40,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:49:40,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:49:40,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:49:40,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:49:44,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:49:47,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:49:47,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:49:47,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:49:52,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:49:52,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:50:00,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:50:00,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:50:03,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:50:03,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:07,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:50:08,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:50:09,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:50:10,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1611906.6666666667, ans=0.025 2023-10-04 09:50:11,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:13,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:50:13,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:50:16,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:50:16,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:50:16,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:50:20,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 09:50:20,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:23,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:50:24,574 INFO [train.py:1046] (3/4) Epoch 46, batch 2750, loss[loss=0.1456, simple_loss=0.226, pruned_loss=0.03256, over 23576.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2346, pruned_loss=0.03641, over 4717912.27 frames. ], batch size: 120, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:50:24,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 09:50:26,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 09:50:26,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:28,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1611973.3333333333, ans=0.1 2023-10-04 09:50:31,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:50:31,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:50:32,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:33,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 09:50:33,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:36,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:50:36,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 09:50:36,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:50:36,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:36,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 09:50:36,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:50:38,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:50:43,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 09:50:45,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:50:46,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:48,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:50:48,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 09:50:49,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:50:51,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:50:52,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:50:53,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:50:55,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 09:50:55,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:50:57,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 09:50:57,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:50:59,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:51:04,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:51:07,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 09:51:07,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:11,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:51:11,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:51:12,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:51:18,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:51:18,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:51:18,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 09:51:23,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:25,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 09:51:29,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 09:51:31,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:51:31,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 09:51:32,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:51:33,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:51:33,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 09:51:35,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:51:37,871 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 1.982e+02 2.271e+02 2.856e+02 5.103e+02, threshold=4.543e+02, percent-clipped=1.0 2023-10-04 09:51:37,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 09:51:38,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:51:39,231 INFO [train.py:1046] (3/4) Epoch 46, batch 2800, loss[loss=0.1247, simple_loss=0.1841, pruned_loss=0.03265, over 19205.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2336, pruned_loss=0.03637, over 4720567.87 frames. ], batch size: 389, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:51:39,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:51:39,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 09:51:39,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:51:39,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:43,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:51:44,010 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 09:51:44,011 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 09:51:47,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:51:49,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:51:49,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:51:53,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:51:54,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 09:51:55,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1612373.3333333333, ans=0.0 2023-10-04 09:51:56,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 09:51:58,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 09:51:58,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:51:59,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:51:59,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:52:01,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:52:02,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:52:02,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:52:02,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:52:09,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:52:10,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:52:12,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:52:14,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:52:14,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:52:21,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:52:21,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 09:52:21,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:52:23,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:52:23,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:52:26,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:52:26,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:52:30,000 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.27 vs. limit=22.5 2023-10-04 09:52:30,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:52:31,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1612506.6666666667, ans=0.0 2023-10-04 09:52:32,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:52:32,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:52:32,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 09:52:33,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 09:52:33,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 09:52:35,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:52:36,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 09:52:36,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:52:36,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1612506.6666666667, ans=0.0 2023-10-04 09:52:37,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:52:39,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:52:41,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 09:52:42,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:52:42,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:52:42,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:52:44,505 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.69 vs. limit=22.5 2023-10-04 09:52:45,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 09:52:51,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:52:53,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:52:53,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:52:54,413 INFO [train.py:1046] (3/4) Epoch 46, batch 2850, loss[loss=0.1501, simple_loss=0.23, pruned_loss=0.03508, over 24427.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2325, pruned_loss=0.03579, over 4713885.54 frames. ], batch size: 58, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:52:54,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:52:55,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.41 vs. limit=22.5 2023-10-04 09:52:59,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:52:59,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:53:00,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:53:03,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:53:03,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:53:06,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:53:06,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 09:53:13,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 09:53:13,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:53:15,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 09:53:16,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:17,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 09:53:19,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 09:53:21,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:33,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:53:33,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:53:33,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 09:53:34,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 09:53:34,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 09:53:35,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 09:53:37,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:53:37,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 09:53:38,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 09:53:38,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:53:38,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:53:40,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:43,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:53:43,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:53:44,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:53:47,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:53:48,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:53:48,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:53:50,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:53:53,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:53:59,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:53:59,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 09:54:00,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 09:54:02,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 09:54:02,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:54:04,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 09:54:04,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:54:05,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:54:05,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:54:05,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 09:54:05,624 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 09:54:05,654 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 09:54:05,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:54:06,880 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.728e+02 2.024e+02 2.306e+02 2.714e+02 5.189e+02, threshold=4.613e+02, percent-clipped=2.0 2023-10-04 09:54:07,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:54:08,219 INFO [train.py:1046] (3/4) Epoch 46, batch 2900, loss[loss=0.1437, simple_loss=0.2165, pruned_loss=0.03545, over 23843.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.233, pruned_loss=0.03606, over 4714605.17 frames. ], batch size: 179, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:54:11,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:54:11,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:54:11,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:54:12,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 09:54:17,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:54:17,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 09:54:18,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1612973.3333333333, ans=0.1 2023-10-04 09:54:19,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 09:54:19,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 09:54:19,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:54:22,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:54:22,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:54:23,456 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.87 vs. limit=15.0 2023-10-04 09:54:24,531 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.04 vs. limit=15.0 2023-10-04 09:54:27,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 09:54:27,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:54:31,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 09:54:31,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 09:54:33,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 09:54:33,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:54:36,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 09:54:36,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 09:54:39,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:54:39,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 09:54:39,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:54:41,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:54:41,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 09:54:43,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:54:45,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:54:48,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1613106.6666666667, ans=0.1 2023-10-04 09:54:49,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:54:52,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:54:53,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 09:54:53,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 09:54:53,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:54:58,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:55:00,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 09:55:02,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 09:55:08,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:55:16,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 09:55:16,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 09:55:16,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 09:55:19,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:19,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 09:55:20,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:55:21,299 INFO [train.py:1046] (3/4) Epoch 46, batch 2950, loss[loss=0.1712, simple_loss=0.2595, pruned_loss=0.04151, over 24353.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2333, pruned_loss=0.03617, over 4711228.09 frames. ], batch size: 77, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:55:21,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:55:26,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:55:28,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 09:55:30,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:55:30,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:33,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:55:33,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:55:34,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 09:55:35,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 09:55:36,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 09:55:38,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:55:42,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1613373.3333333333, ans=0.0 2023-10-04 09:55:43,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:55:44,375 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.13 vs. limit=15.0 2023-10-04 09:55:44,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:55:46,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:55:46,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:55:49,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:55:49,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 09:55:51,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:52,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:55:52,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 09:55:55,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 09:55:58,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1613440.0, ans=0.0 2023-10-04 09:55:59,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 09:55:59,510 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 09:56:01,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:56:04,073 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 09:56:04,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 09:56:06,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:56:06,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 09:56:06,300 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 09:56:06,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 09:56:08,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 09:56:09,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:56:09,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 09:56:11,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:56:13,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 09:56:13,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:56:14,544 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 09:56:14,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:56:15,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 09:56:20,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:56:20,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1613573.3333333333, ans=0.0 2023-10-04 09:56:21,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:56:22,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 09:56:22,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:56:24,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 09:56:27,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:56:28,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:56:28,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:56:32,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:56:32,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 09:56:32,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:56:33,947 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.976e+02 2.238e+02 2.495e+02 4.043e+02, threshold=4.475e+02, percent-clipped=0.0 2023-10-04 09:56:34,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:56:34,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 09:56:35,016 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.31 vs. limit=10.0 2023-10-04 09:56:35,349 INFO [train.py:1046] (3/4) Epoch 46, batch 3000, loss[loss=0.1623, simple_loss=0.249, pruned_loss=0.03784, over 24646.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2346, pruned_loss=0.03671, over 4701734.37 frames. ], batch size: 73, lr: 2.21e-03, grad_scale: 32.0 2023-10-04 09:56:35,349 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 09:56:47,860 INFO [train.py:1078] (3/4) Epoch 46, validation: loss=0.3542, simple_loss=0.2819, pruned_loss=0.2132, over 1125622.00 frames. 2023-10-04 09:56:47,861 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 09:56:47,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 09:56:48,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1613640.0, ans=0.07 2023-10-04 09:56:49,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:56:50,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:56:52,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:56:52,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 09:56:52,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:56:54,454 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.16 vs. limit=15.0 2023-10-04 09:56:56,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:56:56,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 09:56:59,161 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 09:57:00,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 09:57:02,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 09:57:03,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:57:03,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 09:57:03,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:57:10,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 09:57:20,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:57:26,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 09:57:27,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 09:57:28,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:57:28,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:57:29,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:57:29,420 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.58 vs. limit=15.0 2023-10-04 09:57:31,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:57:31,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 09:57:33,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 09:57:35,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:57:37,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 09:57:39,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 09:57:40,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:57:40,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:57:40,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:57:42,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1613840.0, ans=0.125 2023-10-04 09:57:43,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1613840.0, ans=0.1 2023-10-04 09:57:44,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 09:57:44,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 09:57:44,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 09:57:47,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:57:48,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 09:57:49,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 09:57:50,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:57:50,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 09:57:51,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1613906.6666666667, ans=0.2 2023-10-04 09:57:54,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:57:54,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:57:56,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 09:57:56,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 09:57:56,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:57:56,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 09:57:57,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 09:57:57,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 09:58:01,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:58:01,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1613973.3333333333, ans=0.125 2023-10-04 09:58:02,149 INFO [train.py:1046] (3/4) Epoch 46, batch 3050, loss[loss=0.2043, simple_loss=0.2714, pruned_loss=0.0686, over 19196.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2354, pruned_loss=0.03718, over 4691286.61 frames. ], batch size: 388, lr: 2.21e-03, grad_scale: 16.0 2023-10-04 09:58:02,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 09:58:02,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 09:58:04,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 09:58:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 09:58:04,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:58:04,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1613973.3333333333, ans=0.5 2023-10-04 09:58:04,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1613973.3333333333, ans=0.125 2023-10-04 09:58:05,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 09:58:05,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 09:58:05,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1613973.3333333333, ans=0.125 2023-10-04 09:58:07,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:07,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:58:10,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 09:58:10,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:58:13,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:58:13,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 09:58:17,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:20,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 09:58:24,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 09:58:26,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 09:58:26,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:58:29,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 09:58:32,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:32,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:58:32,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:58:35,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:58:35,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 09:58:36,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:58:37,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:58:37,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:58:40,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:41,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:58:41,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1614106.6666666667, ans=0.0 2023-10-04 09:58:43,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:58:44,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 09:58:44,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:58:45,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 09:58:46,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1614173.3333333333, ans=0.125 2023-10-04 09:58:47,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 09:58:47,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 09:58:48,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:58:48,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:58:53,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 09:58:53,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:58:56,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1614173.3333333333, ans=0.0 2023-10-04 09:58:57,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1614173.3333333333, ans=0.0 2023-10-04 09:58:59,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:01,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:59:01,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:59:02,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:59:02,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 09:59:03,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 09:59:04,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 09:59:05,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 09:59:05,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:06,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 09:59:11,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:59:12,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1614240.0, ans=0.125 2023-10-04 09:59:15,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 09:59:16,570 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.988e+02 2.298e+02 2.757e+02 4.239e+02, threshold=4.595e+02, percent-clipped=0.0 2023-10-04 09:59:16,597 INFO [train.py:1046] (3/4) Epoch 46, batch 3100, loss[loss=0.1602, simple_loss=0.2372, pruned_loss=0.04164, over 23311.00 frames. ], tot_loss[loss=0.1549, simple_loss=0.2351, pruned_loss=0.03734, over 4691749.27 frames. ], batch size: 105, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 09:59:16,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 09:59:19,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 09:59:20,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 09:59:21,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1614306.6666666667, ans=0.0 2023-10-04 09:59:24,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 09:59:25,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 09:59:25,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1614306.6666666667, ans=0.125 2023-10-04 09:59:26,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1614306.6666666667, ans=0.0 2023-10-04 09:59:28,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 09:59:31,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 09:59:31,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:33,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 09:59:37,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:41,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 09:59:46,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 09:59:47,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 09:59:47,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 09:59:47,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 09:59:49,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 09:59:51,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 09:59:51,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 09:59:51,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 09:59:53,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 09:59:55,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 09:59:55,947 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.32 vs. limit=15.0 2023-10-04 09:59:56,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 09:59:59,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 09:59:59,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 10:00:01,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 10:00:02,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:02,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:00:05,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:05,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:06,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:00:07,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:00:07,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:00:07,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1614506.6666666667, ans=0.1 2023-10-04 10:00:08,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:00:08,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:00:08,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:08,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:00:11,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:00:11,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 10:00:12,648 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.11 vs. limit=22.5 2023-10-04 10:00:14,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:00:15,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 10:00:16,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:16,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:16,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 10:00:27,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 10:00:29,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:31,000 INFO [train.py:1046] (3/4) Epoch 46, batch 3150, loss[loss=0.1344, simple_loss=0.2042, pruned_loss=0.03224, over 23457.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2332, pruned_loss=0.03687, over 4680339.07 frames. ], batch size: 285, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:00:31,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:00:34,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:00:34,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:00:34,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 10:00:35,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:35,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:00:35,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1614640.0, ans=0.2 2023-10-04 10:00:38,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 10:00:40,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:41,598 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 10:00:41,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1614640.0, ans=0.125 2023-10-04 10:00:46,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 10:00:46,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:00:47,567 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 10:00:47,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 10:00:50,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 10:00:51,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 10:00:51,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 10:00:51,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:51,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:00:53,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:00:54,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 10:00:56,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:56,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:00:57,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:01:00,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:01:02,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 10:01:03,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:01:07,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:01:08,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:01:08,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 10:01:11,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 10:01:11,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:01:12,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:01:12,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:01:13,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:01:13,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:01:15,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:01:15,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:01:15,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 10:01:16,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:01:16,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:17,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1614840.0, ans=0.125 2023-10-04 10:01:18,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:01:18,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:01:18,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1614840.0, ans=0.125 2023-10-04 10:01:20,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 10:01:20,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:01:20,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1614840.0, ans=0.0 2023-10-04 10:01:21,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 10:01:21,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:22,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 10:01:24,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 10:01:25,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:01:25,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:01:27,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 10:01:28,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 10:01:28,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:01:32,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:01:33,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:33,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:01:39,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:01:39,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:41,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 10:01:45,122 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.157e+02 2.549e+02 3.186e+02 5.490e+02, threshold=5.099e+02, percent-clipped=3.0 2023-10-04 10:01:45,149 INFO [train.py:1046] (3/4) Epoch 46, batch 3200, loss[loss=0.1491, simple_loss=0.2301, pruned_loss=0.03409, over 23521.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2326, pruned_loss=0.03662, over 4688662.74 frames. ], batch size: 134, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:01:46,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:01:46,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 10:01:47,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1614973.3333333333, ans=0.0 2023-10-04 10:01:51,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:01:52,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:01:52,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 10:01:53,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:01:58,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:02:02,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:02:03,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1615040.0, ans=0.2 2023-10-04 10:02:10,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:02:11,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.64 vs. limit=15.0 2023-10-04 10:02:17,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1615106.6666666667, ans=0.0 2023-10-04 10:02:17,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1615106.6666666667, ans=0.125 2023-10-04 10:02:20,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 10:02:22,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:02:24,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 10:02:25,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:02:28,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:02:28,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:02:29,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:02:32,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1615173.3333333333, ans=0.125 2023-10-04 10:02:32,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1615173.3333333333, ans=0.125 2023-10-04 10:02:34,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 10:02:34,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1615173.3333333333, ans=0.125 2023-10-04 10:02:37,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 10:02:38,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 10:02:40,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 10:02:43,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:02:47,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:02:47,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:02:49,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:02:50,522 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 10:02:50,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:02:53,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:02:55,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 10:02:55,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 10:02:56,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 10:02:58,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 10:02:59,765 INFO [train.py:1046] (3/4) Epoch 46, batch 3250, loss[loss=0.1464, simple_loss=0.2272, pruned_loss=0.03279, over 23595.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.233, pruned_loss=0.03663, over 4698933.44 frames. ], batch size: 135, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:03:01,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:03:04,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 10:03:04,399 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 10:03:04,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:03:04,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:05,855 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 10:03:07,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1615306.6666666667, ans=0.07 2023-10-04 10:03:09,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:03:13,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:03:17,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1615373.3333333333, ans=0.125 2023-10-04 10:03:19,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:03:19,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 10:03:20,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:03:21,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:03:21,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:03:22,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:03:22,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:03:25,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:25,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:03:27,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:03:27,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:27,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:27,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:03:27,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1615440.0, ans=0.1 2023-10-04 10:03:28,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:03:30,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:03:32,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:03:32,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:03:32,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1615440.0, ans=0.125 2023-10-04 10:03:34,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:03:35,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:03:35,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:03:41,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 10:03:41,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:03:41,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:03:42,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:03:43,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=1615506.6666666667, ans=0.2 2023-10-04 10:03:44,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:03:49,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:03:54,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1615506.6666666667, ans=0.125 2023-10-04 10:03:56,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:03:56,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:03:56,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 10:03:58,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:03:58,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 10:03:58,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:01,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 10:04:01,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 10:04:03,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:04:04,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:04:05,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:04:07,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 10:04:07,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:04:09,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:04:09,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:04:10,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 10:04:10,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:04:13,244 INFO [train.py:1046] (3/4) Epoch 46, batch 3300, loss[loss=0.1418, simple_loss=0.2222, pruned_loss=0.03066, over 23541.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2338, pruned_loss=0.03657, over 4701117.07 frames. ], batch size: 134, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:04:13,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:04:13,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 10:04:15,084 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.072e+02 2.407e+02 3.224e+02 5.952e+02, threshold=4.814e+02, percent-clipped=1.0 2023-10-04 10:04:17,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:04:17,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 10:04:19,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 10:04:19,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 10:04:19,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:04:23,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:04:23,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:04:24,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:26,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 10:04:26,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:04:28,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:04:29,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:04:34,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 10:04:34,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:04:34,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:04:36,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:37,014 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 10:04:38,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:04:39,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:04:40,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:04:40,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:04:40,094 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 10:04:42,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:04:42,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:04:46,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:46,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 10:04:46,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 10:04:46,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:04:48,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:04:50,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=1615773.3333333333, ans=0.025 2023-10-04 10:04:50,865 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.77 vs. limit=15.0 2023-10-04 10:04:51,513 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 10:04:52,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 10:04:52,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:04:55,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 10:04:59,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:05:02,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:05:02,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:05:04,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:05,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:05:05,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:05:06,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:05:06,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:05:06,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:05:07,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1615840.0, ans=0.125 2023-10-04 10:05:07,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1615840.0, ans=0.125 2023-10-04 10:05:08,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:05:11,088 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 10:05:12,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 10:05:15,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 10:05:16,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:05:16,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:05:17,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:05:17,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:05:19,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:05:21,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:21,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 10:05:22,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:05:23,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:05:25,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 10:05:25,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:26,617 INFO [train.py:1046] (3/4) Epoch 46, batch 3350, loss[loss=0.153, simple_loss=0.2394, pruned_loss=0.03327, over 23315.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2351, pruned_loss=0.03697, over 4700451.20 frames. ], batch size: 93, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:05:26,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:28,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:05:29,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:05:29,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:30,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:05:30,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:34,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:05:37,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:37,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:05:40,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:41,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.98 vs. limit=15.0 2023-10-04 10:05:41,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:05:43,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:44,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:05:46,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 10:05:46,146 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 10:05:46,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:05:49,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 10:05:50,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 10:05:51,726 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.71 vs. limit=8.0 2023-10-04 10:05:52,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:05:52,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:05:53,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:05:54,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 10:05:54,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:54,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:05:57,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:05:58,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1616106.6666666667, ans=0.125 2023-10-04 10:05:59,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:05:59,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:06:00,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:06:03,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:03,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1616106.6666666667, ans=0.1 2023-10-04 10:06:06,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:06:06,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:09,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:06:09,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1616173.3333333333, ans=0.0 2023-10-04 10:06:11,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:06:13,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:06:13,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:15,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:18,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 10:06:18,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 10:06:18,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 10:06:18,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:06:20,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 10:06:21,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:22,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:06:29,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:30,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 10:06:31,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:06:33,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:06:33,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:06:37,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:06:39,988 INFO [train.py:1046] (3/4) Epoch 46, batch 3400, loss[loss=0.1279, simple_loss=0.2064, pruned_loss=0.02471, over 20351.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2354, pruned_loss=0.03687, over 4714533.80 frames. ], batch size: 44, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:06:40,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 10:06:40,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:06:40,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:06:41,416 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.680e+02 2.075e+02 2.237e+02 2.507e+02 4.047e+02, threshold=4.473e+02, percent-clipped=0.0 2023-10-04 10:06:42,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:06:42,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 10:06:43,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:06:43,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 10:06:46,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:06:46,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:06:46,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:06:47,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:06:47,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 10:06:52,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 10:06:52,391 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 10:06:52,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:06:55,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:06:55,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:06:56,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:06:57,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:07:02,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1616373.3333333333, ans=0.07 2023-10-04 10:07:03,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:07:05,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 10:07:10,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:07:11,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:07:11,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:07:13,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 10:07:18,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:07:23,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 10:07:28,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:07:28,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:07:30,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 10:07:30,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:07:30,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:07:30,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:07:31,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:07:36,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:07:38,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:07:39,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:07:45,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:07:46,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 10:07:48,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1616573.3333333333, ans=0.0 2023-10-04 10:07:51,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:07:54,247 INFO [train.py:1046] (3/4) Epoch 46, batch 3450, loss[loss=0.1381, simple_loss=0.2282, pruned_loss=0.02401, over 24436.00 frames. ], tot_loss[loss=0.1556, simple_loss=0.2361, pruned_loss=0.03753, over 4692752.01 frames. ], batch size: 63, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:07:54,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 10:07:57,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 10:07:57,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:07:58,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1616640.0, ans=0.0 2023-10-04 10:08:00,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:08:00,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 10:08:01,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:08:04,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:08:09,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:08:11,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:08:12,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:08:12,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:08:15,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:08:19,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 10:08:20,616 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.98 vs. limit=10.0 2023-10-04 10:08:24,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1616773.3333333333, ans=0.0 2023-10-04 10:08:25,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 10:08:27,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:08:27,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:08:27,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:08:31,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 10:08:32,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:08:34,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1616773.3333333333, ans=0.0 2023-10-04 10:08:36,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1616773.3333333333, ans=0.125 2023-10-04 10:08:37,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:08:39,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:08:39,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:08:40,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:08:42,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 10:08:42,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1616840.0, ans=0.125 2023-10-04 10:08:43,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:08:44,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:08:45,618 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.52 vs. limit=12.0 2023-10-04 10:08:46,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:08:47,869 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:08:50,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 10:08:53,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:08:59,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:08:59,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1616906.6666666667, ans=0.125 2023-10-04 10:09:00,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:01,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:09:06,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:06,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:09:06,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:09:06,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:09:07,947 INFO [train.py:1046] (3/4) Epoch 46, batch 3500, loss[loss=0.1472, simple_loss=0.2282, pruned_loss=0.03311, over 24595.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2344, pruned_loss=0.03718, over 4703224.26 frames. ], batch size: 60, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:09:10,742 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.62 vs. limit=15.0 2023-10-04 10:09:11,221 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.024e+02 2.165e+02 2.480e+02 4.406e+02, threshold=4.331e+02, percent-clipped=0.0 2023-10-04 10:09:11,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:09:14,130 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:09:15,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 10:09:16,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:09:20,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:09:23,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:09:23,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 10:09:27,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:09:28,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:09:29,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:09:29,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:09:31,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 10:09:31,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:32,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:09:32,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 10:09:33,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1617040.0, ans=0.125 2023-10-04 10:09:35,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:35,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:09:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:09:40,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:42,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 10:09:42,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:09:43,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1617106.6666666667, ans=0.125 2023-10-04 10:09:45,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:09:46,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:09:46,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1617106.6666666667, ans=0.2 2023-10-04 10:09:47,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:49,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:09:49,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:09:52,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 10:09:52,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1617173.3333333333, ans=0.125 2023-10-04 10:09:52,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1617173.3333333333, ans=0.035 2023-10-04 10:09:53,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 10:09:53,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 10:09:53,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:09:55,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:09:56,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:09:56,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:09:59,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 10:09:59,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:10:05,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:10:07,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 10:10:07,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 10:10:07,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:10:08,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:10:08,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:10:10,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:10:13,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 10:10:13,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:10:15,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:10:16,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 10:10:19,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 10:10:20,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:10:21,785 INFO [train.py:1046] (3/4) Epoch 46, batch 3550, loss[loss=0.142, simple_loss=0.2212, pruned_loss=0.03144, over 23738.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2328, pruned_loss=0.03675, over 4693200.58 frames. ], batch size: 135, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:10:21,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:10:21,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:10:23,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:25,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:10:35,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:37,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 10:10:39,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:10:39,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:10:41,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:10:41,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:10:43,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:10:46,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:10:47,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:10:47,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1617373.3333333333, ans=0.125 2023-10-04 10:10:48,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:48,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 10:10:49,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:10:52,402 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=3.71 vs. limit=10.0 2023-10-04 10:10:54,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:10:54,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:10:57,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:10:57,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:10:57,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:10:57,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 10:10:57,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:10:59,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:11:00,136 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.47 vs. limit=6.0 2023-10-04 10:11:00,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 10:11:06,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:06,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:11:08,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:10,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 10:11:10,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:11:10,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 10:11:12,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:11:14,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1617506.6666666667, ans=0.125 2023-10-04 10:11:15,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:11:15,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:11:18,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 10:11:18,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:11:21,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1617573.3333333333, ans=0.125 2023-10-04 10:11:22,425 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.37 vs. limit=10.0 2023-10-04 10:11:24,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:11:24,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 10:11:25,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:11:30,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:11:30,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 10:11:36,581 INFO [train.py:1046] (3/4) Epoch 46, batch 3600, loss[loss=0.1562, simple_loss=0.2143, pruned_loss=0.04908, over 18976.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2325, pruned_loss=0.03661, over 4695537.05 frames. ], batch size: 388, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:11:36,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 10:11:36,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:11:38,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:11:39,475 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 1.969e+02 2.172e+02 2.569e+02 3.736e+02, threshold=4.344e+02, percent-clipped=0.0 2023-10-04 10:11:40,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:11:40,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:11:44,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:11:46,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:11:49,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:49,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:11:50,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:11:50,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:50,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 10:11:53,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:11:54,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:11:57,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:11:59,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1617706.6666666667, ans=0.95 2023-10-04 10:12:00,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:12:01,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:12:03,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:12:03,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 10:12:05,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:12:06,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:12:07,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:12:09,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:10,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:12:12,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:12:14,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 10:12:20,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:12:20,746 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.39 vs. limit=15.0 2023-10-04 10:12:21,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:12:21,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 10:12:25,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:12:31,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:34,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:36,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1617906.6666666667, ans=0.0 2023-10-04 10:12:40,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:12:40,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:12:40,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 10:12:42,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 10:12:42,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 10:12:45,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:12:45,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:12:48,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 10:12:48,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:12:49,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:12:49,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:12:49,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 10:12:50,896 INFO [train.py:1046] (3/4) Epoch 46, batch 3650, loss[loss=0.1487, simple_loss=0.2295, pruned_loss=0.03397, over 23573.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2334, pruned_loss=0.03703, over 4693062.34 frames. ], batch size: 135, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:12:50,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 10:12:54,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:12:55,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 10:12:57,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1617973.3333333333, ans=0.125 2023-10-04 10:12:59,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 10:13:01,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:13:01,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1617973.3333333333, ans=0.125 2023-10-04 10:13:04,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 10:13:05,268 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.55 vs. limit=15.0 2023-10-04 10:13:07,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 10:13:11,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:13:11,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:13:11,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:13:15,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 10:13:15,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:13:15,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1618040.0, ans=0.1 2023-10-04 10:13:16,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 10:13:17,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:13:17,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:13:17,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 10:13:19,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 10:13:20,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:13:20,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:13:23,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:13:24,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 10:13:26,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 10:13:28,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:13:29,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 10:13:30,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:13:32,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:13:33,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1618173.3333333333, ans=0.125 2023-10-04 10:13:35,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1618173.3333333333, ans=0.125 2023-10-04 10:13:36,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:13:38,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:13:38,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:13:38,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1618173.3333333333, ans=0.2 2023-10-04 10:13:38,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1618173.3333333333, ans=0.0 2023-10-04 10:13:40,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:13:41,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:13:42,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:13:46,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:13:47,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:13:47,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:13:47,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1618173.3333333333, ans=0.0 2023-10-04 10:13:49,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1618240.0, ans=10.0 2023-10-04 10:13:50,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:13:50,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:13:51,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:13:56,936 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 10:14:01,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:14:01,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:03,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:14:03,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:14:04,399 INFO [train.py:1046] (3/4) Epoch 46, batch 3700, loss[loss=0.1608, simple_loss=0.2496, pruned_loss=0.03598, over 24639.00 frames. ], tot_loss[loss=0.154, simple_loss=0.234, pruned_loss=0.03694, over 4708595.24 frames. ], batch size: 73, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:14:04,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:14:06,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:14:06,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 10:14:06,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:14:07,881 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.030e+02 2.313e+02 2.831e+02 4.310e+02, threshold=4.627e+02, percent-clipped=0.0 2023-10-04 10:14:08,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:14:10,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:14:11,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:14:13,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1618306.6666666667, ans=0.2 2023-10-04 10:14:14,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:14:14,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 10:14:14,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:14:15,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:14:17,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:14:20,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:14:23,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:14:23,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:14:24,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:14:24,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:14:25,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 10:14:28,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:14:30,324 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 10:14:34,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:14:34,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:14:37,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:14:37,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 10:14:37,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:14:38,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1618440.0, ans=0.125 2023-10-04 10:14:41,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:42,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 10:14:43,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:44,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1618440.0, ans=0.1 2023-10-04 10:14:45,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:14:47,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:14:47,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:14:50,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:14:54,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:14:54,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 10:14:54,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:14:55,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 10:15:00,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:15:01,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:15:04,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:15:04,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 10:15:07,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:15:08,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:15:08,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:15:08,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:15:12,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:15:12,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 10:15:14,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 10:15:14,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:15:14,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:15:15,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:15:15,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1618573.3333333333, ans=0.125 2023-10-04 10:15:17,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:15:19,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:15:20,325 INFO [train.py:1046] (3/4) Epoch 46, batch 3750, loss[loss=0.1607, simple_loss=0.2341, pruned_loss=0.04365, over 23755.00 frames. ], tot_loss[loss=0.1552, simple_loss=0.2358, pruned_loss=0.03728, over 4705944.67 frames. ], batch size: 212, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:15:20,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:15:20,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:15:23,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 10:15:23,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 10:15:27,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:15:27,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 10:15:29,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:15:29,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:15:31,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:15:32,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:15:35,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:15:40,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:15:40,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1618706.6666666667, ans=0.125 2023-10-04 10:15:41,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:15:43,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:15:47,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:15:47,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 10:15:48,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:15:48,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:15:48,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:15:53,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 10:15:56,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 10:15:59,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:15:59,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:15:59,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1618773.3333333333, ans=0.0 2023-10-04 10:16:00,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:16:05,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:05,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 10:16:09,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 10:16:13,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:17,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1618840.0, ans=0.07 2023-10-04 10:16:18,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:16:18,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:16:22,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:16:24,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 10:16:25,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1618906.6666666667, ans=0.0 2023-10-04 10:16:26,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:16:26,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1618906.6666666667, ans=0.125 2023-10-04 10:16:28,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:16:29,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:16:31,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:16:35,122 INFO [train.py:1046] (3/4) Epoch 46, batch 3800, loss[loss=0.1448, simple_loss=0.2332, pruned_loss=0.02823, over 24648.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2356, pruned_loss=0.03702, over 4716105.41 frames. ], batch size: 65, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:16:39,644 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.060e+02 2.402e+02 2.784e+02 4.041e+02, threshold=4.803e+02, percent-clipped=0.0 2023-10-04 10:16:39,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:16:43,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:16:44,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 10:16:44,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 10:16:44,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1618973.3333333333, ans=0.125 2023-10-04 10:16:45,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:47,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:16:48,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 10:16:50,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 10:16:50,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:16:51,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:16:52,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:16:53,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:16:53,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:16:54,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 10:16:57,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 10:16:58,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:17:01,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:17:04,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:17:06,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:17:06,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 10:17:07,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:17:09,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:17:11,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:17:15,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 10:17:15,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 10:17:16,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.45 vs. limit=15.0 2023-10-04 10:17:17,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:17:24,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:17:29,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:17:33,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 10:17:34,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 10:17:36,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:17:37,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:17:38,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:17:38,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 10:17:43,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 10:17:43,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 10:17:43,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:17:45,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:17:47,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1619240.0, ans=0.0 2023-10-04 10:17:49,793 INFO [train.py:1046] (3/4) Epoch 46, batch 3850, loss[loss=0.1574, simple_loss=0.2417, pruned_loss=0.03655, over 23373.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2346, pruned_loss=0.03697, over 4704467.78 frames. ], batch size: 119, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:17:51,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:17:51,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:17:57,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:17:57,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.14 vs. limit=22.5 2023-10-04 10:17:58,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 10:17:58,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:17:59,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:18:01,584 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:18:02,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:18:05,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:06,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 10:18:08,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 10:18:13,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:16,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:18:18,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:18:18,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:18:23,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:23,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:18:24,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:24,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:18:25,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:18:27,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:18:28,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:28,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:18:28,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 10:18:28,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 10:18:30,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:18:30,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:31,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:32,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:32,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 10:18:35,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 10:18:37,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:38,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 10:18:41,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 10:18:46,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:48,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:18:48,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1619573.3333333333, ans=0.125 2023-10-04 10:18:51,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:18:52,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 10:18:55,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 10:18:57,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:58,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:18:59,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:18:59,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:18:59,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:01,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:01,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:19:01,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 10:19:02,185 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.60 vs. limit=15.0 2023-10-04 10:19:02,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:19:03,859 INFO [train.py:1046] (3/4) Epoch 46, batch 3900, loss[loss=0.1313, simple_loss=0.2127, pruned_loss=0.02493, over 24573.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2333, pruned_loss=0.0366, over 4704029.80 frames. ], batch size: 60, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:19:03,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 10:19:03,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:03,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:19:05,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:19:05,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:06,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1619640.0, ans=0.125 2023-10-04 10:19:07,973 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.999e+02 2.284e+02 2.578e+02 4.359e+02, threshold=4.569e+02, percent-clipped=0.0 2023-10-04 10:19:08,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:19:09,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:19:09,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:19:09,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:19:09,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 10:19:11,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:14,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:19:16,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:19:17,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:19:17,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:19:19,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1619706.6666666667, ans=0.125 2023-10-04 10:19:20,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:19:20,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:21,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:19:23,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 10:19:23,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:19:25,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 10:19:25,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:19:26,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 10:19:29,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 10:19:30,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1619706.6666666667, ans=0.125 2023-10-04 10:19:32,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:19:33,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:19:33,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:19:34,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:19:38,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:19:41,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:19:43,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:19:43,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:19:43,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:19:48,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:19:49,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:19:55,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:19:56,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:20:03,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:20:06,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:20:06,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 10:20:06,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 10:20:06,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:20:08,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 10:20:10,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:20:10,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 10:20:17,370 INFO [train.py:1046] (3/4) Epoch 46, batch 3950, loss[loss=0.152, simple_loss=0.2423, pruned_loss=0.03089, over 24616.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2336, pruned_loss=0.03655, over 4709877.07 frames. ], batch size: 68, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:20:18,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:20:18,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 10:20:20,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:20:23,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:20:24,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:20:30,364 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 10:20:30,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:20:32,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 10:20:32,129 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 10:20:32,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:20:35,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:20:35,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:20:35,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:20:35,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1620040.0, ans=10.0 2023-10-04 10:20:36,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 10:20:39,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:20:39,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:20:39,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:20:41,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:20:41,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:20:47,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1620106.6666666667, ans=0.125 2023-10-04 10:20:48,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1620106.6666666667, ans=0.125 2023-10-04 10:20:52,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:20:52,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:20:59,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 10:21:04,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 10:21:04,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 10:21:05,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:21:07,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:21:13,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:21:13,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:21:13,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:21:15,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:21:15,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 10:21:19,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:21:21,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:21:26,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 10:21:30,844 INFO [train.py:1046] (3/4) Epoch 46, batch 4000, loss[loss=0.1498, simple_loss=0.2397, pruned_loss=0.02993, over 24583.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.234, pruned_loss=0.03627, over 4709624.39 frames. ], batch size: 71, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:21:35,610 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.007e+02 2.186e+02 2.659e+02 3.847e+02, threshold=4.373e+02, percent-clipped=0.0 2023-10-04 10:21:35,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:21:40,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:21:47,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:21:47,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:21:47,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:21:48,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 10:21:49,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:21:49,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 10:21:49,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:21:49,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 10:21:52,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:21:56,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:21:56,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:21:56,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:21:57,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:21:57,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:21:59,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:22:00,628 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 10:22:00,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:22:02,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:22:05,369 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 10:22:05,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:22:05,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:22:08,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1620440.0, ans=0.2 2023-10-04 10:22:12,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 10:22:14,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:22:16,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:22:16,961 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 10:22:17,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:22:18,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 10:22:18,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:22:20,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:22:22,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:22:23,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:22:23,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:22:23,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:22:24,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 10:22:26,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:22:27,458 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 10:22:29,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1620573.3333333333, ans=0.2 2023-10-04 10:22:31,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:22:36,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 10:22:37,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:22:39,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:22:39,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:22:40,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:22:40,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1620573.3333333333, ans=0.125 2023-10-04 10:22:45,497 INFO [train.py:1046] (3/4) Epoch 46, batch 4050, loss[loss=0.1619, simple_loss=0.2373, pruned_loss=0.04323, over 23637.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2344, pruned_loss=0.03663, over 4695722.29 frames. ], batch size: 256, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:22:45,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:22:46,056 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.59 vs. limit=15.0 2023-10-04 10:22:47,382 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.78 vs. limit=10.0 2023-10-04 10:22:48,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:22:48,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 10:22:49,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:22:51,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:22:53,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:22:54,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:22:54,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:22:58,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:23:01,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:23:01,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 10:23:03,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:23:04,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:23:07,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:23:09,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1620706.6666666667, ans=0.1 2023-10-04 10:23:10,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:23:12,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1620706.6666666667, ans=0.2 2023-10-04 10:23:13,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 10:23:13,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 10:23:14,864 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 10:23:16,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:23:20,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=1620773.3333333333, ans=15.0 2023-10-04 10:23:21,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1620773.3333333333, ans=0.2 2023-10-04 10:23:22,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 10:23:24,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:23:27,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:23:31,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:23:32,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:23:32,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:23:32,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1620840.0, ans=0.0 2023-10-04 10:23:34,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:23:37,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 10:23:37,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 10:23:39,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:23:39,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 10:23:42,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1620840.0, ans=0.0 2023-10-04 10:23:43,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:23:51,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 10:23:52,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:23:52,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:23:55,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 10:23:55,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 10:23:55,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:23:57,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:23:58,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:23:58,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:23:59,811 INFO [train.py:1046] (3/4) Epoch 46, batch 4100, loss[loss=0.1597, simple_loss=0.2462, pruned_loss=0.03661, over 24639.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2354, pruned_loss=0.03708, over 4700682.59 frames. ], batch size: 68, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:24:04,013 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.073e+02 2.274e+02 2.591e+02 3.994e+02, threshold=4.547e+02, percent-clipped=0.0 2023-10-04 10:24:04,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1620973.3333333333, ans=0.5 2023-10-04 10:24:07,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 10:24:10,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 10:24:11,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 10:24:13,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 10:24:13,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:24:13,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:14,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:14,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:24:15,974 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 10:24:18,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:24:18,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:24:18,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:24:19,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:24:23,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:24:23,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:24:24,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:24:24,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 10:24:25,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:26,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:24:26,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:24:26,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:24:26,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 10:24:31,038 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:24:31,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1621106.6666666667, ans=0.125 2023-10-04 10:24:32,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 10:24:33,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:24:36,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:24:36,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 10:24:37,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:24:38,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:24:39,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:24:41,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 10:24:42,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:24:42,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:24:44,570 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:24:45,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 10:24:45,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:24:45,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:24:47,492 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.19 vs. limit=15.0 2023-10-04 10:24:48,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:24:53,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1621173.3333333333, ans=0.2 2023-10-04 10:24:54,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:24:59,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:24:59,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:25:06,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:25:06,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:25:09,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:25:09,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:25:13,862 INFO [train.py:1046] (3/4) Epoch 46, batch 4150, loss[loss=0.1655, simple_loss=0.251, pruned_loss=0.03998, over 23729.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.03681, over 4701709.97 frames. ], batch size: 85, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:25:15,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:25:16,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:25:17,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:25:17,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:25:21,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 10:25:21,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:25:21,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 10:25:22,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 10:25:22,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 10:25:22,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1621306.6666666667, ans=0.0 2023-10-04 10:25:24,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:25:28,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:25:28,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:25:33,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:25:34,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:25:34,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:25:36,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:25:37,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:25:37,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:25:41,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1621373.3333333333, ans=0.1 2023-10-04 10:25:42,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:25:44,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:25:46,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 10:25:46,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1621440.0, ans=0.1 2023-10-04 10:25:47,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 10:25:47,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:25:49,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 10:25:49,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:25:49,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:25:51,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1621440.0, ans=0.2 2023-10-04 10:25:52,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:25:54,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:25:57,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 10:26:00,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:26:01,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:26:03,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 10:26:03,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:26:04,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 10:26:07,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:26:08,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:26:09,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:26:13,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 10:26:13,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:13,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 10:26:14,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:26:16,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 10:26:16,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:26:16,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:26:17,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:26:17,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 10:26:17,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:26:17,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 10:26:18,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:26:20,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:26:20,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 10:26:20,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 10:26:25,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:26:26,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 10:26:28,282 INFO [train.py:1046] (3/4) Epoch 46, batch 4200, loss[loss=0.1241, simple_loss=0.2071, pruned_loss=0.02058, over 24346.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2338, pruned_loss=0.03647, over 4705827.82 frames. ], batch size: 56, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:26:30,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:26:30,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1621640.0, ans=0.125 2023-10-04 10:26:32,765 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 2.054e+02 2.218e+02 2.491e+02 3.350e+02, threshold=4.435e+02, percent-clipped=0.0 2023-10-04 10:26:32,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:26:34,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:26:35,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:26:35,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:26:37,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 10:26:40,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 10:26:40,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:43,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:26:44,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:26:47,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:26:48,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:26:48,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:48,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1621706.6666666667, ans=0.025 2023-10-04 10:26:49,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 10:26:49,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:26:51,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:26:51,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:26:53,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:26:54,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:26:56,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 10:26:56,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:27:01,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:27:03,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:27:05,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:27:06,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:27:08,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:27:08,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 10:27:10,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:27:11,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:27:14,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:27:15,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:27:23,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:27:25,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 10:27:28,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:27:31,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1621906.6666666667, ans=0.0 2023-10-04 10:27:32,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 10:27:34,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:27:35,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 10:27:37,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1621906.6666666667, ans=0.125 2023-10-04 10:27:40,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:27:42,959 INFO [train.py:1046] (3/4) Epoch 46, batch 4250, loss[loss=0.1501, simple_loss=0.2418, pruned_loss=0.02917, over 24660.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2329, pruned_loss=0.03594, over 4715086.08 frames. ], batch size: 68, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:27:43,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1621973.3333333333, ans=0.0 2023-10-04 10:27:44,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:27:44,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:27:45,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:27:50,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:27:51,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 10:27:51,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:27:54,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:27:58,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:28:03,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:03,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:04,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:28:04,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:28:06,603 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.19 vs. limit=22.5 2023-10-04 10:28:07,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:07,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:08,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:09,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:28:10,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:28:12,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 10:28:16,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 10:28:16,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:16,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1622106.6666666667, ans=0.0 2023-10-04 10:28:18,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:28:18,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:28:18,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:28:18,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:28:19,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:28:21,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1622106.6666666667, ans=0.2 2023-10-04 10:28:22,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 10:28:22,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:28:27,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:28:30,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:28:31,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 10:28:31,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:28:31,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 10:28:33,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:28:34,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:28:34,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:28:35,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:28:35,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1622173.3333333333, ans=0.2 2023-10-04 10:28:39,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 10:28:40,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:28:41,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:28:46,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:28:47,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:28:49,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:28:50,091 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.05 vs. limit=10.0 2023-10-04 10:28:50,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:28:52,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:28:52,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:28:53,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:28:53,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 10:28:55,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:28:57,099 INFO [train.py:1046] (3/4) Epoch 46, batch 4300, loss[loss=0.1485, simple_loss=0.2267, pruned_loss=0.03515, over 23339.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2323, pruned_loss=0.03559, over 4714711.18 frames. ], batch size: 119, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:28:58,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1622306.6666666667, ans=0.125 2023-10-04 10:29:01,508 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.043e+02 2.222e+02 2.546e+02 3.786e+02, threshold=4.445e+02, percent-clipped=0.0 2023-10-04 10:29:01,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:29:01,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:29:05,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:29:11,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:29:11,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 10:29:14,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:29:16,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:29:16,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:29:16,076 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 10:29:18,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1622373.3333333333, ans=0.125 2023-10-04 10:29:20,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:29:20,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1622373.3333333333, ans=0.125 2023-10-04 10:29:20,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1622373.3333333333, ans=0.0 2023-10-04 10:29:21,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:29:24,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 10:29:24,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:29:24,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 10:29:27,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:29:27,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1622440.0, ans=0.125 2023-10-04 10:29:27,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1622440.0, ans=0.125 2023-10-04 10:29:29,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:29:33,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:29:33,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:29:35,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:29:36,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:29:36,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:29:36,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 10:29:38,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 10:29:41,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:29:42,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:29:42,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:29:42,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:29:43,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:29:43,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 10:29:43,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 10:29:43,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 10:29:45,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:29:47,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 10:29:47,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 10:29:51,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:29:53,975 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 10:29:54,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:29:55,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:29:56,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:29:58,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 10:29:58,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:29:58,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:29:58,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:29:59,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:29:59,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:30:00,809 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.08 vs. limit=15.0 2023-10-04 10:30:01,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:30:03,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:30:04,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:30:04,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:30:08,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.50 vs. limit=22.5 2023-10-04 10:30:10,219 INFO [train.py:1046] (3/4) Epoch 46, batch 4350, loss[loss=0.1925, simple_loss=0.2626, pruned_loss=0.06119, over 19720.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2333, pruned_loss=0.03603, over 4718081.65 frames. ], batch size: 388, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:30:10,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 10:30:10,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:30:16,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:30:17,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:30:20,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:30:20,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:30:24,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1622706.6666666667, ans=0.07 2023-10-04 10:30:26,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:30:28,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:30:30,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:30:30,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:30:34,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:30:37,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:30:37,888 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.91 vs. limit=22.5 2023-10-04 10:30:38,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:30:38,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1622773.3333333333, ans=0.0 2023-10-04 10:30:40,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1622773.3333333333, ans=0.0 2023-10-04 10:30:41,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 10:30:43,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:30:44,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:30:44,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1622773.3333333333, ans=0.2 2023-10-04 10:30:48,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1622773.3333333333, ans=0.1 2023-10-04 10:30:50,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:30:51,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 10:30:53,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1622840.0, ans=0.125 2023-10-04 10:30:54,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:30:55,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:31:00,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1622840.0, ans=0.2 2023-10-04 10:31:01,958 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 10:31:02,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:04,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:31:04,076 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 10:31:05,388 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 10:31:05,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:31:05,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:07,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:31:08,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:10,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:31:10,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:31:12,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 10:31:12,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:12,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:31:12,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:14,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 10:31:16,252 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 10:31:16,257 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 10:31:16,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 10:31:18,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:31:18,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:31:18,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:20,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:31:21,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 10:31:23,077 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 10:31:23,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:24,238 INFO [train.py:1046] (3/4) Epoch 46, batch 4400, loss[loss=0.1637, simple_loss=0.2376, pruned_loss=0.04494, over 23887.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2344, pruned_loss=0.03668, over 4720140.14 frames. ], batch size: 195, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:31:26,340 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.13 vs. limit=15.0 2023-10-04 10:31:27,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:31:27,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:28,568 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.967e+02 2.244e+02 2.506e+02 3.290e+02, threshold=4.488e+02, percent-clipped=0.0 2023-10-04 10:31:28,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:31:31,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 10:31:31,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 10:31:32,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 10:31:32,810 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 10:31:34,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:31:34,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:31:37,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 10:31:41,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:41,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1623040.0, ans=0.2 2023-10-04 10:31:42,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:42,472 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 10:31:45,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:45,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 10:31:45,266 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 10:31:45,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=1623040.0, ans=0.2 2023-10-04 10:31:48,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 10:31:48,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 10:31:48,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 10:31:49,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:51,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:51,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:31:52,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:31:53,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 10:31:53,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 10:31:53,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:55,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:31:55,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:31:56,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:31:58,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:31:58,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 10:31:58,402 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 10:32:02,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:32:09,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:32:10,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 10:32:14,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:32:19,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:32:21,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:32:21,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 10:32:21,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:32:21,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:32:21,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:32:23,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:32:26,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1623240.0, ans=0.0 2023-10-04 10:32:28,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 10:32:30,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 10:32:30,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 10:32:30,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:32:30,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 10:32:32,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:32:35,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:32:35,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 10:32:36,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1623306.6666666667, ans=0.1 2023-10-04 10:32:37,705 INFO [train.py:1046] (3/4) Epoch 46, batch 4450, loss[loss=0.1518, simple_loss=0.2483, pruned_loss=0.02765, over 24417.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2353, pruned_loss=0.03677, over 4726777.46 frames. ], batch size: 69, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:32:41,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:32:42,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1623306.6666666667, ans=0.0 2023-10-04 10:32:44,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:32:44,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:32:50,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:32:50,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:32:54,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:32:57,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:32:58,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:32:58,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:32:59,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 10:32:59,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:33:00,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:33:01,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:33:01,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:33:04,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:33:08,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:08,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:10,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:33:11,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:33:12,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:33:16,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 10:33:17,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 10:33:18,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 10:33:18,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:33:22,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:33:23,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 10:33:26,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:33:30,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:30,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 10:33:31,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:33:31,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:33:31,793 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:33:31,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:33:34,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:33:37,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:33:37,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 10:33:39,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:33:40,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:33:42,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:33:42,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:33:44,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:33:45,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:33:47,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 10:33:47,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1623573.3333333333, ans=0.0 2023-10-04 10:33:48,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:33:51,990 INFO [train.py:1046] (3/4) Epoch 46, batch 4500, loss[loss=0.1763, simple_loss=0.2495, pruned_loss=0.05156, over 19855.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2356, pruned_loss=0.03699, over 4718124.19 frames. ], batch size: 388, lr: 2.20e-03, grad_scale: 32.0 2023-10-04 10:33:52,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1623640.0, ans=0.125 2023-10-04 10:33:52,321 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:33:53,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:33:54,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 10:33:54,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 10:33:56,101 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 2.108e+02 2.336e+02 2.689e+02 4.416e+02, threshold=4.672e+02, percent-clipped=0.0 2023-10-04 10:33:56,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:34:00,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:34:01,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:34:03,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:34:03,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:34:03,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:34:03,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1623640.0, ans=0.5 2023-10-04 10:34:03,812 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.61 vs. limit=22.5 2023-10-04 10:34:04,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:34:16,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:34:18,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:34:19,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1623706.6666666667, ans=0.2 2023-10-04 10:34:20,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:34:20,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:34:21,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:34:28,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:34:28,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1623773.3333333333, ans=0.125 2023-10-04 10:34:30,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:34:34,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:34:36,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:34:38,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 10:34:38,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:34:39,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:34:41,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:34:43,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:34:46,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:34:46,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 10:34:46,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:34:46,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:34:51,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:34:51,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:34:55,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:34:55,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:34:55,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:34:57,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 10:34:59,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 10:34:59,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 10:35:02,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 10:35:04,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1623973.3333333333, ans=0.0 2023-10-04 10:35:05,105 INFO [train.py:1046] (3/4) Epoch 46, batch 4550, loss[loss=0.136, simple_loss=0.1998, pruned_loss=0.03604, over 22747.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.234, pruned_loss=0.03673, over 4716655.94 frames. ], batch size: 322, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:35:06,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 10:35:07,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:35:11,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:35:12,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:35:14,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:35:16,980 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.68 vs. limit=6.0 2023-10-04 10:35:19,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:35:21,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:35:23,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:35:23,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:35:23,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:25,837 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.56 vs. limit=22.5 2023-10-04 10:35:27,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:35:27,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:35:29,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:35:32,276 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 10:35:33,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 10:35:34,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:35:36,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 10:35:37,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1624106.6666666667, ans=0.125 2023-10-04 10:35:39,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 10:35:39,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:35:42,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 10:35:45,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:35:48,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:48,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:49,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:35:51,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 10:35:52,187 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.18 vs. limit=10.0 2023-10-04 10:35:53,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:35:54,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:35:54,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:35:56,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:35:57,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 10:35:58,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 10:35:58,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:35:59,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 10:36:00,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 10:36:01,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:36:03,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:03,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:36:04,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:36:04,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:36:07,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:36:07,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 10:36:10,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:36:10,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 10:36:10,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 10:36:10,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:36:10,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 10:36:13,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:36:13,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:36:15,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:36:16,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:36:16,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:36:19,473 INFO [train.py:1046] (3/4) Epoch 46, batch 4600, loss[loss=0.1665, simple_loss=0.2449, pruned_loss=0.04408, over 23374.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2328, pruned_loss=0.03652, over 4722509.97 frames. ], batch size: 93, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:36:19,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:36:21,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:36:21,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1624306.6666666667, ans=0.0 2023-10-04 10:36:24,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:25,550 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.983e+02 2.157e+02 2.520e+02 3.849e+02, threshold=4.313e+02, percent-clipped=0.0 2023-10-04 10:36:25,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:36:28,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:36:28,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:36:29,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:36:30,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1624306.6666666667, ans=0.125 2023-10-04 10:36:31,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 10:36:32,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:36:34,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:36:34,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:36:35,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:42,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 10:36:44,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:45,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1624373.3333333333, ans=0.125 2023-10-04 10:36:46,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:36:48,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:36:48,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:36:54,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 10:36:54,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:36:55,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:37:00,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:00,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:37:01,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:37:06,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 10:37:07,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:37:10,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:11,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:37:13,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:13,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 10:37:13,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:14,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 10:37:14,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:15,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:37:16,875 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:18,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:37:20,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:37:20,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 10:37:21,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 10:37:21,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 10:37:21,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:37:22,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:37:24,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:37:24,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:37:28,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1624573.3333333333, ans=0.125 2023-10-04 10:37:28,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1624573.3333333333, ans=0.0 2023-10-04 10:37:32,694 INFO [train.py:1046] (3/4) Epoch 46, batch 4650, loss[loss=0.1442, simple_loss=0.2341, pruned_loss=0.02716, over 24668.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2324, pruned_loss=0.03664, over 4718683.04 frames. ], batch size: 65, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:37:32,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1624640.0, ans=0.125 2023-10-04 10:37:34,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:37:35,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:37:36,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:36,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:37:36,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:37:36,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:37:37,506 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.85 vs. limit=15.0 2023-10-04 10:37:38,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:37:39,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 10:37:46,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:37:46,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1624706.6666666667, ans=0.125 2023-10-04 10:37:48,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 10:37:48,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:37:50,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 10:37:51,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:37:51,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 10:37:51,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 10:37:51,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:37:52,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:37:55,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:37:56,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:37:58,131 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 10:38:00,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:38:00,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 10:38:03,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:38:03,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:38:05,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 10:38:06,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:38:09,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:38:12,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:38:12,607 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.67 vs. limit=6.0 2023-10-04 10:38:13,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1624773.3333333333, ans=0.0 2023-10-04 10:38:13,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1624773.3333333333, ans=0.0 2023-10-04 10:38:20,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:38:22,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:38:22,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:38:22,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:38:25,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 10:38:25,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 10:38:26,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 10:38:26,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 10:38:26,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:38:29,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1624840.0, ans=0.2 2023-10-04 10:38:31,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1624906.6666666667, ans=0.0 2023-10-04 10:38:33,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1624906.6666666667, ans=0.125 2023-10-04 10:38:34,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:38:34,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:38:34,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 10:38:34,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:38:37,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:38:37,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:38:37,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:38:39,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1624906.6666666667, ans=0.125 2023-10-04 10:38:41,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:38:41,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:38:41,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:38:43,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:38:44,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:38:44,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:38:44,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 10:38:46,697 INFO [train.py:1046] (3/4) Epoch 46, batch 4700, loss[loss=0.1502, simple_loss=0.2408, pruned_loss=0.02984, over 24610.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2336, pruned_loss=0.0368, over 4719724.18 frames. ], batch size: 68, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:38:48,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:38:50,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 10:38:53,569 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.144e+02 2.474e+02 2.959e+02 4.299e+02, threshold=4.947e+02, percent-clipped=0.0 2023-10-04 10:38:57,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:38:59,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:38:59,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:39:02,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:39:03,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 10:39:06,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 10:39:06,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 10:39:06,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1625040.0, ans=0.1 2023-10-04 10:39:09,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:39:09,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:39:10,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:39:13,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:39:18,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:39:20,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 10:39:23,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:39:27,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 10:39:29,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:39:29,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1625106.6666666667, ans=0.0 2023-10-04 10:39:30,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:34,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 10:39:34,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:39:38,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:39:38,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 10:39:40,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:40,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:39:40,678 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:39:44,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:39:44,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:39:46,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 10:39:47,499 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 10:39:49,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:39:49,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1625240.0, ans=0.125 2023-10-04 10:39:51,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:51,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:51,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 10:39:51,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:39:56,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 10:40:00,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:40:00,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:00,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1625306.6666666667, ans=0.2 2023-10-04 10:40:01,523 INFO [train.py:1046] (3/4) Epoch 46, batch 4750, loss[loss=0.1658, simple_loss=0.2403, pruned_loss=0.04559, over 23354.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2349, pruned_loss=0.0371, over 4706484.85 frames. ], batch size: 119, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:40:05,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:05,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:40:07,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 10:40:07,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:40:10,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 10:40:11,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:40:12,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:40:13,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:40:17,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 10:40:22,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:40:24,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 10:40:24,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:40:28,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:40:28,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:40:28,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:29,802 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 10:40:29,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 10:40:36,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 10:40:39,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:40:40,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:40:43,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:40:43,637 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 10:40:43,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:40:45,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1625506.6666666667, ans=0.0 2023-10-04 10:40:46,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:40:46,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1625506.6666666667, ans=0.125 2023-10-04 10:40:49,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:40:52,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 10:40:52,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 10:40:53,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:40:53,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:40:53,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:40:55,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1625506.6666666667, ans=0.0 2023-10-04 10:40:57,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:40:57,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 10:40:59,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 10:41:03,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:04,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:41:04,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 10:41:05,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:41:07,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:08,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:41:10,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:10,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 10:41:12,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:41:12,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 10:41:14,306 INFO [train.py:1046] (3/4) Epoch 46, batch 4800, loss[loss=0.1411, simple_loss=0.226, pruned_loss=0.02816, over 24661.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2355, pruned_loss=0.03703, over 4714061.78 frames. ], batch size: 65, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:41:14,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 10:41:14,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 10:41:15,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:41:17,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:41:18,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 10:41:21,364 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.989e+02 2.249e+02 2.577e+02 3.690e+02, threshold=4.497e+02, percent-clipped=0.0 2023-10-04 10:41:22,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:23,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:28,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:41:29,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:41:29,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:31,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 10:41:31,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:41:32,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:41:34,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:41:38,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:41:38,323 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:41:38,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1625706.6666666667, ans=0.125 2023-10-04 10:41:40,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:40,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:41:42,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:42,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 10:41:42,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:43,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:41:43,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1625773.3333333333, ans=0.125 2023-10-04 10:41:45,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:41:48,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:49,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:41:49,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:41:51,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 10:41:52,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:53,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1625773.3333333333, ans=0.1 2023-10-04 10:41:55,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 10:41:55,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 10:41:55,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:41:57,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:41:57,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:41:57,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:41:57,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:41:58,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:42:00,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:42:00,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1625840.0, ans=0.1 2023-10-04 10:42:00,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1625840.0, ans=0.125 2023-10-04 10:42:05,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:42:07,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:09,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:12,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 10:42:12,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:42:13,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:13,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:42:13,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1625906.6666666667, ans=0.125 2023-10-04 10:42:13,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1625906.6666666667, ans=0.125 2023-10-04 10:42:15,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:42:16,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1625906.6666666667, ans=0.09899494936611666 2023-10-04 10:42:19,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:42:20,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:42:20,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:20,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:42:22,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:42:22,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:42:25,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:25,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:25,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:42:25,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1625906.6666666667, ans=0.2 2023-10-04 10:42:26,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 10:42:28,551 INFO [train.py:1046] (3/4) Epoch 46, batch 4850, loss[loss=0.2043, simple_loss=0.2722, pruned_loss=0.06826, over 19903.00 frames. ], tot_loss[loss=0.1557, simple_loss=0.2362, pruned_loss=0.03761, over 4709469.17 frames. ], batch size: 388, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:42:29,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 10:42:29,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:42:29,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:42:30,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:42:30,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:30,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1625973.3333333333, ans=0.05 2023-10-04 10:42:32,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.96 vs. limit=6.0 2023-10-04 10:42:35,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:42:40,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 10:42:42,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:47,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:42:47,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 10:42:48,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:42:52,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:42:53,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:42:55,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:42:55,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 10:42:58,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:43:00,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:43:01,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 10:43:01,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:43:01,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 10:43:06,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:43:06,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:10,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:10,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 10:43:12,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 10:43:12,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:43:17,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:43:18,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 10:43:19,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:43:19,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:43:20,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:43:22,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 10:43:22,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:24,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 10:43:24,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:43:26,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:43:26,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 10:43:30,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=1626240.0, ans=0.2 2023-10-04 10:43:34,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:43:39,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:43:39,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:43:40,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1626240.0, ans=0.015 2023-10-04 10:43:43,385 INFO [train.py:1046] (3/4) Epoch 46, batch 4900, loss[loss=0.1297, simple_loss=0.2112, pruned_loss=0.02411, over 24565.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2349, pruned_loss=0.03737, over 4702957.97 frames. ], batch size: 60, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:43:46,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 10:43:46,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:43:50,262 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.004e+02 2.251e+02 2.578e+02 5.064e+02, threshold=4.503e+02, percent-clipped=3.0 2023-10-04 10:43:50,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:43:51,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:43:52,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:43:52,945 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.41 vs. limit=22.5 2023-10-04 10:43:56,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 10:44:01,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 10:44:02,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=1626373.3333333333, ans=0.1 2023-10-04 10:44:02,781 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:44:04,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 10:44:05,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 10:44:07,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:44:07,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:44:07,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:44:07,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:44:07,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:44:07,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 10:44:11,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 10:44:11,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:44:13,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:44:14,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:44:14,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:44:16,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:44:17,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:44:17,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 10:44:18,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:44:18,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:44:20,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 10:44:20,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 10:44:22,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1626440.0, ans=0.125 2023-10-04 10:44:24,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 10:44:26,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:44:27,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:44:27,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:44:27,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:44:29,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 10:44:29,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:44:29,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 10:44:29,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1626506.6666666667, ans=0.125 2023-10-04 10:44:32,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:44:35,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 10:44:36,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:44:38,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1626506.6666666667, ans=0.125 2023-10-04 10:44:39,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 10:44:39,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:44:40,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 10:44:42,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 10:44:47,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:44:48,762 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.75 vs. limit=15.0 2023-10-04 10:44:49,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:44:49,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 10:44:50,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:44:50,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:44:51,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:44:52,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1626573.3333333333, ans=0.125 2023-10-04 10:44:54,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1626573.3333333333, ans=0.0 2023-10-04 10:44:55,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:44:55,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:44:56,687 INFO [train.py:1046] (3/4) Epoch 46, batch 4950, loss[loss=0.1393, simple_loss=0.2183, pruned_loss=0.03011, over 22119.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.234, pruned_loss=0.0367, over 4716794.02 frames. ], batch size: 48, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:44:56,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:44:56,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 10:44:58,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:45:00,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:45:00,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 10:45:05,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 10:45:06,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 10:45:06,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:45:08,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 10:45:09,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:09,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:45:09,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:45:10,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:12,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:45:12,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:45:13,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.51 vs. limit=15.0 2023-10-04 10:45:14,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:45:15,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:45:17,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:17,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:45:21,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 10:45:21,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1626706.6666666667, ans=0.125 2023-10-04 10:45:24,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:25,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:45:28,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1626773.3333333333, ans=0.2 2023-10-04 10:45:29,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:30,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:30,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:45:32,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 10:45:32,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 10:45:34,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:35,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:45:35,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:45:35,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:45:35,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:45:37,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 10:45:39,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:45:42,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:45:44,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:45:47,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:45:47,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:48,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 10:45:48,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:45:49,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:45:50,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1626840.0, ans=0.0 2023-10-04 10:45:52,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:45:53,217 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:45:54,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:45:54,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:45:54,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:45:56,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:45:57,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:45:57,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:45:58,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:46:00,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:46:00,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 10:46:05,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:46:08,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1626906.6666666667, ans=0.125 2023-10-04 10:46:11,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 10:46:11,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 10:46:14,447 INFO [train.py:1046] (3/4) Epoch 46, batch 5000, loss[loss=0.1621, simple_loss=0.2539, pruned_loss=0.03516, over 24461.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2333, pruned_loss=0.03634, over 4725911.81 frames. ], batch size: 69, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:46:17,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:46:17,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:46:18,265 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.04 vs. limit=15.0 2023-10-04 10:46:18,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 10:46:18,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 10:46:21,352 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.198e+02 2.645e+02 3.298e+02 6.003e+02, threshold=5.290e+02, percent-clipped=6.0 2023-10-04 10:46:21,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:46:22,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 10:46:22,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:46:24,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:46:24,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 10:46:24,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:46:26,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:46:27,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 10:46:27,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:46:27,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:46:28,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 10:46:30,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 10:46:31,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:46:31,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 10:46:31,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:46:32,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:32,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:46:32,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 10:46:32,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 10:46:34,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 10:46:34,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:46:36,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:37,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 10:46:37,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:46:40,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:40,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1627040.0, ans=0.2 2023-10-04 10:46:41,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:46:41,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 10:46:43,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 10:46:44,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:46:46,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:46:48,921 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 10:46:52,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:46:54,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:46:54,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:46:54,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1627106.6666666667, ans=0.2 2023-10-04 10:46:57,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 10:46:57,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:46:57,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:46:58,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1627173.3333333333, ans=0.0 2023-10-04 10:46:59,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:46:59,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1627173.3333333333, ans=0.125 2023-10-04 10:47:01,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 10:47:01,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:47:02,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1627173.3333333333, ans=0.95 2023-10-04 10:47:05,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:47:07,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:47:07,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1627173.3333333333, ans=0.125 2023-10-04 10:47:11,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 10:47:14,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:18,202 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.77 vs. limit=22.5 2023-10-04 10:47:21,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:47:23,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:23,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:47:23,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:47:24,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:47:24,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:47:24,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:27,709 INFO [train.py:1046] (3/4) Epoch 46, batch 5050, loss[loss=0.1574, simple_loss=0.2363, pruned_loss=0.03921, over 22857.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2339, pruned_loss=0.03664, over 4719136.69 frames. ], batch size: 322, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:47:27,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:47:27,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 10:47:29,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:47:30,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:47:31,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:47:32,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 10:47:33,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:47:35,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:47:36,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1627306.6666666667, ans=0.1 2023-10-04 10:47:38,315 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.12 vs. limit=15.0 2023-10-04 10:47:39,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 10:47:40,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:47:41,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:47:43,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1627373.3333333333, ans=0.015 2023-10-04 10:47:49,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 10:47:51,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 10:47:51,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:47:52,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 10:47:52,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:47:52,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1627373.3333333333, ans=0.125 2023-10-04 10:47:53,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:47:53,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:47:55,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:47:55,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 10:47:55,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1627440.0, ans=0.125 2023-10-04 10:47:57,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 10:47:58,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:47:59,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:48:03,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:48:03,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 10:48:05,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:48:05,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1627440.0, ans=0.0 2023-10-04 10:48:09,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 10:48:09,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:48:11,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:48:11,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:48:11,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:48:12,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:48:14,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:48:15,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:15,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:48:16,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:48:16,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 10:48:18,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:48:20,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:48:21,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1627506.6666666667, ans=0.125 2023-10-04 10:48:22,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:48:22,935 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 10:48:22,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:48:25,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:48:27,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:27,124 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 10:48:29,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:48:29,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 10:48:29,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:29,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1627573.3333333333, ans=0.1 2023-10-04 10:48:33,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:48:34,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:48:34,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 10:48:36,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 10:48:38,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:48:38,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:48:39,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 10:48:41,339 INFO [train.py:1046] (3/4) Epoch 46, batch 5100, loss[loss=0.1606, simple_loss=0.2388, pruned_loss=0.04119, over 23727.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03668, over 4723952.92 frames. ], batch size: 149, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:48:41,471 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 10:48:43,400 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.76 vs. limit=15.0 2023-10-04 10:48:44,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:48:45,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 10:48:46,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 10:48:48,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:48:49,336 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.978e+02 2.173e+02 2.426e+02 3.698e+02, threshold=4.345e+02, percent-clipped=0.0 2023-10-04 10:48:49,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:48:51,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:48:52,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 10:48:52,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 10:48:59,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:48:59,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:49:04,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:49:05,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 10:49:05,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:49:07,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:49:07,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 10:49:09,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:11,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:11,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 10:49:12,607 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 10:49:12,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1627773.3333333333, ans=0.09899494936611666 2023-10-04 10:49:13,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:13,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 10:49:15,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 10:49:16,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:49:24,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:49:27,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 10:49:27,512 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 10:49:27,522 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 10:49:28,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 10:49:28,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:49:31,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 10:49:35,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 10:49:39,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 10:49:40,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 10:49:42,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 10:49:43,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 10:49:43,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 10:49:45,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.82 vs. limit=10.0 2023-10-04 10:49:46,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1627906.6666666667, ans=0.0 2023-10-04 10:49:49,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:49:49,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:49:49,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:49:49,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:49:49,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 10:49:49,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1627906.6666666667, ans=0.0 2023-10-04 10:49:51,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:49:52,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 10:49:52,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 10:49:54,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 10:49:54,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:49:54,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 10:49:55,480 INFO [train.py:1046] (3/4) Epoch 46, batch 5150, loss[loss=0.149, simple_loss=0.2347, pruned_loss=0.03166, over 24459.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2356, pruned_loss=0.03704, over 4716789.20 frames. ], batch size: 66, lr: 2.20e-03, grad_scale: 8.0 2023-10-04 10:49:56,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:49:57,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 10:49:59,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:50:01,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:50:07,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:50:07,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 10:50:09,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:50:10,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 10:50:11,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 10:50:11,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:50:13,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:50:13,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:50:13,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:50:13,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 10:50:13,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1628040.0, ans=0.0 2023-10-04 10:50:15,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:50:16,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:50:17,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 10:50:19,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 10:50:19,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=1628040.0, ans=0.5 2023-10-04 10:50:20,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:50:24,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1628106.6666666667, ans=0.0 2023-10-04 10:50:25,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:50:28,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 10:50:32,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:50:38,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:50:40,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:50:40,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1628173.3333333333, ans=0.0 2023-10-04 10:50:43,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:50:43,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:50:45,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 10:50:49,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:50:50,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 10:50:50,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 10:50:53,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:50:53,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:50:54,467 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.13 vs. limit=6.0 2023-10-04 10:50:55,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 10:50:56,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1628240.0, ans=0.0 2023-10-04 10:50:58,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:50:59,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:51:01,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:51:01,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:51:03,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:51:04,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:51:04,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 10:51:04,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:51:07,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:51:10,510 INFO [train.py:1046] (3/4) Epoch 46, batch 5200, loss[loss=0.1466, simple_loss=0.2207, pruned_loss=0.03625, over 23801.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2361, pruned_loss=0.03725, over 4725051.08 frames. ], batch size: 164, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:51:10,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:51:13,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:17,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 10:51:19,045 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.713e+02 2.096e+02 2.291e+02 2.477e+02 4.065e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-04 10:51:19,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:51:19,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:51:21,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:23,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 10:51:23,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:51:24,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 10:51:26,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 10:51:27,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:51:30,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 10:51:30,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 10:51:30,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1628373.3333333333, ans=0.125 2023-10-04 10:51:32,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 10:51:34,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 10:51:34,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 10:51:36,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 10:51:38,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:51:38,123 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 10:51:38,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:51:39,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:51:41,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:51:42,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 10:51:43,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:51:45,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:47,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1628440.0, ans=0.0 2023-10-04 10:51:48,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 10:51:48,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 10:51:48,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 10:51:54,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 10:51:54,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:51:57,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:51:57,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:51:58,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 10:51:58,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:51:59,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 10:51:59,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:00,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:52:03,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:52:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:52:05,391 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.94 vs. limit=15.0 2023-10-04 10:52:07,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:52:08,279 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.20 vs. limit=15.0 2023-10-04 10:52:08,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:52:08,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:12,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1628573.3333333333, ans=0.05 2023-10-04 10:52:14,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:52:15,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 10:52:15,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1628573.3333333333, ans=0.125 2023-10-04 10:52:16,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 10:52:16,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:52:18,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:19,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 10:52:20,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:52:23,575 INFO [train.py:1046] (3/4) Epoch 46, batch 5250, loss[loss=0.1478, simple_loss=0.2342, pruned_loss=0.03074, over 24459.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2346, pruned_loss=0.03701, over 4701397.22 frames. ], batch size: 66, lr: 2.20e-03, grad_scale: 16.0 2023-10-04 10:52:23,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:52:27,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:52:27,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:52:29,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:52:33,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:52:37,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:52:39,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:52:40,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:52:42,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 10:52:42,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:52:44,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:52:51,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1628773.3333333333, ans=0.5 2023-10-04 10:52:56,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1628773.3333333333, ans=0.1 2023-10-04 10:52:58,135 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:53:32,597 INFO [train.py:1046] (3/4) Epoch 46, batch 5300, loss[loss=0.1535, simple_loss=0.2235, pruned_loss=0.04169, over 23765.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2338, pruned_loss=0.03685, over 4707907.55 frames. ], batch size: 179, lr: 2.19e-03, grad_scale: 16.0 2023-10-04 10:53:40,497 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 2.050e+02 2.264e+02 2.625e+02 3.522e+02, threshold=4.529e+02, percent-clipped=0.0 2023-10-04 10:53:46,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:53:46,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 10:53:46,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 10:53:46,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:53:47,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:47,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:47,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:47,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:53:47,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:53:47,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:53:47,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 10:53:47,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:53:47,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 10:53:47,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 10:53:47,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 10:53:47,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 10:53:47,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 10:53:47,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 10:53:48,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:48,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:53:48,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:53:48,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:53:48,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:53:49,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:53:49,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:53:49,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:49,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:53:49,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:53:49,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:53:49,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:49,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:53:49,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 10:53:49,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:53:50,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:53:50,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 10:53:50,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 10:53:50,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 10:53:50,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:53:50,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 10:53:51,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 10:53:51,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:53:51,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:53:51,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:53:51,699 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 10:53:51,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 10:53:51,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:53:51,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:53:51,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 10:53:51,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 10:53:52,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 10:53:52,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 10:53:58,617 INFO [train.py:1046] (3/4) Epoch 47, batch 0, loss[loss=0.1566, simple_loss=0.2315, pruned_loss=0.04087, over 22773.00 frames. ], tot_loss[loss=0.1566, simple_loss=0.2315, pruned_loss=0.04087, over 22773.00 frames. ], batch size: 322, lr: 2.17e-03, grad_scale: 32.0 2023-10-04 10:53:58,617 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 10:54:08,995 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.1.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([4.4986, 4.2083, 3.8945, 4.0549], device='cuda:3') 2023-10-04 10:54:11,063 INFO [train.py:1078] (3/4) Epoch 47, validation: loss=0.3566, simple_loss=0.2776, pruned_loss=0.2178, over 1125622.00 frames. 2023-10-04 10:54:11,063 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 10:54:12,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 10:54:12,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:54:14,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:54:14,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1629053.3333333333, ans=0.125 2023-10-04 10:54:15,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1629053.3333333333, ans=0.125 2023-10-04 10:54:19,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:54:19,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:54:21,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:21,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 10:54:23,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 10:54:25,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:25,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:26,233 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.04 vs. limit=15.0 2023-10-04 10:54:29,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:54:29,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1629120.0, ans=0.0 2023-10-04 10:54:30,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:54:30,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 10:54:30,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:54:33,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 10:54:35,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:54:42,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 10:54:42,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:54:46,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 10:54:49,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 10:54:49,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:54:52,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:54:56,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:54:59,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:00,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1629253.3333333333, ans=0.125 2023-10-04 10:55:01,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1629253.3333333333, ans=0.1 2023-10-04 10:55:03,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 10:55:08,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 10:55:09,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:55:09,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:55:11,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 10:55:11,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:55:11,764 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.73 vs. limit=10.0 2023-10-04 10:55:12,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 10:55:15,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:55:16,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:55:19,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:55:24,365 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 10:55:24,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 10:55:25,783 INFO [train.py:1046] (3/4) Epoch 47, batch 50, loss[loss=0.1441, simple_loss=0.2341, pruned_loss=0.027, over 24476.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2361, pruned_loss=0.03621, over 1069382.18 frames. ], batch size: 66, lr: 2.17e-03, grad_scale: 32.0 2023-10-04 10:55:27,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:55:30,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:55:30,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 10:55:30,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 10:55:30,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:55:32,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:55:34,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:55:38,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:55:41,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 10:55:41,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:47,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:55:47,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 10:55:50,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 10:55:50,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1629453.3333333333, ans=0.125 2023-10-04 10:55:51,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 10:55:53,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:55:53,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:55,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:55:55,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1629520.0, ans=0.0 2023-10-04 10:55:56,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 10:55:56,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 10:55:56,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:55:58,858 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.09 vs. limit=6.0 2023-10-04 10:56:02,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:56:02,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:56:03,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 10:56:03,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 10:56:05,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 10:56:06,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:56:06,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 10:56:07,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:56:11,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 10:56:11,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1629586.6666666667, ans=0.125 2023-10-04 10:56:17,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:56:17,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:56:18,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:56:19,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:56:19,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:56:21,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 10:56:23,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 10:56:24,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:56:24,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 10:56:27,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:56:28,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:56:28,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 10:56:28,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 10:56:30,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 10:56:31,480 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 2.113e+02 2.383e+02 2.822e+02 6.328e+02, threshold=4.766e+02, percent-clipped=7.0 2023-10-04 10:56:31,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:56:31,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:56:31,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 10:56:31,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 10:56:33,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:56:33,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:56:35,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 10:56:35,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:56:38,364 INFO [train.py:1046] (3/4) Epoch 47, batch 100, loss[loss=0.157, simple_loss=0.2345, pruned_loss=0.03972, over 23541.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2369, pruned_loss=0.03678, over 1877893.06 frames. ], batch size: 256, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 10:56:40,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:56:42,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:56:47,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:56:48,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 10:56:48,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:56:53,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 10:56:53,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:56:54,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 10:56:54,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:56:54,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 10:56:55,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 10:56:59,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 10:56:59,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:57:00,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:00,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:57:04,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 10:57:04,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:57:06,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:07,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 10:57:09,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 10:57:13,170 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 10:57:13,184 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 10:57:13,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1629853.3333333333, ans=0.0 2023-10-04 10:57:14,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:14,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 10:57:17,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 10:57:19,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:57:20,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:25,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:27,234 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 10:57:28,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 10:57:30,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1629920.0, ans=0.04949747468305833 2023-10-04 10:57:31,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:57:33,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:57:35,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:37,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1629986.6666666667, ans=0.0 2023-10-04 10:57:38,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:57:40,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:57:41,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:57:44,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:45,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:47,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:57:47,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:57:47,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:57:49,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 10:57:49,291 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 10:57:49,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:57:49,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1629986.6666666667, ans=0.125 2023-10-04 10:57:50,106 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.29 vs. limit=6.0 2023-10-04 10:57:50,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:57:50,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:57:50,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:50,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 10:57:50,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:57:50,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 10:57:50,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:57:52,187 INFO [train.py:1046] (3/4) Epoch 47, batch 150, loss[loss=0.1451, simple_loss=0.2319, pruned_loss=0.02917, over 24488.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2366, pruned_loss=0.03649, over 2520996.28 frames. ], batch size: 63, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 10:57:52,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:57:53,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:53,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:57:53,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1630053.3333333333, ans=0.125 2023-10-04 10:57:55,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:57:57,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:57:57,867 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 10:57:58,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 10:57:58,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:57:58,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:03,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:58:04,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:08,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 10:58:08,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:11,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 10:58:11,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 10:58:11,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 10:58:14,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 10:58:14,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 10:58:16,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 10:58:17,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:58:17,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:58:17,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:19,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:58:20,510 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 10:58:23,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:58:23,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1630186.6666666667, ans=0.125 2023-10-04 10:58:29,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:58:33,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 10:58:33,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 10:58:38,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 10:58:38,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:58:38,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:58:39,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1630253.3333333333, ans=0.0 2023-10-04 10:58:41,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1630253.3333333333, ans=0.1 2023-10-04 10:58:42,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 10:58:44,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 10:58:45,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 10:58:45,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:58:46,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 10:58:51,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:58:53,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:58:53,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 10:58:53,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 10:58:54,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:58:57,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 10:58:58,420 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.983e+02 2.136e+02 2.354e+02 3.118e+02, threshold=4.272e+02, percent-clipped=0.0 2023-10-04 10:58:58,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 10:59:01,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 10:59:03,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:59:04,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 10:59:04,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 10:59:04,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 10:59:04,830 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 10:59:06,081 INFO [train.py:1046] (3/4) Epoch 47, batch 200, loss[loss=0.1542, simple_loss=0.2432, pruned_loss=0.03256, over 24279.00 frames. ], tot_loss[loss=0.1563, simple_loss=0.2384, pruned_loss=0.03712, over 3016234.99 frames. ], batch size: 74, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 10:59:09,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:59:12,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 10:59:12,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 10:59:14,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 10:59:16,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:59:16,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:19,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 10:59:20,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 10:59:22,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:22,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:59:25,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 10:59:25,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 10:59:25,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:36,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1630520.0, ans=0.0 2023-10-04 10:59:42,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1630520.0, ans=0.125 2023-10-04 10:59:44,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 10:59:44,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 10:59:46,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 10:59:47,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 10:59:47,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 10:59:47,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 10:59:50,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 10:59:50,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 10:59:51,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 10:59:52,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 10:59:53,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 10:59:54,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 10:59:54,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 10:59:57,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:00:02,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1630586.6666666667, ans=0.1 2023-10-04 11:00:03,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:00:10,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:12,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:00:18,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:19,538 INFO [train.py:1046] (3/4) Epoch 47, batch 250, loss[loss=0.1577, simple_loss=0.2388, pruned_loss=0.03829, over 24311.00 frames. ], tot_loss[loss=0.1555, simple_loss=0.2368, pruned_loss=0.03713, over 3391399.85 frames. ], batch size: 61, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:00:21,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 11:00:21,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:00:21,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:00:21,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:00:21,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:00:22,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 11:00:24,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:00:24,397 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 11:00:25,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:26,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1630720.0, ans=0.125 2023-10-04 11:00:28,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:00:29,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:29,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:00:31,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:00:31,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:00:32,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1630720.0, ans=0.1 2023-10-04 11:00:33,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:00:35,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:00:35,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1630786.6666666667, ans=0.2 2023-10-04 11:00:38,142 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:00:46,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:00:49,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:00:49,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:00:51,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1630853.3333333333, ans=0.125 2023-10-04 11:00:56,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:00:57,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:00:57,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:00:59,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:01:00,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:01:00,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:01:00,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:01:02,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:01:05,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 11:01:05,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:01:05,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1630920.0, ans=0.1 2023-10-04 11:01:05,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1630920.0, ans=0.125 2023-10-04 11:01:07,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:01:07,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:01:07,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:01:08,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:01:10,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:01:10,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:01:11,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:01:13,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:01:14,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:01:16,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:01:20,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1630986.6666666667, ans=0.025 2023-10-04 11:01:21,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:01:25,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:01:28,128 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.764e+02 2.109e+02 2.410e+02 2.789e+02 4.244e+02, threshold=4.821e+02, percent-clipped=0.0 2023-10-04 11:01:28,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1630986.6666666667, ans=0.0 2023-10-04 11:01:29,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:01:30,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:01:33,983 INFO [train.py:1046] (3/4) Epoch 47, batch 300, loss[loss=0.1615, simple_loss=0.2468, pruned_loss=0.03809, over 23327.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2352, pruned_loss=0.03692, over 3666698.87 frames. ], batch size: 105, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:01:34,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 11:01:34,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:01:34,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:01:35,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 11:01:35,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:01:37,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:01:37,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 11:01:42,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1631053.3333333333, ans=0.0 2023-10-04 11:01:42,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:01:43,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:01:45,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:01:45,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 11:01:49,017 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:01:50,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:01:51,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 11:01:51,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:01:52,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1631120.0, ans=0.0 2023-10-04 11:01:54,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:01:59,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:02:01,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 11:02:03,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 11:02:03,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:06,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:02:06,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1631186.6666666667, ans=0.0 2023-10-04 11:02:08,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:08,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 11:02:08,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:02:11,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:02:14,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:02:14,048 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:02:16,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:02:16,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 11:02:18,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:02:21,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:21,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1631253.3333333333, ans=0.1 2023-10-04 11:02:22,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 11:02:24,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:02:27,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:02:29,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:02:29,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 11:02:34,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:34,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:02:35,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=1631320.0, ans=0.035 2023-10-04 11:02:38,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:39,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:02:39,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 11:02:39,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:02:40,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:02:42,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 11:02:42,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:02:43,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:02:43,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:02:45,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:02:46,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:02:48,375 INFO [train.py:1046] (3/4) Epoch 47, batch 350, loss[loss=0.1411, simple_loss=0.2326, pruned_loss=0.02476, over 24638.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2339, pruned_loss=0.03604, over 3901520.62 frames. ], batch size: 68, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:02:48,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:02:48,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 11:02:51,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:02:57,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:03:02,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:02,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:04,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 11:03:06,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:03:06,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 11:03:08,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:09,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=1631453.3333333333, ans=22.5 2023-10-04 11:03:09,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 11:03:09,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:03:12,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 11:03:13,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:03:15,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:03:15,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:03:16,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:16,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:18,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:03:18,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:18,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:03:21,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:03:21,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:27,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:03:27,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:03:29,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:03:29,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:33,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 11:03:33,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:03:36,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1631586.6666666667, ans=0.1 2023-10-04 11:03:39,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:03:39,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:03:39,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:03:40,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 11:03:43,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:03:43,728 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 11:03:43,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 11:03:45,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:46,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:03:48,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 11:03:48,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:03:52,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:03:53,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:03:53,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:03:53,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:03:54,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:03:58,117 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.078e+02 2.298e+02 2.651e+02 3.732e+02, threshold=4.596e+02, percent-clipped=0.0 2023-10-04 11:03:58,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:04:00,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:04:01,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1631653.3333333333, ans=0.1 2023-10-04 11:04:02,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 11:04:02,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:04:03,541 INFO [train.py:1046] (3/4) Epoch 47, batch 400, loss[loss=0.1521, simple_loss=0.2309, pruned_loss=0.03662, over 23640.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2329, pruned_loss=0.03625, over 4072554.98 frames. ], batch size: 149, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:04:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:04:04,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:04:06,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:08,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:04:09,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:12,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 11:04:14,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 11:04:14,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:04:15,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 11:04:16,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:21,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:04:21,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:04:21,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 11:04:21,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:04:23,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:04:23,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:04:23,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:04:24,839 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 11:04:24,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 11:04:25,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1631786.6666666667, ans=0.0 2023-10-04 11:04:29,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1631786.6666666667, ans=0.125 2023-10-04 11:04:30,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:04:32,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:04:32,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 11:04:33,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 11:04:36,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:04:39,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=12.0 2023-10-04 11:04:39,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:04:44,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 11:04:44,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.85 vs. limit=15.0 2023-10-04 11:04:46,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:04:48,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 11:04:51,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:04:51,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:04:53,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 11:04:53,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1631920.0, ans=0.0 2023-10-04 11:04:57,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:04:59,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 11:05:01,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:05:01,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1631986.6666666667, ans=0.125 2023-10-04 11:05:04,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:04,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 11:05:07,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 11:05:07,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 11:05:09,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1631986.6666666667, ans=0.2 2023-10-04 11:05:10,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:05:10,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:05:12,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 11:05:13,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:05:14,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:05:14,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:05:17,428 INFO [train.py:1046] (3/4) Epoch 47, batch 450, loss[loss=0.1972, simple_loss=0.267, pruned_loss=0.06374, over 19624.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2336, pruned_loss=0.03639, over 4220142.07 frames. ], batch size: 388, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:05:17,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 11:05:17,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:05:17,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:05:18,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:05:18,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 11:05:19,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:05:20,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:05:21,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:05:31,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:32,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:05:34,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 11:05:35,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 11:05:37,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:05:37,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1632120.0, ans=0.125 2023-10-04 11:05:40,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:41,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:05:44,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:05:45,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:05:47,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 11:05:49,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 11:05:49,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 11:05:50,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:05:50,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:05:52,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:05:53,940 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 11:05:53,950 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 11:05:53,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:05:56,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:05:56,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 11:05:58,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:05:58,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:06:00,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 11:06:01,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 11:06:03,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:06:04,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:06:04,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:06:06,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 11:06:07,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1632253.3333333333, ans=0.0 2023-10-04 11:06:11,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:06:12,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 11:06:13,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 11:06:14,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:06:18,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:06:20,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1632320.0, ans=0.0 2023-10-04 11:06:20,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1632320.0, ans=0.125 2023-10-04 11:06:21,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:06:23,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:06:23,126 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 11:06:26,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1632320.0, ans=0.125 2023-10-04 11:06:27,508 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.974e+02 2.147e+02 2.474e+02 4.734e+02, threshold=4.294e+02, percent-clipped=1.0 2023-10-04 11:06:27,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:06:28,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:06:29,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1632320.0, ans=0.0 2023-10-04 11:06:30,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:06:30,667 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 11:06:32,020 INFO [train.py:1046] (3/4) Epoch 47, batch 500, loss[loss=0.1593, simple_loss=0.2357, pruned_loss=0.04146, over 23824.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2348, pruned_loss=0.03674, over 4324211.92 frames. ], batch size: 195, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:06:32,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 11:06:32,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:06:33,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 11:06:35,625 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=4.72 vs. limit=15.0 2023-10-04 11:06:37,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 11:06:39,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:06:41,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:06:41,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:06:42,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:06:52,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:06:53,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:06:53,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 11:06:55,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:06:55,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 11:06:55,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:06:58,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:06:58,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:06:58,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:06:58,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:06:59,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 11:07:04,515 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 11:07:07,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:07:07,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:07,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1632520.0, ans=0.125 2023-10-04 11:07:08,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:08,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:10,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:07:12,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 11:07:13,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1632520.0, ans=0.0 2023-10-04 11:07:14,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:07:14,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:19,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1632586.6666666667, ans=0.0 2023-10-04 11:07:20,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:07:20,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1632586.6666666667, ans=0.0 2023-10-04 11:07:21,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:07:25,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1632586.6666666667, ans=0.2 2023-10-04 11:07:27,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:07:30,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 11:07:30,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:30,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:07:34,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 11:07:34,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:07:35,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:40,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 11:07:40,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1632653.3333333333, ans=0.1 2023-10-04 11:07:41,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 11:07:41,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:07:41,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 11:07:42,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:07:42,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:07:43,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:07:44,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:07:44,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:07:45,988 INFO [train.py:1046] (3/4) Epoch 47, batch 550, loss[loss=0.1664, simple_loss=0.2444, pruned_loss=0.04422, over 23598.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2357, pruned_loss=0.03719, over 4405271.89 frames. ], batch size: 256, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:07:46,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:07:48,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:07:48,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 11:07:50,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:07:56,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:07:56,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:07:56,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1632720.0, ans=0.125 2023-10-04 11:07:56,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1632720.0, ans=0.2 2023-10-04 11:07:58,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:08:00,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:08:04,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 11:08:06,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 11:08:09,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:08:15,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:08:15,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:08:17,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:08:19,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:19,953 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 11:08:20,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:08:21,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 11:08:24,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:08:25,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:08:25,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:08:26,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:28,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 11:08:30,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 11:08:30,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:08:30,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:08:31,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:08:31,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:08:34,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:08:35,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:08:38,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:08:38,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:40,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 11:08:40,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:08:42,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:08:44,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:08:44,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:08:46,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:08:46,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 11:08:52,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 11:08:55,007 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.110e+02 2.356e+02 2.673e+02 4.561e+02, threshold=4.712e+02, percent-clipped=1.0 2023-10-04 11:08:55,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 11:08:56,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:08:57,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:08:57,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:08:59,190 INFO [train.py:1046] (3/4) Epoch 47, batch 600, loss[loss=0.149, simple_loss=0.2173, pruned_loss=0.04036, over 23832.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2355, pruned_loss=0.03676, over 4485511.25 frames. ], batch size: 179, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:09:04,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:09:07,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:09:08,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 11:09:09,033 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:09:10,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1633053.3333333333, ans=0.125 2023-10-04 11:09:10,742 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.99 vs. limit=15.0 2023-10-04 11:09:11,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:09:13,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:09:14,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:09:16,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1633120.0, ans=0.125 2023-10-04 11:09:17,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 11:09:17,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:09:23,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 11:09:27,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:09:27,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:09:27,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:09:33,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:09:33,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:09:34,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:09:40,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:09:45,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:09:45,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:09:45,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:09:51,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1633253.3333333333, ans=0.125 2023-10-04 11:09:54,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 11:09:58,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:09:58,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:10:01,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 11:10:02,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1633320.0, ans=0.1 2023-10-04 11:10:03,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:10:04,311 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.30 vs. limit=22.5 2023-10-04 11:10:05,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 11:10:05,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:10:06,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:10:12,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 11:10:13,953 INFO [train.py:1046] (3/4) Epoch 47, batch 650, loss[loss=0.1567, simple_loss=0.2388, pruned_loss=0.03735, over 23175.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2345, pruned_loss=0.03666, over 4529762.39 frames. ], batch size: 105, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:10:14,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:10:14,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1633386.6666666667, ans=0.125 2023-10-04 11:10:16,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:10:17,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:10:20,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:22,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 11:10:24,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:10:29,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:10:29,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:10:29,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1633453.3333333333, ans=0.0 2023-10-04 11:10:32,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:10:36,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 11:10:38,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:10:38,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:10:43,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:10:43,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 11:10:45,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:10:46,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:46,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:10:48,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:49,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:10:51,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:10:51,698 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 11:10:51,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:10:51,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:10:54,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:10:56,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:10:56,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:10:57,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:10:57,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1633586.6666666667, ans=0.0 2023-10-04 11:10:57,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1633586.6666666667, ans=0.0 2023-10-04 11:10:58,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 11:11:00,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:11:00,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:11:01,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:11:01,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:11:02,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:11:04,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 11:11:04,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1633586.6666666667, ans=0.125 2023-10-04 11:11:05,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 11:11:05,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:05,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:11:05,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:11:07,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:11:09,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:11:16,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:16,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:11:17,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:11:21,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:11:21,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:11:21,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:11:21,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1633653.3333333333, ans=0.125 2023-10-04 11:11:23,380 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.25 vs. limit=22.5 2023-10-04 11:11:24,157 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.080e+02 2.396e+02 2.903e+02 4.504e+02, threshold=4.792e+02, percent-clipped=0.0 2023-10-04 11:11:26,394 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.04 vs. limit=12.0 2023-10-04 11:11:27,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:11:27,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:11:28,586 INFO [train.py:1046] (3/4) Epoch 47, batch 700, loss[loss=0.1544, simple_loss=0.2306, pruned_loss=0.03906, over 24297.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2338, pruned_loss=0.03616, over 4579065.30 frames. ], batch size: 61, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:11:28,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:11:28,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:11:32,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 11:11:34,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 11:11:35,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 11:11:35,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:37,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:11:38,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 11:11:43,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:11:46,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:11:48,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:48,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:11:48,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:11:51,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:11:54,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 11:11:54,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:11:54,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 11:11:57,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 11:11:59,121 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.92 vs. limit=22.5 2023-10-04 11:12:02,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:12:02,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:12:03,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:12:07,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:12:08,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 11:12:12,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:12:13,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:12:13,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 11:12:18,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:12:18,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:12:20,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:12:27,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:12:27,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 11:12:30,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 11:12:30,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 11:12:32,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:12:33,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:12:34,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1633986.6666666667, ans=0.125 2023-10-04 11:12:35,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:12:38,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:12:38,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 11:12:43,406 INFO [train.py:1046] (3/4) Epoch 47, batch 750, loss[loss=0.1681, simple_loss=0.2375, pruned_loss=0.04936, over 23824.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2335, pruned_loss=0.03634, over 4617103.00 frames. ], batch size: 164, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:12:44,538 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.70 vs. limit=15.0 2023-10-04 11:12:44,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 11:12:44,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 11:12:44,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 11:12:46,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 11:12:46,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 11:12:47,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:12:47,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 11:12:49,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:12:49,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:12:51,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:12:52,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:12:52,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:12:52,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:12:56,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:12:56,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:12:59,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:13:03,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:13:03,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:13:03,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 11:13:04,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:13:06,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:13:07,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:13:07,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:13:09,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 11:13:09,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:13:12,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 11:13:12,329 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 11:13:14,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 11:13:14,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:13:14,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:13:15,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:13:21,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1634186.6666666667, ans=0.125 2023-10-04 11:13:24,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:13:24,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:13:24,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:13:25,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:13:27,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:13:27,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1634253.3333333333, ans=0.1 2023-10-04 11:13:28,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 11:13:28,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:13:28,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1634253.3333333333, ans=0.125 2023-10-04 11:13:29,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 11:13:29,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:13:32,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:13:34,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 11:13:34,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:13:39,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:13:41,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:13:42,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:13:44,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:13:47,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 11:13:47,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:13:47,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1634320.0, ans=0.0 2023-10-04 11:13:49,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:13:52,915 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.939e+02 2.140e+02 2.417e+02 3.613e+02, threshold=4.280e+02, percent-clipped=0.0 2023-10-04 11:13:53,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:13:53,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:13:56,953 INFO [train.py:1046] (3/4) Epoch 47, batch 800, loss[loss=0.1683, simple_loss=0.2457, pruned_loss=0.04547, over 23262.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.234, pruned_loss=0.03651, over 4646252.48 frames. ], batch size: 105, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:13:57,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:13:58,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:14:03,066 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=3.90 vs. limit=12.0 2023-10-04 11:14:05,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:14:05,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:06,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:14:06,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:14:08,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:08,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:09,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:13,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:14:14,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:14:15,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 11:14:15,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:17,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:14:18,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:14:19,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:14:20,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 11:14:20,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:14:20,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 11:14:23,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:26,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:14:27,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:14:27,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:14:30,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:31,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:14:35,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:14:35,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:14:35,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 11:14:37,312 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 11:14:38,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 11:14:38,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:14:38,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:14:41,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:14:41,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:14:47,376 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 11:14:47,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 11:14:47,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:14:49,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:14:55,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:14:55,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1634653.3333333333, ans=0.125 2023-10-04 11:14:58,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:15:00,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 11:15:00,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:15:04,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 11:15:07,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1634653.3333333333, ans=0.125 2023-10-04 11:15:08,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:15:09,817 INFO [train.py:1046] (3/4) Epoch 47, batch 850, loss[loss=0.1405, simple_loss=0.2263, pruned_loss=0.0273, over 24255.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2339, pruned_loss=0.03634, over 4671618.59 frames. ], batch size: 61, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:15:11,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:15:12,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 11:15:12,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:15:14,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:15:14,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 11:15:14,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:15:16,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:15:17,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:20,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:15:22,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:15:23,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 11:15:23,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 11:15:24,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 11:15:26,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:15:26,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:15:27,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:27,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:15:27,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:15:29,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1634786.6666666667, ans=0.125 2023-10-04 11:15:33,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:15:34,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:15:35,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 11:15:36,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 11:15:40,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:15:41,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 11:15:44,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 11:15:46,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 11:15:46,931 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 11:15:46,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:15:46,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:15:48,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 11:15:49,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:51,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:15:51,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 11:15:54,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:15:54,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:15:54,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:15:56,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:15:56,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:15:57,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:15:57,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1634920.0, ans=0.0 2023-10-04 11:15:58,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 11:16:04,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:16:04,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:16:04,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:16:06,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:16:07,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:16:10,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:16:11,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:16:11,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:16:13,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:16:13,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:16:19,635 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.035e+02 2.355e+02 2.898e+02 4.168e+02, threshold=4.711e+02, percent-clipped=0.0 2023-10-04 11:16:21,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:16:21,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:16:23,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 11:16:23,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:16:23,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:16:24,533 INFO [train.py:1046] (3/4) Epoch 47, batch 900, loss[loss=0.1637, simple_loss=0.2434, pruned_loss=0.04202, over 23780.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2348, pruned_loss=0.03676, over 4683195.63 frames. ], batch size: 195, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:16:25,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 11:16:27,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=1635053.3333333333, ans=10.0 2023-10-04 11:16:31,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:16:34,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:16:34,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 11:16:35,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.14 vs. limit=15.0 2023-10-04 11:16:37,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:16:37,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 11:16:39,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 11:16:40,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:16:40,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:16:40,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:16:40,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:16:50,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:16:50,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:16:50,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:16:51,550 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.03 vs. limit=15.0 2023-10-04 11:16:53,427 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.64 vs. limit=6.0 2023-10-04 11:16:55,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:16:59,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 11:17:01,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1635186.6666666667, ans=0.125 2023-10-04 11:17:01,350 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:17:02,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:17:04,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1635186.6666666667, ans=0.1 2023-10-04 11:17:07,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:17:07,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:17:07,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1635253.3333333333, ans=0.0 2023-10-04 11:17:08,518 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 11:17:08,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 11:17:14,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:17:15,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:17:15,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:17:20,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1635253.3333333333, ans=0.125 2023-10-04 11:17:21,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:17:21,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:17:22,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1635320.0, ans=0.0 2023-10-04 11:17:23,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 11:17:23,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:17:23,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1635320.0, ans=0.125 2023-10-04 11:17:25,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 11:17:27,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:17:27,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:17:29,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:17:29,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:17:34,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 11:17:34,983 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 11:17:35,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 11:17:35,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 11:17:37,680 INFO [train.py:1046] (3/4) Epoch 47, batch 950, loss[loss=0.1464, simple_loss=0.2309, pruned_loss=0.031, over 23278.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2343, pruned_loss=0.03679, over 4689124.38 frames. ], batch size: 105, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:17:37,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:17:41,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 11:17:46,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:17:49,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:17:49,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:17:51,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 11:17:54,621 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 11:17:55,722 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.30 vs. limit=15.0 2023-10-04 11:17:57,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:17:57,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:17:59,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:17:59,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:17:59,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 11:18:00,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 11:18:03,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:03,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 11:18:03,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1635453.3333333333, ans=0.125 2023-10-04 11:18:04,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:18:08,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:08,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:18:08,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:18:10,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 11:18:11,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 11:18:12,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:18:14,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:18:14,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1635520.0, ans=0.125 2023-10-04 11:18:21,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:18:21,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:18:24,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 11:18:26,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 11:18:26,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:18:26,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:18:26,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:26,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:18:32,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 11:18:33,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:18:35,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:18:35,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:35,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 11:18:35,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:18:35,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:18:36,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 11:18:39,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:18:42,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:18:46,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:18:48,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 11:18:48,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 11:18:50,049 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.047e+02 2.206e+02 2.421e+02 3.668e+02, threshold=4.411e+02, percent-clipped=0.0 2023-10-04 11:18:53,441 INFO [train.py:1046] (3/4) Epoch 47, batch 1000, loss[loss=0.1435, simple_loss=0.2015, pruned_loss=0.04274, over 19383.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2333, pruned_loss=0.03665, over 4697058.78 frames. ], batch size: 388, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:18:53,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:18:57,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 11:18:57,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:03,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:19:04,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 11:19:04,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 11:19:05,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1635720.0, ans=0.1 2023-10-04 11:19:09,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:19:09,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:19:10,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:19:11,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.22 vs. limit=10.0 2023-10-04 11:19:13,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 11:19:16,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 11:19:17,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 11:19:19,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:19:19,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 11:19:22,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 11:19:22,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 11:19:24,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:19:25,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:32,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:19:33,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:19:34,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:34,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:19:34,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 11:19:36,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:19:36,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:19:36,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:19:37,637 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 11:19:40,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 11:19:42,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 11:19:43,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 11:19:45,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:19:50,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:50,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:19:50,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:19:51,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:19:53,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 11:19:53,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1635986.6666666667, ans=0.125 2023-10-04 11:19:54,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:19:56,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 11:19:58,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 11:19:58,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:19:58,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:20:00,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:20:01,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1635986.6666666667, ans=0.1 2023-10-04 11:20:03,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:20:05,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:20:07,907 INFO [train.py:1046] (3/4) Epoch 47, batch 1050, loss[loss=0.1652, simple_loss=0.2504, pruned_loss=0.04003, over 24422.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2316, pruned_loss=0.03633, over 4696944.69 frames. ], batch size: 77, lr: 2.17e-03, grad_scale: 4.0 2023-10-04 11:20:08,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1636053.3333333333, ans=0.0 2023-10-04 11:20:09,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:20:10,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:20:13,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 11:20:13,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:20:15,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:20:18,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:20:19,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:20:22,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:20:23,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:20:23,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:20:24,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:20:25,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 11:20:27,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:20:28,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 11:20:30,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:20:30,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 11:20:30,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:20:30,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff3.min_abs, batch_count=1636120.0, ans=0.2 2023-10-04 11:20:36,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1636186.6666666667, ans=0.05 2023-10-04 11:20:37,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:20:37,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:20:37,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:20:39,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 11:20:39,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 11:20:39,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:20:42,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 11:20:43,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1636186.6666666667, ans=0.1 2023-10-04 11:20:46,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 11:20:47,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:20:49,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 11:20:52,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 11:20:52,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:20:52,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:20:58,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:21:01,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 11:21:01,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 11:21:03,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 11:21:03,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:21:03,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:21:06,032 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 11:21:08,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:21:11,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:21:11,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:21:11,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:21:11,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:21:16,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:21:16,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 11:21:16,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:21:16,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 11:21:17,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 11:21:17,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:21:20,775 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.762e+02 2.061e+02 2.280e+02 2.915e+02 5.230e+02, threshold=4.560e+02, percent-clipped=2.0 2023-10-04 11:21:22,026 INFO [train.py:1046] (3/4) Epoch 47, batch 1100, loss[loss=0.1653, simple_loss=0.2416, pruned_loss=0.04449, over 23653.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2316, pruned_loss=0.03639, over 4686186.29 frames. ], batch size: 232, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:21:22,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:21:26,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:21:29,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:21:30,835 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.36 vs. limit=15.0 2023-10-04 11:21:32,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:21:32,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:21:32,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 11:21:33,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:21:36,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:21:39,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:21:42,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:21:42,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 11:21:44,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 11:21:44,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1636453.3333333333, ans=0.025 2023-10-04 11:21:45,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:21:45,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:21:48,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:21:50,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:21:54,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:21:57,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 11:21:58,550 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 11:21:58,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:00,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:03,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:22:03,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:22:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 11:22:06,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:22:06,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:22:06,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:22:07,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:07,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 11:22:11,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:22:11,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 11:22:14,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:22:14,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1636586.6666666667, ans=0.0 2023-10-04 11:22:14,794 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.70 vs. limit=15.0 2023-10-04 11:22:19,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:22:22,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 11:22:24,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 11:22:24,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:22:27,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:22:27,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:22:27,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 11:22:28,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:22:28,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:22:30,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 11:22:30,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:22:31,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 11:22:32,399 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.37 vs. limit=15.0 2023-10-04 11:22:33,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:22:33,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:22:34,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:22:34,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1636720.0, ans=0.1 2023-10-04 11:22:36,344 INFO [train.py:1046] (3/4) Epoch 47, batch 1150, loss[loss=0.1611, simple_loss=0.2517, pruned_loss=0.03522, over 24568.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2325, pruned_loss=0.03658, over 4697620.19 frames. ], batch size: 71, lr: 2.17e-03, grad_scale: 8.0 2023-10-04 11:22:36,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1636720.0, ans=0.125 2023-10-04 11:22:39,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:22:41,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:22:43,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:22:43,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:22:43,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 11:22:45,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:22:46,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1636720.0, ans=0.0 2023-10-04 11:22:47,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 11:22:49,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:22:49,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:22:55,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 11:22:59,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:23:02,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:23:02,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:23:02,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 11:23:02,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:23:02,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:23:07,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 11:23:07,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:23:10,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:23:15,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1636853.3333333333, ans=0.125 2023-10-04 11:23:19,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:23:27,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:23:27,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 11:23:27,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:23:29,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:23:34,987 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 11:23:36,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:23:44,120 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 11:23:45,676 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:23:47,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:23:47,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:23:48,704 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.016e+02 2.247e+02 2.696e+02 4.163e+02, threshold=4.494e+02, percent-clipped=0.0 2023-10-04 11:23:48,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:23:50,024 INFO [train.py:1046] (3/4) Epoch 47, batch 1200, loss[loss=0.2002, simple_loss=0.2655, pruned_loss=0.06743, over 19548.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2331, pruned_loss=0.03658, over 4713094.23 frames. ], batch size: 389, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:23:52,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:23:54,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1637053.3333333333, ans=0.0 2023-10-04 11:23:57,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:23:57,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:24:00,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:24:00,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:24:00,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:24:01,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1637053.3333333333, ans=0.0 2023-10-04 11:24:03,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:24:04,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:24:05,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:24:05,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:24:10,386 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 11:24:11,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 11:24:13,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1637120.0, ans=0.125 2023-10-04 11:24:14,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:24:16,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:24:19,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:24:21,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:24:21,238 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 11:24:22,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:24:28,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:24:28,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:24:28,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 11:24:31,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:24:31,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1637186.6666666667, ans=0.125 2023-10-04 11:24:34,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 11:24:37,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 11:24:37,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:24:37,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1637253.3333333333, ans=0.125 2023-10-04 11:24:38,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:24:40,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:24:40,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:24:41,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:24:41,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:24:41,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:24:43,000 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 11:24:43,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:24:43,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:24:43,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:24:46,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:24:46,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:24:49,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 11:24:51,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:24:54,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 11:24:58,824 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 11:25:00,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:25:03,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:25:04,354 INFO [train.py:1046] (3/4) Epoch 47, batch 1250, loss[loss=0.1505, simple_loss=0.2282, pruned_loss=0.0364, over 23742.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2336, pruned_loss=0.03668, over 4705178.13 frames. ], batch size: 149, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:25:04,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:25:05,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:25:07,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 11:25:11,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:25:11,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:25:13,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 11:25:16,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:25:16,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:25:16,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1637386.6666666667, ans=0.0 2023-10-04 11:25:19,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:25:19,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:25:21,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:25:21,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:25:23,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:25:25,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1637453.3333333333, ans=0.125 2023-10-04 11:25:29,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 11:25:29,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:25:29,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:25:31,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1637453.3333333333, ans=0.0 2023-10-04 11:25:32,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:25:32,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:25:35,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:25:37,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:25:42,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 11:25:42,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:25:45,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:25:46,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 11:25:46,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:25:47,003 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 11:25:48,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:25:48,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:25:53,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:25:56,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:25:56,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:25:56,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 11:25:56,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 11:25:57,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 11:25:59,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:26:02,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 11:26:02,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:26:03,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 11:26:03,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:26:06,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 11:26:06,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 11:26:06,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:26:06,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 11:26:08,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:26:08,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 11:26:11,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:26:12,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:26:14,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:26:15,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1637653.3333333333, ans=0.0 2023-10-04 11:26:16,845 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.071e+02 2.257e+02 2.568e+02 3.764e+02, threshold=4.514e+02, percent-clipped=0.0 2023-10-04 11:26:16,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:26:18,272 INFO [train.py:1046] (3/4) Epoch 47, batch 1300, loss[loss=0.1516, simple_loss=0.2412, pruned_loss=0.03099, over 24464.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.03705, over 4697647.84 frames. ], batch size: 69, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:26:20,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:26:20,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 11:26:24,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:26:26,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:26:28,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:26:28,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:26:30,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:26:31,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 11:26:34,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:26:35,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:26:37,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 11:26:40,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:26:44,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:26:44,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:26:46,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:26:48,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:26:48,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:26:50,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 11:26:51,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 11:26:55,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:26:55,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:26:58,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 11:27:00,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 11:27:01,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:27:03,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:27:04,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 11:27:04,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:27:04,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 11:27:05,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:27:09,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:27:09,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:27:13,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 11:27:14,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 11:27:14,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 11:27:19,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:27:22,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 11:27:23,603 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:27:23,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1637986.6666666667, ans=0.1 2023-10-04 11:27:29,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 11:27:32,246 INFO [train.py:1046] (3/4) Epoch 47, batch 1350, loss[loss=0.1656, simple_loss=0.2289, pruned_loss=0.05118, over 23643.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2331, pruned_loss=0.03651, over 4711678.44 frames. ], batch size: 256, lr: 2.17e-03, grad_scale: 16.0 2023-10-04 11:27:32,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:27:35,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:27:39,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:27:39,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:27:41,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:27:41,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:27:45,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:27:48,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 11:27:50,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:27:51,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:27:53,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 11:27:54,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:27:56,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:27:56,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 11:27:57,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 11:27:58,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 11:28:00,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:28:00,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 11:28:00,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1638186.6666666667, ans=0.125 2023-10-04 11:28:10,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:28:12,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1638186.6666666667, ans=0.125 2023-10-04 11:28:20,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:28:20,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:28:22,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 11:28:25,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:28:26,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 11:28:26,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:28:26,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:28:29,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:28:32,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 11:28:33,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:28:35,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 11:28:39,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 11:28:44,664 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 2.056e+02 2.395e+02 2.955e+02 4.262e+02, threshold=4.789e+02, percent-clipped=0.0 2023-10-04 11:28:44,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 11:28:44,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:28:45,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1638386.6666666667, ans=0.125 2023-10-04 11:28:46,165 INFO [train.py:1046] (3/4) Epoch 47, batch 1400, loss[loss=0.1406, simple_loss=0.2223, pruned_loss=0.02947, over 24607.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2324, pruned_loss=0.03603, over 4715105.01 frames. ], batch size: 60, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:28:49,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:28:50,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-10-04 11:28:51,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:28:56,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 11:28:58,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 11:28:59,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1638453.3333333333, ans=0.125 2023-10-04 11:29:08,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:29:09,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:29:13,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:29:13,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:29:17,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:29:17,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 11:29:25,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:29:26,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:29:28,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1638520.0, ans=0.0 2023-10-04 11:29:31,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 11:29:31,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:29:32,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:29:33,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:29:35,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:29:35,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:29:36,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:29:36,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:29:37,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 11:29:38,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:29:39,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:29:44,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:29:46,593 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.21 vs. limit=15.0 2023-10-04 11:29:50,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1638653.3333333333, ans=0.125 2023-10-04 11:29:52,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1638653.3333333333, ans=0.2 2023-10-04 11:29:53,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 11:29:53,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 11:29:53,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:29:55,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 11:29:56,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:29:58,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:29:58,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1638653.3333333333, ans=0.1 2023-10-04 11:30:01,009 INFO [train.py:1046] (3/4) Epoch 47, batch 1450, loss[loss=0.1499, simple_loss=0.2363, pruned_loss=0.03179, over 24598.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.232, pruned_loss=0.03544, over 4717921.61 frames. ], batch size: 68, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:30:01,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:30:01,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1638720.0, ans=0.2 2023-10-04 11:30:02,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:30:03,391 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.21 vs. limit=12.0 2023-10-04 11:30:03,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:03,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 11:30:08,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:30:09,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:30:11,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:30:11,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 11:30:12,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:30:14,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 11:30:14,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:15,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:15,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 11:30:17,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:30:18,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:30:18,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 11:30:18,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:19,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:30:21,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:25,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:28,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1638786.6666666667, ans=0.125 2023-10-04 11:30:30,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:30:30,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:30:31,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:30:31,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:34,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:30:34,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:30:35,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:30:35,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:30:38,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 11:30:40,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:30:40,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1638853.3333333333, ans=0.0 2023-10-04 11:30:45,119 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 11:30:45,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1638920.0, ans=0.125 2023-10-04 11:30:46,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:30:47,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:30:49,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:30:50,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 11:30:53,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:30:55,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 11:30:55,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 11:30:58,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:30:59,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:31:01,099 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:31:02,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 11:31:03,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 11:31:05,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 11:31:06,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:31:08,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:31:14,655 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 2.021e+02 2.399e+02 2.893e+02 4.968e+02, threshold=4.797e+02, percent-clipped=1.0 2023-10-04 11:31:14,682 INFO [train.py:1046] (3/4) Epoch 47, batch 1500, loss[loss=0.1576, simple_loss=0.2417, pruned_loss=0.03675, over 23334.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2325, pruned_loss=0.03532, over 4720236.90 frames. ], batch size: 93, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:31:18,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 11:31:18,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:31:18,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:31:20,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:31:20,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1639053.3333333333, ans=0.125 2023-10-04 11:31:21,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:31:22,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:31:24,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 11:31:24,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:31:26,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:31:26,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:31:26,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:31:29,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:31:29,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:31:34,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:31:34,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 11:31:35,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:31:35,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:31:37,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:31:41,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 11:31:46,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 11:31:46,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:31:48,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 11:31:50,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:31:50,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:31:52,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:31:52,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:31:53,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 11:31:53,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:31:53,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:31:53,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 11:31:53,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:32:01,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:32:01,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 11:32:05,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:32:05,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:32:09,650 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 11:32:09,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:09,701 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 11:32:11,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:32:11,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:32:13,682 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 11:32:15,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:32:17,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 11:32:19,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:20,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:32:21,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:21,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:32:23,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:32:23,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:32:24,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 11:32:24,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 11:32:24,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:32:26,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 11:32:28,069 INFO [train.py:1046] (3/4) Epoch 47, batch 1550, loss[loss=0.137, simple_loss=0.2189, pruned_loss=0.02757, over 24457.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2338, pruned_loss=0.03598, over 4723689.80 frames. ], batch size: 58, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:32:28,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 11:32:30,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:32:32,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:32:32,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:32:32,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:32:33,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:32:35,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:32:39,076 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 11:32:39,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:32:39,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:32:40,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:32:42,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:32:42,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 11:32:44,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:32:44,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 11:32:47,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 11:32:47,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 11:32:47,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:32:47,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:32:51,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:32:51,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1639453.3333333333, ans=0.05 2023-10-04 11:32:54,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 11:32:54,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 11:33:01,927 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:33:03,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:33:05,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:33:06,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:33:06,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:33:07,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 11:33:13,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:33:15,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:33:17,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:33:20,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:33:20,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:33:20,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 11:33:20,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:33:22,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1639586.6666666667, ans=0.0 2023-10-04 11:33:23,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:33:23,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:33:23,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 11:33:23,535 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 11:33:26,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:33:32,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 11:33:35,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:33:35,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:33:35,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1639653.3333333333, ans=0.0 2023-10-04 11:33:37,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 11:33:38,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:33:38,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:33:38,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:33:40,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:33:40,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:33:41,232 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.799e+02 2.051e+02 2.234e+02 2.804e+02 4.353e+02, threshold=4.467e+02, percent-clipped=0.0 2023-10-04 11:33:41,259 INFO [train.py:1046] (3/4) Epoch 47, batch 1600, loss[loss=0.1844, simple_loss=0.2604, pruned_loss=0.05416, over 19685.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2342, pruned_loss=0.03638, over 4710738.13 frames. ], batch size: 388, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:33:44,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:33:45,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 11:33:45,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 11:33:46,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1639720.0, ans=0.0 2023-10-04 11:33:47,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 11:33:50,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:33:51,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.71 vs. limit=6.0 2023-10-04 11:33:52,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 11:33:52,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1639720.0, ans=0.125 2023-10-04 11:33:52,913 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.91 vs. limit=15.0 2023-10-04 11:33:53,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:33:56,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:33:59,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:34:04,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 11:34:06,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:34:08,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 11:34:08,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:08,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 11:34:13,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 11:34:21,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:34:22,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 11:34:22,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:34:22,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:34:22,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:34:25,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 11:34:29,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 11:34:29,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:34:31,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:31,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1639920.0, ans=0.0 2023-10-04 11:34:32,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:32,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:34:36,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:34:38,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:34:38,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:34:44,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:34:46,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:34:47,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 11:34:47,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:34:49,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 11:34:54,407 INFO [train.py:1046] (3/4) Epoch 47, batch 1650, loss[loss=0.1533, simple_loss=0.244, pruned_loss=0.03129, over 24573.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2341, pruned_loss=0.03622, over 4712677.23 frames. ], batch size: 71, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:34:54,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:34:56,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:34:57,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:34:57,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 11:34:57,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 11:34:57,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 11:34:57,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 11:35:02,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:35:02,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:35:04,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:35:04,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:35:05,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1640053.3333333333, ans=0.2 2023-10-04 11:35:08,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:35:10,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 11:35:12,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:35:12,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:35:12,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:35:12,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:35:13,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 11:35:13,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 11:35:18,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1640120.0, ans=0.125 2023-10-04 11:35:19,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:35:20,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:35:24,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1640186.6666666667, ans=0.125 2023-10-04 11:35:27,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 11:35:29,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:31,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 11:35:33,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:35:36,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:35:36,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:35:37,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:35:39,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:35:39,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:43,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:35:44,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:44,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:35:44,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:35:44,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:35:46,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:35:49,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:35:50,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 11:35:52,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:35:52,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 11:35:53,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 11:35:55,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 11:35:55,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:35:55,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:35:56,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:35:56,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:35:56,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 11:35:58,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1640320.0, ans=0.1 2023-10-04 11:35:59,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:36:01,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:36:01,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:36:03,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 11:36:04,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.out_whiten.whitening_limit, batch_count=1640320.0, ans=8.0 2023-10-04 11:36:07,887 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 2.048e+02 2.297e+02 2.769e+02 5.278e+02, threshold=4.594e+02, percent-clipped=2.0 2023-10-04 11:36:07,914 INFO [train.py:1046] (3/4) Epoch 47, batch 1700, loss[loss=0.1415, simple_loss=0.2246, pruned_loss=0.02925, over 24502.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2338, pruned_loss=0.0362, over 4703055.46 frames. ], batch size: 66, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:36:08,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:36:08,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:36:09,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 11:36:11,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:36:11,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:36:11,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:36:12,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:36:12,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:36:12,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 11:36:15,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:36:20,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1640386.6666666667, ans=0.0 2023-10-04 11:36:24,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:36:26,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:36:31,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:36:31,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:36:33,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:36:33,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:36:33,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1640453.3333333333, ans=0.125 2023-10-04 11:36:35,786 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.66 vs. limit=15.0 2023-10-04 11:36:36,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 11:36:39,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:36:39,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:36:41,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:36:41,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1640520.0, ans=0.2 2023-10-04 11:36:41,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1640520.0, ans=0.0 2023-10-04 11:36:42,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:36:43,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 11:36:45,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 11:36:46,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:36:46,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 11:36:46,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1640520.0, ans=0.125 2023-10-04 11:36:49,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:36:54,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1640586.6666666667, ans=0.2 2023-10-04 11:36:57,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:36:57,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:36:58,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:37:00,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:37:00,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 11:37:00,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:37:01,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:37:01,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 11:37:03,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:37:03,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:37:03,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:37:03,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:37:04,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:37:04,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:37:06,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:37:07,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:37:07,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:37:11,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:37:12,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 11:37:15,001 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.06 vs. limit=15.0 2023-10-04 11:37:15,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:37:16,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:37:18,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 11:37:22,916 INFO [train.py:1046] (3/4) Epoch 47, batch 1750, loss[loss=0.1563, simple_loss=0.2381, pruned_loss=0.03721, over 23433.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2317, pruned_loss=0.03587, over 4692684.34 frames. ], batch size: 93, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:37:24,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:37:25,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:37:27,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:37:28,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 11:37:28,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:37:30,835 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.71 vs. limit=15.0 2023-10-04 11:37:31,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:37:31,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:37:36,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 11:37:37,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:37:40,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 11:37:40,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:37:42,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:37:44,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 11:37:46,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 11:37:46,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1640786.6666666667, ans=0.125 2023-10-04 11:37:48,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:37:49,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 11:37:57,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:37:59,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:38:00,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:38:03,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:03,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:38:04,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:38:06,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:08,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:38:09,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:38:09,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 11:38:11,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:38:14,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 11:38:15,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:38:17,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:38:17,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:38:21,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:38:22,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 11:38:22,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:23,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:38:26,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:38:29,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:38:31,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:38:31,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 11:38:31,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:38:33,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:38:33,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:38:33,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:38:34,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:38:36,289 INFO [train.py:1046] (3/4) Epoch 47, batch 1800, loss[loss=0.1471, simple_loss=0.2251, pruned_loss=0.03454, over 24326.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2315, pruned_loss=0.03605, over 4682012.82 frames. ], batch size: 61, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:38:36,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:38:37,620 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.081e+02 2.379e+02 2.770e+02 6.213e+02, threshold=4.757e+02, percent-clipped=1.0 2023-10-04 11:38:39,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:38:39,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:38:42,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 11:38:44,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:38:48,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 11:38:50,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:38:53,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:38:54,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:38:56,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:38:57,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:38:58,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:38:58,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 11:39:00,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:01,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:05,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 11:39:07,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 11:39:07,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 11:39:09,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:39:09,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:39:09,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:39:10,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:39:11,100 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.33 vs. limit=22.5 2023-10-04 11:39:16,993 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 11:39:18,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:39:21,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:23,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 11:39:23,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 11:39:24,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:39:24,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:39:26,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:39:30,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 11:39:34,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:39:36,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 11:39:36,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1641320.0, ans=0.0 2023-10-04 11:39:37,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:39:37,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:39:37,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:39:37,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 11:39:40,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:39:40,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:39:43,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 11:39:43,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:39:45,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:39:45,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:39:45,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:48,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:39:48,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:39:49,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:39:49,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:39:51,559 INFO [train.py:1046] (3/4) Epoch 47, batch 1850, loss[loss=0.1534, simple_loss=0.2308, pruned_loss=0.03798, over 23752.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2331, pruned_loss=0.03658, over 4680744.93 frames. ], batch size: 232, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:39:53,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:39:53,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:39:58,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:39:58,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 11:40:03,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 11:40:06,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 11:40:08,200 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.43 vs. limit=22.5 2023-10-04 11:40:10,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:40:10,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 11:40:10,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 11:40:12,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.20 vs. limit=22.5 2023-10-04 11:40:21,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:40:21,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1641520.0, ans=0.0 2023-10-04 11:40:22,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 11:40:25,299 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.46 vs. limit=10.0 2023-10-04 11:40:27,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:40:27,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:40:30,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 11:40:31,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:40:31,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:40:33,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:40:33,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1641520.0, ans=0.125 2023-10-04 11:40:35,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:40:39,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:40:40,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:40:40,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:40:40,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 11:40:40,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:40:43,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:40:44,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:40:46,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 11:40:47,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:40:52,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:40:52,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:40:52,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 11:40:52,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 11:40:54,421 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 11:40:55,925 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 11:40:59,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:40:59,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:40:59,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:40:59,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:01,279 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 11:41:01,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:41:01,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:01,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1641653.3333333333, ans=0.0 2023-10-04 11:41:04,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:41:04,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:41:05,475 INFO [train.py:1046] (3/4) Epoch 47, batch 1900, loss[loss=0.1625, simple_loss=0.235, pruned_loss=0.04494, over 23681.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2339, pruned_loss=0.03652, over 4697741.02 frames. ], batch size: 232, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:41:05,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:41:06,789 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.127e+02 2.369e+02 2.756e+02 3.360e+02, threshold=4.738e+02, percent-clipped=0.0 2023-10-04 11:41:06,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 11:41:08,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:10,238 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 11:41:10,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:41:11,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:41:15,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:41:17,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:41:18,595 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 11:41:18,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 11:41:21,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:41:21,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:41:21,423 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 11:41:22,662 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 11:41:27,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 11:41:29,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:41:31,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 11:41:33,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 11:41:36,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1641853.3333333333, ans=0.0 2023-10-04 11:41:43,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 11:41:46,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 11:41:46,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:41:47,533 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 11:41:47,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 11:41:47,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 11:41:47,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 11:41:47,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:41:51,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 11:41:54,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:41:57,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:41:57,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 11:41:58,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1641920.0, ans=0.125 2023-10-04 11:41:59,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:42:03,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 11:42:03,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:42:06,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1641986.6666666667, ans=0.0 2023-10-04 11:42:09,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:42:09,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:42:09,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:42:10,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:42:12,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 11:42:13,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:42:13,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:42:15,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:42:15,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:42:18,184 INFO [train.py:1046] (3/4) Epoch 47, batch 1950, loss[loss=0.1626, simple_loss=0.2512, pruned_loss=0.03705, over 24455.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2344, pruned_loss=0.03654, over 4710147.08 frames. ], batch size: 69, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:42:18,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:42:18,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:42:18,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 11:42:19,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:42:22,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:42:24,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1642053.3333333333, ans=0.125 2023-10-04 11:42:25,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:42:25,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:25,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:42:27,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 11:42:28,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 11:42:28,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:30,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:34,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:42:34,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:42:34,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:37,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:42:39,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:42:39,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1642120.0, ans=0.2 2023-10-04 11:42:40,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:42:40,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 11:42:40,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:43,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:46,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:42:46,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:42:46,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:42:46,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 11:42:47,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:42:47,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:42:47,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:42:52,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:42:54,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:42:58,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:43:01,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:43:01,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:43:01,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 11:43:03,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:43:06,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:43:07,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:43:07,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:43:16,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.72 vs. limit=15.0 2023-10-04 11:43:16,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:18,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:21,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:23,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:43:24,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:43:24,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:43:26,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 11:43:26,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:43:26,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:43:28,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 11:43:29,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:43:33,282 INFO [train.py:1046] (3/4) Epoch 47, batch 2000, loss[loss=0.1818, simple_loss=0.2464, pruned_loss=0.05863, over 23408.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.235, pruned_loss=0.03694, over 4713034.85 frames. ], batch size: 285, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:43:34,756 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.650e+02 2.107e+02 2.252e+02 2.646e+02 4.173e+02, threshold=4.505e+02, percent-clipped=0.0 2023-10-04 11:43:36,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:43:37,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:43:37,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:43:38,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:43:39,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1642386.6666666667, ans=0.125 2023-10-04 11:43:40,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:43:41,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 11:43:41,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 11:43:46,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:43:47,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 11:43:49,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 11:43:49,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:43:53,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:43:53,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 11:43:54,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:43:56,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:43:56,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:43:58,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 11:43:58,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:43:59,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 11:43:59,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:44:04,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:44:04,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 11:44:04,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:04,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1642520.0, ans=0.125 2023-10-04 11:44:05,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:44:07,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:44:07,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 11:44:11,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 11:44:11,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:44:11,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:17,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:44:17,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:44:17,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:44:18,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:44:19,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1642586.6666666667, ans=0.125 2023-10-04 11:44:20,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:44:20,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:44:20,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1642586.6666666667, ans=0.0 2023-10-04 11:44:21,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:44:21,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:44:24,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:25,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:44:25,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 11:44:26,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1642586.6666666667, ans=0.125 2023-10-04 11:44:32,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:44:33,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1642653.3333333333, ans=0.125 2023-10-04 11:44:34,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:37,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:37,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:44:40,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:43,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:44:43,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:44,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:44:44,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:44:47,637 INFO [train.py:1046] (3/4) Epoch 47, batch 2050, loss[loss=0.1369, simple_loss=0.2214, pruned_loss=0.02623, over 24320.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2352, pruned_loss=0.03664, over 4722583.68 frames. ], batch size: 61, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:44:47,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:44:49,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:50,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:44:51,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:44:56,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:44:57,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1642720.0, ans=0.125 2023-10-04 11:44:59,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:45:00,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:45:02,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:45:02,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 11:45:02,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:45:04,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:45:04,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:45:10,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1642786.6666666667, ans=0.125 2023-10-04 11:45:13,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:45:13,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:45:14,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 11:45:16,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:45:19,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 11:45:19,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:45:23,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:45:24,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:45:25,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:45:26,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:45:26,508 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.81 vs. limit=12.0 2023-10-04 11:45:27,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:45:28,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:45:28,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:45:33,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:45:35,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:45:36,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1642920.0, ans=0.0 2023-10-04 11:45:38,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:45:39,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:45:44,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:45:47,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:45:48,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 11:45:53,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:45:53,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:45:54,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1642986.6666666667, ans=0.125 2023-10-04 11:45:56,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:45:57,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 11:46:00,938 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 11:46:00,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:02,623 INFO [train.py:1046] (3/4) Epoch 47, batch 2100, loss[loss=0.1471, simple_loss=0.2155, pruned_loss=0.0393, over 23585.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2326, pruned_loss=0.03648, over 4699047.85 frames. ], batch size: 256, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:46:02,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:46:02,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:46:05,227 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.063e+02 2.330e+02 2.672e+02 3.956e+02, threshold=4.660e+02, percent-clipped=0.0 2023-10-04 11:46:05,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:46:05,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 11:46:05,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 11:46:06,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:46:10,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:46:11,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:46:14,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:15,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:46:15,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 11:46:17,444 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.14 vs. limit=15.0 2023-10-04 11:46:18,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:46:18,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 11:46:18,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 11:46:19,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:46:21,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:46:21,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 11:46:21,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 11:46:24,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 11:46:24,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:46:27,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:46:27,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:46:29,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:46:31,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 11:46:31,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:46:31,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 11:46:34,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 11:46:34,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:34,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 11:46:35,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 11:46:35,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 11:46:37,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:46:39,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:46:41,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:46:43,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 11:46:44,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1643253.3333333333, ans=0.125 2023-10-04 11:46:45,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:46:46,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:46:46,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 11:46:47,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:46:47,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:46:47,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:46:48,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 11:46:50,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 11:46:52,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 11:46:54,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:46:58,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:46:59,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 11:47:03,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:47:06,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:47:06,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:47:06,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:47:08,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 11:47:08,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:47:10,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:47:10,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:47:11,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:47:12,896 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:14,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 11:47:15,698 INFO [train.py:1046] (3/4) Epoch 47, batch 2150, loss[loss=0.1575, simple_loss=0.2463, pruned_loss=0.03431, over 24648.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2316, pruned_loss=0.03629, over 4688584.40 frames. ], batch size: 68, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:47:15,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 11:47:15,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:47:18,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:47:18,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:47:18,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:47:18,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:47:23,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 11:47:24,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:47:26,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:28,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:47:28,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:28,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:47:30,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:32,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:47:32,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:47:35,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:36,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 11:47:39,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:47:41,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:47:41,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1643453.3333333333, ans=0.1 2023-10-04 11:47:43,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:43,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:47:43,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:47:43,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:47:44,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:47:44,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:47:44,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:47:46,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 11:47:48,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:47:50,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:50,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:47:52,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:47:53,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:47:56,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:47:56,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:47:58,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:47:58,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 11:47:58,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 11:48:02,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:48:03,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:04,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:48:04,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:48:06,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:08,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:08,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 11:48:09,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 11:48:09,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:48:09,551 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 11:48:10,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:10,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:48:11,627 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.05 vs. limit=15.0 2023-10-04 11:48:12,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 11:48:12,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:48:12,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 11:48:12,884 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 11:48:12,884 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 11:48:12,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 11:48:14,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:15,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:48:17,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:48:17,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:17,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 11:48:18,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:18,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:27,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:48:28,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 11:48:30,053 INFO [train.py:1046] (3/4) Epoch 47, batch 2200, loss[loss=0.145, simple_loss=0.221, pruned_loss=0.03447, over 23444.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2322, pruned_loss=0.03645, over 4697021.67 frames. ], batch size: 134, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:48:31,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:48:34,223 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.776e+02 2.021e+02 2.246e+02 2.617e+02 4.385e+02, threshold=4.493e+02, percent-clipped=0.0 2023-10-04 11:48:36,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:36,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:48:36,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:48:37,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 11:48:40,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:48:40,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:48:40,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 11:48:42,976 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.08 vs. limit=22.5 2023-10-04 11:48:46,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 11:48:47,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 11:48:54,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 11:48:55,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:48:55,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:48:56,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 11:49:00,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:49:02,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 11:49:05,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:49:05,639 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:49:07,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 11:49:10,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:49:13,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:49:14,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:49:16,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:49:20,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 11:49:20,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:49:21,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 11:49:24,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:49:24,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 11:49:24,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:49:26,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:49:26,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:49:26,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:49:26,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:49:27,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:49:27,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:49:29,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 11:49:33,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 11:49:33,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:49:34,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:49:36,097 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 11:49:38,228 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:49:39,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:49:39,493 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 11:49:40,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 11:49:42,103 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 11:49:43,436 INFO [train.py:1046] (3/4) Epoch 47, batch 2250, loss[loss=0.1525, simple_loss=0.2405, pruned_loss=0.03222, over 24661.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.233, pruned_loss=0.03624, over 4712143.49 frames. ], batch size: 73, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:49:43,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:49:43,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:49:45,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:49:45,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=1644053.3333333333, ans=0.025 2023-10-04 11:49:47,029 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 11:49:48,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:49:49,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:49:50,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1644053.3333333333, ans=0.125 2023-10-04 11:49:55,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:49:57,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:49:59,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:50:00,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:50:01,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 11:50:05,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 11:50:05,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:50:05,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:50:06,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 11:50:06,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:50:06,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:50:08,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 11:50:14,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:50:16,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 11:50:17,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 11:50:18,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 11:50:20,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:50:21,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1644186.6666666667, ans=0.125 2023-10-04 11:50:22,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:50:22,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1644186.6666666667, ans=0.0 2023-10-04 11:50:27,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:50:30,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:50:30,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1644253.3333333333, ans=0.0 2023-10-04 11:50:31,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:50:31,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:50:34,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:50:35,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:50:36,609 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.69 vs. limit=6.0 2023-10-04 11:50:38,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:50:41,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 11:50:45,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:50:45,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 11:50:46,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:50:47,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1644320.0, ans=0.125 2023-10-04 11:50:52,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 11:50:54,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:50:54,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 11:50:54,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:50:54,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:50:57,184 INFO [train.py:1046] (3/4) Epoch 47, batch 2300, loss[loss=0.156, simple_loss=0.2445, pruned_loss=0.0338, over 24452.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2336, pruned_loss=0.0363, over 4726788.28 frames. ], batch size: 69, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:50:57,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 11:50:57,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1644386.6666666667, ans=0.125 2023-10-04 11:51:00,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:51:00,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:51:01,759 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 2.249e+02 2.496e+02 2.917e+02 4.902e+02, threshold=4.992e+02, percent-clipped=2.0 2023-10-04 11:51:02,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1644386.6666666667, ans=0.1 2023-10-04 11:51:07,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:51:07,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:51:08,860 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 11:51:10,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:51:15,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1644453.3333333333, ans=0.0 2023-10-04 11:51:16,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:51:16,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:51:17,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:51:18,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:51:18,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 11:51:19,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:51:21,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:51:21,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:51:25,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 11:51:28,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:51:30,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:51:35,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:51:35,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:51:37,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1644520.0, ans=0.05 2023-10-04 11:51:39,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:51:42,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:51:44,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:51:46,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:51:46,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:51:47,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 11:51:49,986 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.69 vs. limit=6.0 2023-10-04 11:51:50,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 11:51:50,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:51:50,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:51:50,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:51:50,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:51:52,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 11:51:52,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 11:51:52,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 11:51:52,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:51:52,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:51:53,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 11:51:59,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:52:05,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:52:05,773 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.71 vs. limit=15.0 2023-10-04 11:52:07,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:52:09,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:52:09,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 11:52:10,661 INFO [train.py:1046] (3/4) Epoch 47, batch 2350, loss[loss=0.1446, simple_loss=0.2315, pruned_loss=0.02888, over 24663.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2347, pruned_loss=0.03662, over 4723668.80 frames. ], batch size: 65, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:52:10,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:52:10,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:52:12,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:52:12,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 11:52:17,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:52:18,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 11:52:22,493 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.26 vs. limit=12.0 2023-10-04 11:52:23,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 11:52:24,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:52:28,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:52:28,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:52:28,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:52:28,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:52:29,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1644786.6666666667, ans=0.125 2023-10-04 11:52:30,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 11:52:33,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:52:33,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1644786.6666666667, ans=0.07 2023-10-04 11:52:38,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 11:52:40,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:52:42,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:52:42,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:52:45,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 11:52:45,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 11:52:47,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 11:52:50,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:52:50,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:52:50,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:52:51,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 11:52:53,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 11:52:54,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:52:56,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:52:56,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:52:57,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 11:52:59,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:53:01,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 11:53:01,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 11:53:06,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 11:53:06,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1644920.0, ans=0.125 2023-10-04 11:53:11,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 11:53:12,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:53:12,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 11:53:12,559 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 11:53:12,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 11:53:15,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 11:53:17,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1644986.6666666667, ans=0.125 2023-10-04 11:53:18,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:53:19,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1644986.6666666667, ans=0.125 2023-10-04 11:53:22,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:53:22,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1644986.6666666667, ans=0.125 2023-10-04 11:53:23,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:53:25,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:53:25,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 11:53:25,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 11:53:26,471 INFO [train.py:1046] (3/4) Epoch 47, batch 2400, loss[loss=0.1573, simple_loss=0.248, pruned_loss=0.0333, over 24471.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2342, pruned_loss=0.03675, over 4697800.15 frames. ], batch size: 69, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 11:53:30,967 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.104e+02 2.348e+02 2.668e+02 4.204e+02, threshold=4.695e+02, percent-clipped=0.0 2023-10-04 11:53:32,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 11:53:32,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:53:35,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 11:53:35,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:53:37,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:53:38,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 11:53:43,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:53:45,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 11:53:50,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:53:50,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1645120.0, ans=0.0 2023-10-04 11:53:52,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1645120.0, ans=0.0 2023-10-04 11:53:55,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 11:53:56,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:53:59,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:54:05,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:54:05,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 11:54:06,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 11:54:15,644 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:54:17,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:54:20,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:54:21,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 11:54:21,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 11:54:21,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:54:21,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:54:22,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:54:22,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 11:54:27,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:54:27,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 11:54:27,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 11:54:29,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 11:54:30,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:54:30,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:54:31,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 11:54:33,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 11:54:33,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 11:54:33,217 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 11:54:35,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 11:54:36,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:54:37,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:54:37,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:54:39,217 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 11:54:39,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:54:40,509 INFO [train.py:1046] (3/4) Epoch 47, batch 2450, loss[loss=0.152, simple_loss=0.2412, pruned_loss=0.03144, over 24537.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.233, pruned_loss=0.03632, over 4694061.27 frames. ], batch size: 71, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:54:40,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 11:54:43,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 11:54:43,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:54:46,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:54:46,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:54:48,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 11:54:53,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:54:53,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:54:58,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:54:58,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:54:58,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:54:58,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 11:55:01,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:55:04,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 11:55:04,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:55:10,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 11:55:10,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:55:11,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:55:11,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:55:13,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 11:55:13,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1645520.0, ans=0.125 2023-10-04 11:55:14,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 11:55:22,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:55:23,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:55:23,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:55:24,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1645586.6666666667, ans=0.125 2023-10-04 11:55:25,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 11:55:25,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:55:26,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:55:26,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 11:55:29,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 11:55:31,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:55:34,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:55:34,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:55:39,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 11:55:39,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 11:55:40,477 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:55:41,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:55:41,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 11:55:43,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:55:43,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 11:55:47,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 11:55:51,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:55:52,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 11:55:52,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1645653.3333333333, ans=0.05 2023-10-04 11:55:53,611 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.69 vs. limit=22.5 2023-10-04 11:55:54,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 11:55:55,343 INFO [train.py:1046] (3/4) Epoch 47, batch 2500, loss[loss=0.1338, simple_loss=0.2111, pruned_loss=0.0282, over 24371.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2318, pruned_loss=0.03576, over 4703484.14 frames. ], batch size: 56, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:55:55,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 11:55:59,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:56:01,445 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.014e+02 2.322e+02 2.787e+02 4.599e+02, threshold=4.643e+02, percent-clipped=0.0 2023-10-04 11:56:06,492 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.89 vs. limit=15.0 2023-10-04 11:56:09,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 11:56:09,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:56:10,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:56:10,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 11:56:16,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 11:56:17,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:56:19,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 11:56:19,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 11:56:19,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 11:56:21,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:56:21,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:56:22,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1645786.6666666667, ans=0.0 2023-10-04 11:56:23,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 11:56:23,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:56:23,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 11:56:23,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:56:27,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:56:27,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:56:30,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.73 vs. limit=5.0 2023-10-04 11:56:32,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 11:56:32,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 11:56:32,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:56:33,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:56:37,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:56:41,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:56:43,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:56:50,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 11:56:54,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 11:56:54,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:56:54,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:56:56,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 11:56:56,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 11:56:57,325 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 11:56:57,325 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 11:56:57,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 11:56:58,125 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.00 vs. limit=15.0 2023-10-04 11:57:01,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:57:01,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 11:57:01,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 11:57:02,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 11:57:02,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 11:57:07,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 11:57:07,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1646053.3333333333, ans=0.5 2023-10-04 11:57:08,884 INFO [train.py:1046] (3/4) Epoch 47, batch 2550, loss[loss=0.1589, simple_loss=0.2362, pruned_loss=0.04082, over 23388.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2327, pruned_loss=0.03605, over 4698537.46 frames. ], batch size: 285, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:57:10,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:57:12,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:57:13,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:57:16,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:57:16,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 11:57:17,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:57:19,256 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=12.0 2023-10-04 11:57:22,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 11:57:23,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 11:57:24,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1646120.0, ans=0.07 2023-10-04 11:57:25,272 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:57:28,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:57:28,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 11:57:28,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:57:28,945 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.91 vs. limit=15.0 2023-10-04 11:57:29,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:57:29,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:57:30,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 11:57:32,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 11:57:32,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 11:57:32,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:57:32,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 11:57:33,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1646120.0, ans=0.1 2023-10-04 11:57:42,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 11:57:44,978 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.89 vs. limit=22.5 2023-10-04 11:57:46,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:57:46,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:57:46,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:57:48,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 11:57:51,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.03 vs. limit=12.0 2023-10-04 11:57:53,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 11:57:56,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 11:57:56,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 11:57:56,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 11:57:57,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 11:57:57,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 11:58:00,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:58:00,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:58:00,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1646253.3333333333, ans=0.125 2023-10-04 11:58:05,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:58:05,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 11:58:05,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 11:58:06,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:58:06,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 11:58:08,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 11:58:11,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:58:17,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:58:19,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1646320.0, ans=0.0 2023-10-04 11:58:20,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:58:20,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1646320.0, ans=0.2 2023-10-04 11:58:23,342 INFO [train.py:1046] (3/4) Epoch 47, batch 2600, loss[loss=0.1593, simple_loss=0.2496, pruned_loss=0.03447, over 24342.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2333, pruned_loss=0.03621, over 4705154.86 frames. ], batch size: 77, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:58:23,423 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 11:58:25,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1646386.6666666667, ans=15.0 2023-10-04 11:58:26,143 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 11:58:26,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 11:58:26,210 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 11:58:27,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 11:58:27,539 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 11:58:28,869 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.029e+02 2.282e+02 2.767e+02 4.631e+02, threshold=4.564e+02, percent-clipped=0.0 2023-10-04 11:58:29,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1646386.6666666667, ans=0.125 2023-10-04 11:58:30,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:58:30,336 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 11:58:31,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 11:58:33,132 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 11:58:34,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 11:58:35,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 11:58:37,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 11:58:39,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 11:58:39,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 11:58:42,434 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 11:58:42,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 11:58:49,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:58:49,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:58:49,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:58:49,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 11:58:52,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 11:58:52,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1646520.0, ans=0.125 2023-10-04 11:58:56,269 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.37 vs. limit=15.0 2023-10-04 11:58:56,877 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 11:59:02,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:59:03,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:59:04,496 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.13 vs. limit=15.0 2023-10-04 11:59:05,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 11:59:05,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:59:05,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 11:59:05,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 11:59:08,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 11:59:08,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 11:59:10,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:59:15,435 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 11:59:15,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:59:15,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 11:59:21,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 11:59:22,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 11:59:22,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 11:59:22,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 11:59:25,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:59:25,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:59:31,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 11:59:32,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:59:35,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 11:59:36,499 INFO [train.py:1046] (3/4) Epoch 47, batch 2650, loss[loss=0.1666, simple_loss=0.2453, pruned_loss=0.04388, over 23903.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2341, pruned_loss=0.03672, over 4707176.67 frames. ], batch size: 195, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 11:59:39,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 11:59:39,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:59:40,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 11:59:42,071 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 11:59:42,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 11:59:44,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 11:59:45,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 11:59:48,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 11:59:51,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 11:59:52,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 11:59:52,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 11:59:52,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 11:59:54,119 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 11:59:55,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 11:59:56,567 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 11:59:59,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 11:59:59,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1646786.6666666667, ans=0.125 2023-10-04 12:00:00,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 12:00:00,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:02,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 12:00:06,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:06,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:00:06,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:06,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:07,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.49 vs. limit=22.5 2023-10-04 12:00:09,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 12:00:10,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 12:00:13,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:00:17,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 12:00:17,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:19,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:20,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:00:21,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:00:21,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:00:23,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:00:25,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:00:26,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:00:26,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:00:26,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:00:28,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:29,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:00:29,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:30,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:00:31,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:00:32,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1646920.0, ans=0.1 2023-10-04 12:00:33,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:35,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:00:35,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:35,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 12:00:39,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:00:41,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:41,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:00:44,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:45,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:00:45,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:49,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:00:49,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 12:00:50,827 INFO [train.py:1046] (3/4) Epoch 47, batch 2700, loss[loss=0.1397, simple_loss=0.2266, pruned_loss=0.02642, over 24468.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2359, pruned_loss=0.03731, over 4704065.44 frames. ], batch size: 63, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:00:52,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:00:52,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 12:00:55,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:00:56,839 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 2.063e+02 2.218e+02 2.649e+02 4.383e+02, threshold=4.436e+02, percent-clipped=0.0 2023-10-04 12:00:56,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:56,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:00:58,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:00:58,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:00:58,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:00:58,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:00:58,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 12:00:58,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1647053.3333333333, ans=0.125 2023-10-04 12:00:59,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:01:01,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:01:01,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:01:02,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:01:05,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1647120.0, ans=0.125 2023-10-04 12:01:06,296 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.02 vs. limit=10.0 2023-10-04 12:01:06,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:01:08,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 12:01:08,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:01:12,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:01:12,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:01:13,078 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.46 vs. limit=15.0 2023-10-04 12:01:17,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:01:17,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:01:18,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:01:18,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:01:23,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:01:24,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:01:25,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1647186.6666666667, ans=0.125 2023-10-04 12:01:26,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:01:26,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:01:27,301 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.40 vs. limit=15.0 2023-10-04 12:01:30,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:01:30,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:01:39,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:01:40,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:01:42,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:01:42,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:01:45,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:01:47,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:01:47,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:01:50,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:01:52,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:01:52,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:01:54,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:01:56,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1647320.0, ans=0.04949747468305833 2023-10-04 12:01:57,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:01:57,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:01:58,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 12:02:00,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:02:02,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:02:02,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 12:02:04,335 INFO [train.py:1046] (3/4) Epoch 47, batch 2750, loss[loss=0.1488, simple_loss=0.2145, pruned_loss=0.04157, over 19502.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.2359, pruned_loss=0.03734, over 4703981.63 frames. ], batch size: 388, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:02:04,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 12:02:04,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:02:05,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:05,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:02:07,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:07,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:02:07,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1647386.6666666667, ans=0.025 2023-10-04 12:02:08,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:10,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:02:10,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1647386.6666666667, ans=0.125 2023-10-04 12:02:11,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 12:02:11,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:02:11,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:11,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 12:02:12,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:02:12,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:02:16,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1647386.6666666667, ans=0.1 2023-10-04 12:02:19,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 12:02:19,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.08 vs. limit=12.0 2023-10-04 12:02:20,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:02:22,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:22,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:02:24,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:02:25,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:02:25,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:02:27,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:27,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:30,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:02:31,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:02:32,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:02:34,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:36,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:02:42,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:02:42,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:02:44,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:02:48,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:02:48,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:02:48,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:02:55,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1647586.6666666667, ans=0.1 2023-10-04 12:02:55,685 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.80 vs. limit=15.0 2023-10-04 12:02:56,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:02:56,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:02:56,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 12:03:01,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:02,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 12:03:07,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 12:03:09,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:03:11,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 12:03:12,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:03:13,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:03:13,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 12:03:13,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:03:17,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 12:03:17,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:03:18,398 INFO [train.py:1046] (3/4) Epoch 47, batch 2800, loss[loss=0.1418, simple_loss=0.2185, pruned_loss=0.03252, over 23249.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2336, pruned_loss=0.03708, over 4685914.09 frames. ], batch size: 119, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:03:18,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:03:19,326 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.04 vs. limit=6.0 2023-10-04 12:03:19,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 12:03:19,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:03:19,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:22,400 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.78 vs. limit=10.0 2023-10-04 12:03:23,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:03:23,616 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 12:03:23,616 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 12:03:24,723 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.018e+02 2.201e+02 2.488e+02 4.073e+02, threshold=4.402e+02, percent-clipped=0.0 2023-10-04 12:03:26,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:27,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:03:27,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:03:28,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1647720.0, ans=0.125 2023-10-04 12:03:31,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:03:32,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 12:03:35,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 12:03:36,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 12:03:37,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.38 vs. limit=15.0 2023-10-04 12:03:38,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:03:38,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:03:38,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:03:43,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:03:43,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:03:43,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 12:03:44,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:03:47,606 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.85 vs. limit=15.0 2023-10-04 12:03:51,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:03:53,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:03:56,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:03:57,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:03:57,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:03:57,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1647853.3333333333, ans=0.2 2023-10-04 12:03:59,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1647853.3333333333, ans=0.125 2023-10-04 12:04:00,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1647853.3333333333, ans=0.1 2023-10-04 12:04:02,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:04:02,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 12:04:04,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:04:05,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:04:05,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:04:08,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:04:09,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:04:12,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:04:12,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1647920.0, ans=0.125 2023-10-04 12:04:15,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:04:15,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:04:15,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:04:16,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:04:16,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:04:17,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:04:17,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 12:04:17,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:19,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:04:19,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:21,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 12:04:23,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:04:23,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:04:25,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:04:26,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 12:04:32,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:04:33,913 INFO [train.py:1046] (3/4) Epoch 47, batch 2850, loss[loss=0.1558, simple_loss=0.2292, pruned_loss=0.04123, over 23833.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2333, pruned_loss=0.03687, over 4693005.60 frames. ], batch size: 179, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:04:33,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:04:34,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:04:35,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:04:38,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:04:38,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:04:38,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:04:41,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:04:41,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:04:43,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:04:43,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 12:04:50,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 12:04:50,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:04:52,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 12:04:52,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:56,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 12:04:56,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 12:04:57,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:04:58,736 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.58 vs. limit=10.0 2023-10-04 12:05:02,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1648186.6666666667, ans=0.125 2023-10-04 12:05:02,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1648186.6666666667, ans=0.2 2023-10-04 12:05:10,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:05:12,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:05:12,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:05:12,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:05:12,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:05:12,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:05:12,754 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:05:13,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:05:15,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 12:05:16,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:05:16,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:05:18,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:05:19,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:22,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:05:23,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:05:23,973 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:05:25,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:05:25,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1648253.3333333333, ans=0.0 2023-10-04 12:05:27,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:05:30,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:05:30,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:31,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:05:33,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:05:37,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:05:38,428 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.86 vs. limit=6.0 2023-10-04 12:05:39,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 12:05:39,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 12:05:40,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:05:40,978 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:05:42,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:05:42,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 12:05:43,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:05:44,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:05:44,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:05:44,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:05:44,773 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 12:05:44,806 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 12:05:44,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:05:46,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:05:47,457 INFO [train.py:1046] (3/4) Epoch 47, batch 2900, loss[loss=0.1487, simple_loss=0.2397, pruned_loss=0.02886, over 24444.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2332, pruned_loss=0.03654, over 4709473.80 frames. ], batch size: 69, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:05:50,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 12:05:50,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:05:50,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:05:51,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 12:05:53,561 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.030e+02 2.253e+02 2.601e+02 4.096e+02, threshold=4.506e+02, percent-clipped=0.0 2023-10-04 12:05:53,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1648386.6666666667, ans=0.1 2023-10-04 12:05:56,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:05:56,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 12:05:58,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 12:05:59,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:05:59,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:06:02,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:06:04,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:06:05,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:06:07,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:06:10,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:06:10,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 12:06:10,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:06:12,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:06:14,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 12:06:15,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 12:06:18,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:06:18,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 12:06:18,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:06:19,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:06:19,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 12:06:22,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:06:24,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:06:24,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1648520.0, ans=0.125 2023-10-04 12:06:28,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:06:31,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:06:32,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 12:06:32,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 12:06:32,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:06:33,380 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.76 vs. limit=22.5 2023-10-04 12:06:39,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:06:40,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 12:06:40,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:06:46,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:06:54,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:06:54,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:06:54,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1648653.3333333333, ans=0.125 2023-10-04 12:06:55,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 12:06:55,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1648653.3333333333, ans=0.125 2023-10-04 12:06:55,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1648653.3333333333, ans=0.125 2023-10-04 12:06:58,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:06:58,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 12:06:59,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:07:00,886 INFO [train.py:1046] (3/4) Epoch 47, batch 2950, loss[loss=0.148, simple_loss=0.2349, pruned_loss=0.03058, over 24496.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2344, pruned_loss=0.03687, over 4706242.19 frames. ], batch size: 63, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:07:00,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:07:06,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:07:07,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 12:07:09,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:07:09,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:07:10,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:07:12,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:07:13,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 12:07:14,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 12:07:16,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:07:16,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:07:16,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1648786.6666666667, ans=0.2 2023-10-04 12:07:20,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:07:21,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:07:24,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:07:25,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:07:27,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:07:27,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:07:29,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:07:29,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:07:29,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:07:32,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 12:07:39,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 12:07:39,328 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 12:07:39,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1648853.3333333333, ans=0.0 2023-10-04 12:07:39,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.86 vs. limit=6.0 2023-10-04 12:07:40,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:07:42,046 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 12:07:42,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 12:07:43,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:07:43,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:07:43,480 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 12:07:43,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:07:46,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 12:07:47,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:07:47,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:07:50,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:07:51,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:07:51,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:07:51,663 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 12:07:51,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:07:51,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 12:07:59,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:08:00,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:08:01,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 12:08:01,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:08:05,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 12:08:06,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:08:07,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:08:08,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:08:11,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:08:11,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 12:08:13,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:08:13,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=1648986.6666666667, ans=0.125 2023-10-04 12:08:14,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:08:14,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:08:14,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:08:14,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:08:15,879 INFO [train.py:1046] (3/4) Epoch 47, batch 3000, loss[loss=0.1519, simple_loss=0.2431, pruned_loss=0.03031, over 24464.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2346, pruned_loss=0.03676, over 4704043.52 frames. ], batch size: 69, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:08:15,880 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 12:08:28,123 INFO [train.py:1078] (3/4) Epoch 47, validation: loss=0.3516, simple_loss=0.269, pruned_loss=0.2171, over 1125622.00 frames. 2023-10-04 12:08:28,123 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 12:08:28,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:08:29,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:08:29,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 12:08:30,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:08:33,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:08:34,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:08:35,324 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.022e+02 2.270e+02 2.675e+02 4.950e+02, threshold=4.541e+02, percent-clipped=1.0 2023-10-04 12:08:36,880 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 12:08:36,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 12:08:40,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:08:40,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:08:41,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 12:08:41,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:08:49,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:08:56,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:09:02,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1649186.6666666667, ans=0.0 2023-10-04 12:09:03,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 12:09:04,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:09:07,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:09:07,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:09:09,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:09:10,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:09:10,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 12:09:14,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 12:09:14,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:09:15,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:09:17,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:09:17,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:09:17,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:17,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:09:21,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:09:22,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:09:22,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:09:25,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:09:26,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 12:09:28,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:09:28,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:09:30,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:09:33,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:33,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:36,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 12:09:36,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 12:09:36,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:09:36,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 12:09:37,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:09:39,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 12:09:40,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:09:42,643 INFO [train.py:1046] (3/4) Epoch 47, batch 3050, loss[loss=0.1614, simple_loss=0.2444, pruned_loss=0.0392, over 24676.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2356, pruned_loss=0.03699, over 4701952.82 frames. ], batch size: 73, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:09:42,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:09:42,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 12:09:44,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 12:09:44,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 12:09:46,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:09:46,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:09:46,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:09:46,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:09:47,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:09:48,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 12:09:51,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:09:53,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:09:54,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:09:57,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:00,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 12:10:06,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 12:10:06,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 12:10:08,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:09,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:10:14,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:14,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:10:16,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:10:17,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:10:18,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:10:18,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:10:18,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:10:18,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:10:20,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:20,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:24,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:10:24,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 12:10:25,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:10:25,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:10:27,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:10:27,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1649586.6666666667, ans=0.1 2023-10-04 12:10:28,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:10:28,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:10:30,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:35,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:10:35,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1649586.6666666667, ans=0.125 2023-10-04 12:10:36,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:40,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:42,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:10:42,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:10:44,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:10:44,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:10:44,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:10:47,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 12:10:48,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:10:48,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:10:50,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 12:10:51,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:55,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:10:57,086 INFO [train.py:1046] (3/4) Epoch 47, batch 3100, loss[loss=0.1338, simple_loss=0.217, pruned_loss=0.02525, over 24348.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2357, pruned_loss=0.0369, over 4705580.76 frames. ], batch size: 56, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:10:58,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:10:59,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:11:01,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 12:11:03,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 12:11:04,587 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.016e+02 2.223e+02 2.559e+02 3.925e+02, threshold=4.446e+02, percent-clipped=0.0 2023-10-04 12:11:04,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 12:11:05,323 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=12.0 2023-10-04 12:11:06,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:11:08,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1649720.0, ans=0.125 2023-10-04 12:11:09,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:11:09,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:13,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 12:11:17,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:19,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=1649786.6666666667, ans=0.1 2023-10-04 12:11:21,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 12:11:25,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:11:25,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:27,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:11:27,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:11:27,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 12:11:28,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:11:28,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 12:11:28,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:11:30,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:31,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 12:11:33,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:11:38,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:11:38,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 12:11:39,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 12:11:41,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1649920.0, ans=0.2 2023-10-04 12:11:42,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:42,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:11:44,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:11:44,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:44,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:11:45,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:11:45,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:11:48,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:11:50,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:11:50,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:50,111 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:11:54,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:11:55,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 12:11:58,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:11:58,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 12:11:59,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:11:59,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:11:59,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 12:12:11,730 INFO [train.py:1046] (3/4) Epoch 47, batch 3150, loss[loss=0.1347, simple_loss=0.2214, pruned_loss=0.02403, over 24667.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.234, pruned_loss=0.03636, over 4709439.58 frames. ], batch size: 65, lr: 2.16e-03, grad_scale: 4.0 2023-10-04 12:12:11,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 12:12:13,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:15,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:12:17,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:12:17,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:12:18,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 12:12:19,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:19,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 12:12:19,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 12:12:22,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:12:22,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1650053.3333333333, ans=0.0 2023-10-04 12:12:24,010 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 12:12:26,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 12:12:26,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:12:28,234 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 12:12:28,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 12:12:29,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 12:12:30,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 12:12:30,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 12:12:30,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:12:32,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:12:32,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:12:33,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 12:12:35,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:35,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:12:37,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:12:38,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 12:12:41,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 12:12:43,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:12:44,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:12:46,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:12:47,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 12:12:48,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1650186.6666666667, ans=0.125 2023-10-04 12:12:50,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 12:12:51,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:12:52,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:12:52,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:12:53,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:12:53,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:12:54,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:12:54,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:12:56,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 12:12:56,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:12:56,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:12:57,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:12:59,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:12:59,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 12:12:59,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:13:01,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 12:13:01,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:03,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 12:13:03,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 12:13:04,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:13:04,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:13:06,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 12:13:07,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 12:13:08,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:13:12,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:13:13,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:14,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:13:15,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1650320.0, ans=0.2 2023-10-04 12:13:18,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:13:18,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:21,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 12:13:24,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1650386.6666666667, ans=0.125 2023-10-04 12:13:25,680 INFO [train.py:1046] (3/4) Epoch 47, batch 3200, loss[loss=0.164, simple_loss=0.2436, pruned_loss=0.04214, over 23491.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2336, pruned_loss=0.03622, over 4715026.92 frames. ], batch size: 93, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:13:25,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:13:25,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 12:13:28,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:31,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:13:31,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 12:13:31,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_na.min_abs, batch_count=1650386.6666666667, ans=0.02 2023-10-04 12:13:32,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:13:34,114 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.055e+02 2.237e+02 2.558e+02 4.162e+02, threshold=4.474e+02, percent-clipped=0.0 2023-10-04 12:13:38,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:13:41,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:13:49,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:13:58,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 12:13:58,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:14:01,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 12:14:03,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:14:06,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:14:06,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:14:07,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:14:12,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 12:14:13,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 12:14:15,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 12:14:15,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1650586.6666666667, ans=0.125 2023-10-04 12:14:17,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 12:14:20,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:14:26,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:14:26,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:14:28,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:14:28,497 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 12:14:28,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:14:28,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1650653.3333333333, ans=0.2 2023-10-04 12:14:31,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:14:32,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 12:14:34,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 12:14:35,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 12:14:37,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 12:14:39,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:14:40,331 INFO [train.py:1046] (3/4) Epoch 47, batch 3250, loss[loss=0.1638, simple_loss=0.2513, pruned_loss=0.03812, over 24045.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2338, pruned_loss=0.03648, over 4712226.36 frames. ], batch size: 80, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:14:40,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:14:40,487 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 12:14:41,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:14:41,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:14:43,671 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 12:14:46,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:14:47,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:14:52,857 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.08 vs. limit=6.0 2023-10-04 12:14:55,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:14:55,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 12:14:56,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:14:56,851 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:14:56,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:14:58,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:14:58,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:14:58,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1650786.6666666667, ans=0.0 2023-10-04 12:15:01,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:02,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:15:02,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:15:02,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:02,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:04,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:15:05,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:15:06,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:15:09,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1650853.3333333333, ans=0.0 2023-10-04 12:15:10,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:15:10,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:15:12,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:15:12,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:15:12,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:15:17,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 12:15:17,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:15:17,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:15:19,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:15:20,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:15:28,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:15:34,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:15:34,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:15:34,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 12:15:34,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:15:34,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 12:15:36,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:15:38,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 12:15:38,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 12:15:39,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:15:40,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:15:41,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:15:42,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 12:15:43,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:15:44,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:15:46,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:15:46,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1650986.6666666667, ans=0.0 2023-10-04 12:15:47,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 12:15:47,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:15:49,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:15:49,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1650986.6666666667, ans=0.2 2023-10-04 12:15:50,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 12:15:54,468 INFO [train.py:1046] (3/4) Epoch 47, batch 3300, loss[loss=0.1808, simple_loss=0.2553, pruned_loss=0.05313, over 19533.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2348, pruned_loss=0.03643, over 4712319.09 frames. ], batch size: 388, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:15:54,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:15:54,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 12:15:55,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 12:15:57,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 12:15:57,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:16:01,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:16:02,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:16:02,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:03,623 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.067e+02 2.305e+02 2.787e+02 4.644e+02, threshold=4.609e+02, percent-clipped=2.0 2023-10-04 12:16:05,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 12:16:05,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:16:07,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:08,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:16:12,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 12:16:12,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1651120.0, ans=0.125 2023-10-04 12:16:14,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:16:14,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:16,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:16,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1651120.0, ans=0.2 2023-10-04 12:16:17,552 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 12:16:17,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:16:19,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:16:19,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:16:19,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:16:20,475 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 12:16:24,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:16:24,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:16:27,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:27,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 12:16:28,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 12:16:28,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:29,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:16:31,473 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 12:16:33,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 12:16:33,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:16:36,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 12:16:36,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1651186.6666666667, ans=0.1 2023-10-04 12:16:38,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:16:40,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:16:42,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:16:43,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1651253.3333333333, ans=0.2 2023-10-04 12:16:44,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:16:44,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:44,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:16:44,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:16:46,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:16:46,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:48,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:16:49,613 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 12:16:49,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 12:16:52,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1651320.0, ans=0.0 2023-10-04 12:16:53,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:16:55,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:16:55,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:16:56,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:16:56,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:16:57,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:16:57,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:16:58,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:16:58,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:16:59,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:17:01,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 12:17:01,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:03,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:06,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:17:06,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:17:07,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:17:08,861 INFO [train.py:1046] (3/4) Epoch 47, batch 3350, loss[loss=0.1712, simple_loss=0.2587, pruned_loss=0.04181, over 24451.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2345, pruned_loss=0.03652, over 4708931.15 frames. ], batch size: 69, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:17:10,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:17:10,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:13,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:17:15,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:15,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:17:19,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:20,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:17:23,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:17:23,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:17:24,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 12:17:26,028 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 12:17:26,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:17:28,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 12:17:28,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 12:17:30,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:17:30,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:17:31,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:17:31,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 12:17:31,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:33,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:17:34,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:34,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:34,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:36,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:17:36,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1651520.0, ans=0.125 2023-10-04 12:17:40,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:17:42,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:42,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:17:42,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1651520.0, ans=0.0 2023-10-04 12:17:45,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1651520.0, ans=0.1 2023-10-04 12:17:45,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1651520.0, ans=0.2 2023-10-04 12:17:46,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:17:48,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:17:50,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:17:50,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:17:51,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1651586.6666666667, ans=0.0 2023-10-04 12:17:53,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:17:54,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 12:17:54,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:17:54,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 12:17:54,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:17:57,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 12:17:57,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:17:58,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:18:07,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:18:07,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 12:18:08,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:18:10,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:18:12,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:18:16,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:18:18,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 12:18:18,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:18:18,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:18:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:18:20,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 12:18:20,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1651653.3333333333, ans=0.125 2023-10-04 12:18:21,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:18:22,925 INFO [train.py:1046] (3/4) Epoch 47, batch 3400, loss[loss=0.1573, simple_loss=0.2411, pruned_loss=0.03675, over 23389.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2351, pruned_loss=0.03675, over 4702005.08 frames. ], batch size: 105, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:18:22,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 12:18:23,795 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.47 vs. limit=10.0 2023-10-04 12:18:24,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:18:24,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:18:25,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:18:26,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:18:26,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 12:18:31,049 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.016e+02 2.273e+02 2.707e+02 4.180e+02, threshold=4.545e+02, percent-clipped=0.0 2023-10-04 12:18:31,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 12:18:31,181 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 12:18:31,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:18:37,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:18:37,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:18:37,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:18:38,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:18:44,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:18:44,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 12:18:50,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:18:51,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:18:51,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:18:53,113 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=22.5 2023-10-04 12:18:53,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 12:18:56,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:18:59,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.36 vs. limit=22.5 2023-10-04 12:19:02,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 12:19:02,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1651853.3333333333, ans=0.1 2023-10-04 12:19:08,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:19:08,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:19:09,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 12:19:09,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:19:10,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:19:11,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:19:11,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:19:14,861 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.35 vs. limit=12.0 2023-10-04 12:19:15,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:19:18,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:19:18,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:19:20,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1651986.6666666667, ans=0.0 2023-10-04 12:19:23,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:19:25,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 12:19:29,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.19 vs. limit=15.0 2023-10-04 12:19:29,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:19:33,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 12:19:36,420 INFO [train.py:1046] (3/4) Epoch 47, batch 3450, loss[loss=0.1573, simple_loss=0.2166, pruned_loss=0.04898, over 19350.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2355, pruned_loss=0.03708, over 4691337.54 frames. ], batch size: 388, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:19:36,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 12:19:36,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1652053.3333333333, ans=0.0 2023-10-04 12:19:38,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:19:40,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:19:40,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 12:19:41,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:19:44,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:19:49,502 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:19:50,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:19:50,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:19:52,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:19:52,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:19:54,721 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.87 vs. limit=15.0 2023-10-04 12:19:55,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:19:58,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 12:20:03,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 12:20:03,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:20:05,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:20:07,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:20:12,106 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.72 vs. limit=5.0 2023-10-04 12:20:13,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 12:20:13,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:20:20,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:20:20,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:20:21,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:20:21,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:20:23,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 12:20:23,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:20:25,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:20:29,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:20:31,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 12:20:34,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:20:37,960 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.64 vs. limit=22.5 2023-10-04 12:20:40,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:20:41,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:20:43,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:20:46,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:20:46,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:20:48,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:20:49,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:20:51,011 INFO [train.py:1046] (3/4) Epoch 47, batch 3500, loss[loss=0.1575, simple_loss=0.2431, pruned_loss=0.03601, over 24640.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2348, pruned_loss=0.03688, over 4706449.80 frames. ], batch size: 68, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:20:52,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:20:56,147 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:20:57,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:20:57,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 12:20:59,814 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.045e+02 2.266e+02 2.721e+02 4.311e+02, threshold=4.532e+02, percent-clipped=0.0 2023-10-04 12:20:59,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:21:02,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 12:21:02,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1652386.6666666667, ans=0.0 2023-10-04 12:21:04,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:21:05,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 12:21:07,468 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.53 vs. limit=15.0 2023-10-04 12:21:09,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:21:09,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:21:11,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:21:11,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:21:11,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:21:13,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:13,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:21:13,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 12:21:16,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:16,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 12:21:16,687 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:21:19,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:21:22,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:23,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 12:21:23,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:21:26,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:21:27,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:21:28,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:30,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:21:30,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:21:31,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 12:21:32,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 12:21:34,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 12:21:34,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:21:35,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:37,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:21:37,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:21:38,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1652586.6666666667, ans=0.125 2023-10-04 12:21:41,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 12:21:42,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:21:50,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:21:50,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 12:21:50,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 12:21:50,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:21:51,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:21:53,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:21:54,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:21:57,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 12:21:57,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:21:58,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1652653.3333333333, ans=0.125 2023-10-04 12:22:00,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:22:00,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 12:22:01,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 12:22:03,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:22:03,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:22:03,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1652720.0, ans=0.0 2023-10-04 12:22:04,908 INFO [train.py:1046] (3/4) Epoch 47, batch 3550, loss[loss=0.1586, simple_loss=0.2483, pruned_loss=0.03452, over 24362.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.234, pruned_loss=0.03619, over 4715973.73 frames. ], batch size: 77, lr: 2.16e-03, grad_scale: 8.0 2023-10-04 12:22:04,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:04,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:07,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:22:13,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1652720.0, ans=0.0 2023-10-04 12:22:15,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:17,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 12:22:19,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:22:21,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:22:21,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:22,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:22:22,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:22:25,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=22.5 2023-10-04 12:22:26,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:22:26,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:22:28,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:28,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 12:22:28,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:22:34,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:22:35,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:22:38,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:22:38,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:22:38,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:22:39,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 12:22:39,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:40,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:22:42,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 12:22:47,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:22:48,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:22:48,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:22:50,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 12:22:51,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:22:53,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 12:22:53,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:22:57,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:22:57,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:23:00,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 12:23:01,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:23:06,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:23:06,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1652986.6666666667, ans=0.1 2023-10-04 12:23:07,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 12:23:08,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:23:09,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1652986.6666666667, ans=10.0 2023-10-04 12:23:11,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:23:13,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 12:23:16,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1653053.3333333333, ans=0.0 2023-10-04 12:23:18,586 INFO [train.py:1046] (3/4) Epoch 47, batch 3600, loss[loss=0.1548, simple_loss=0.2319, pruned_loss=0.03887, over 23825.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2331, pruned_loss=0.03575, over 4702018.37 frames. ], batch size: 212, lr: 2.16e-03, grad_scale: 16.0 2023-10-04 12:23:20,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 12:23:20,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:23:20,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:23:23,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:23:23,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:23:24,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:23:27,220 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.182e+02 2.453e+02 2.820e+02 4.666e+02, threshold=4.905e+02, percent-clipped=3.0 2023-10-04 12:23:27,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:23:27,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:30,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:23:31,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:23:31,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:31,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 12:23:36,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:23:36,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.76 vs. limit=12.0 2023-10-04 12:23:37,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:40,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:23:41,745 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:23:43,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:23:43,126 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:23:43,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 12:23:44,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:23:46,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:23:46,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:23:48,923 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.26 vs. limit=6.0 2023-10-04 12:23:50,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:23:53,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:23:54,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:23:55,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 12:24:00,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:24:01,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:24:01,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 12:24:06,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:24:11,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:24:14,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:24:23,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:24:23,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:24:23,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 12:24:24,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 12:24:26,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 12:24:29,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:24:29,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:24:31,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 12:24:31,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:24:31,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:24:31,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:24:34,583 INFO [train.py:1046] (3/4) Epoch 47, batch 3650, loss[loss=0.1681, simple_loss=0.2432, pruned_loss=0.04649, over 23798.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2334, pruned_loss=0.03559, over 4704333.19 frames. ], batch size: 179, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:24:34,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 12:24:35,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 12:24:38,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:24:38,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 12:24:42,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 12:24:43,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:24:45,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 12:24:46,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 12:24:51,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:24:51,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:24:51,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:24:52,038 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.15 vs. limit=15.0 2023-10-04 12:24:55,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 12:24:55,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:24:57,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 12:24:58,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:24:58,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:24:58,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 12:25:00,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:25:00,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:25:00,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:25:04,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:25:05,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 12:25:07,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 12:25:07,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:25:10,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 12:25:11,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:25:11,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:25:17,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:25:17,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:25:17,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:25:17,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1653586.6666666667, ans=0.125 2023-10-04 12:25:19,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:25:20,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:25:25,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:25:28,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:25:28,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:25:29,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:25:31,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:25:32,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:25:32,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:25:38,404 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 12:25:38,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1653653.3333333333, ans=0.0 2023-10-04 12:25:41,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:25:41,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:25:43,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:25:43,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:25:44,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:25:44,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:25:47,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 12:25:47,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:25:47,993 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.77 vs. limit=22.5 2023-10-04 12:25:48,578 INFO [train.py:1046] (3/4) Epoch 47, batch 3700, loss[loss=0.1832, simple_loss=0.2563, pruned_loss=0.05507, over 22780.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2342, pruned_loss=0.03595, over 4711259.80 frames. ], batch size: 322, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:25:48,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:25:50,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:25:52,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:25:54,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1653720.0, ans=0.05 2023-10-04 12:25:56,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:25:56,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 12:25:56,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:25:56,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:25:58,223 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 1.988e+02 2.215e+02 2.636e+02 3.694e+02, threshold=4.430e+02, percent-clipped=0.0 2023-10-04 12:25:58,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:26:00,534 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=15.0 2023-10-04 12:26:01,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:26:02,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:26:02,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:26:03,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:26:05,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:26:05,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:26:08,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:26:09,474 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 12:26:14,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1653786.6666666667, ans=0.0 2023-10-04 12:26:16,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:26:16,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:26:19,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:26:19,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 12:26:19,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:26:24,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:26:24,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 12:26:25,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:26:27,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:26:27,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:26:29,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:26:31,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 12:26:31,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.58 vs. limit=15.0 2023-10-04 12:26:36,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:26:36,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1653920.0, ans=0.0 2023-10-04 12:26:37,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 12:26:37,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:26:37,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 12:26:42,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:26:42,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:26:45,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:26:45,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 12:26:48,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:26:48,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:26:48,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:26:48,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:26:52,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:26:52,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 12:26:54,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 12:26:55,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:26:55,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:26:55,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:26:57,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:26:59,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:27:01,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:27:01,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1653986.6666666667, ans=0.125 2023-10-04 12:27:02,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:27:03,834 INFO [train.py:1046] (3/4) Epoch 47, batch 3750, loss[loss=0.2063, simple_loss=0.2749, pruned_loss=0.06885, over 19861.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2351, pruned_loss=0.03666, over 4702970.13 frames. ], batch size: 389, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:27:05,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 12:27:05,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 12:27:08,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 12:27:08,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 12:27:09,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:27:10,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:27:12,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:27:13,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1654053.3333333333, ans=0.1 2023-10-04 12:27:14,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:27:18,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:27:21,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:27:22,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:27:24,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:27:24,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1654120.0, ans=0.1 2023-10-04 12:27:25,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1654120.0, ans=0.125 2023-10-04 12:27:25,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1654120.0, ans=0.125 2023-10-04 12:27:27,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:27:30,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 12:27:30,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:27:31,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1654120.0, ans=0.125 2023-10-04 12:27:34,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:27:34,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:27:35,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 12:27:39,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 12:27:39,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:27:41,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:27:42,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:27:42,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1654186.6666666667, ans=0.5 2023-10-04 12:27:47,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:27:48,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 12:27:51,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 12:27:54,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:27:57,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:27:58,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:28:00,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:28:05,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 12:28:06,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:28:08,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:28:09,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:28:11,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:28:16,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1654320.0, ans=0.05 2023-10-04 12:28:18,583 INFO [train.py:1046] (3/4) Epoch 47, batch 3800, loss[loss=0.1347, simple_loss=0.2134, pruned_loss=0.02801, over 21971.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2341, pruned_loss=0.03644, over 4685899.37 frames. ], batch size: 48, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:28:18,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:28:21,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1654386.6666666667, ans=0.125 2023-10-04 12:28:22,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:28:24,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 12:28:24,388 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 12:28:24,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1654386.6666666667, ans=0.125 2023-10-04 12:28:27,393 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.785e+02 2.014e+02 2.169e+02 2.643e+02 4.021e+02, threshold=4.338e+02, percent-clipped=0.0 2023-10-04 12:28:27,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:28:28,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:28:30,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 12:28:32,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 12:28:32,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:28:32,530 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:28:33,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:28:33,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:28:35,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:28:35,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 12:28:38,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 12:28:38,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:28:41,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:28:41,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1654453.3333333333, ans=0.125 2023-10-04 12:28:44,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:28:44,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:28:45,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:28:45,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:28:49,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:28:50,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:28:55,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:28:55,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 12:28:55,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1654520.0, ans=0.0 2023-10-04 12:28:57,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:29:03,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:29:07,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:29:07,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1654586.6666666667, ans=0.125 2023-10-04 12:29:10,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 12:29:11,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 12:29:12,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:29:14,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:29:14,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:14,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.14 vs. limit=15.0 2023-10-04 12:29:15,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 12:29:16,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1654653.3333333333, ans=0.125 2023-10-04 12:29:17,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1654653.3333333333, ans=0.2 2023-10-04 12:29:21,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 12:29:21,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 12:29:21,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:22,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:29:26,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:29:26,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:29:33,661 INFO [train.py:1046] (3/4) Epoch 47, batch 3850, loss[loss=0.1568, simple_loss=0.2364, pruned_loss=0.03855, over 23280.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2332, pruned_loss=0.03658, over 4685828.15 frames. ], batch size: 105, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:29:33,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:29:33,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 12:29:36,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:29:36,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:40,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:29:43,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:29:46,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:29:46,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 12:29:52,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:29:54,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:29:55,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:29:56,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:30:00,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:01,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:30:01,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:01,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:30:01,659 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.94 vs. limit=15.0 2023-10-04 12:30:03,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:04,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:04,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:04,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:30:04,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 12:30:04,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1654853.3333333333, ans=0.0 2023-10-04 12:30:06,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 12:30:08,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:30:08,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:11,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:11,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:12,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 12:30:13,486 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.57 vs. limit=6.0 2023-10-04 12:30:14,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 12:30:16,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:18,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 12:30:20,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 12:30:25,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1654920.0, ans=0.0 2023-10-04 12:30:26,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:28,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:30:31,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:31,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 12:30:32,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 12:30:32,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1654986.6666666667, ans=0.125 2023-10-04 12:30:32,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1654986.6666666667, ans=0.1 2023-10-04 12:30:36,131 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.53 vs. limit=15.0 2023-10-04 12:30:36,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:37,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:39,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1654986.6666666667, ans=0.1 2023-10-04 12:30:40,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:30:40,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:30:40,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:41,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:41,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:30:41,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 12:30:43,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:30:44,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 12:30:44,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:44,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:46,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1655053.3333333333, ans=0.125 2023-10-04 12:30:47,536 INFO [train.py:1046] (3/4) Epoch 47, batch 3900, loss[loss=0.1419, simple_loss=0.2308, pruned_loss=0.0265, over 24668.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2322, pruned_loss=0.03613, over 4686706.31 frames. ], batch size: 65, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:30:47,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:30:47,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:50,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:30:50,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:30:50,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:30:50,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:30:50,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 12:30:50,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:30:50,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1655053.3333333333, ans=0.125 2023-10-04 12:30:53,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1655053.3333333333, ans=0.09899494936611666 2023-10-04 12:30:54,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:30:56,161 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.990e+02 2.251e+02 2.622e+02 3.660e+02, threshold=4.502e+02, percent-clipped=0.0 2023-10-04 12:30:56,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:30:56,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:30:58,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:31:01,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:31:01,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:31:02,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:31:02,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 12:31:02,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:31:03,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 12:31:05,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:31:05,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 12:31:08,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 12:31:11,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:31:13,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:31:13,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:31:14,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:31:19,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:31:21,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:31:23,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:31:23,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:31:24,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:31:30,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:31:30,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:31:37,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:31:39,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:31:46,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1655320.0, ans=0.95 2023-10-04 12:31:47,666 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:31:51,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:31:53,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 12:31:53,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 12:31:53,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:31:54,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 12:31:55,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:31:55,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 12:32:01,862 INFO [train.py:1046] (3/4) Epoch 47, batch 3950, loss[loss=0.1792, simple_loss=0.243, pruned_loss=0.05767, over 19165.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2321, pruned_loss=0.03585, over 4699227.43 frames. ], batch size: 388, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:32:03,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:32:04,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 12:32:04,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:32:08,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:32:10,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:32:15,637 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 12:32:15,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:32:16,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 12:32:17,665 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 12:32:17,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:32:20,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:32:20,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:32:20,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:32:20,825 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.40 vs. limit=15.0 2023-10-04 12:32:24,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 12:32:25,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:32:25,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:32:26,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:32:27,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:32:27,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:32:38,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=1655520.0, ans=15.0 2023-10-04 12:32:39,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:32:39,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:32:42,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1655520.0, ans=0.0 2023-10-04 12:32:44,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 12:32:49,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 12:32:49,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 12:32:50,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:32:51,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:32:56,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:32:58,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:32:58,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:32:58,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:32:58,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 12:33:02,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:33:04,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:33:06,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1655653.3333333333, ans=0.04949747468305833 2023-10-04 12:33:07,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 12:33:13,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1655653.3333333333, ans=0.125 2023-10-04 12:33:16,701 INFO [train.py:1046] (3/4) Epoch 47, batch 4000, loss[loss=0.1604, simple_loss=0.2354, pruned_loss=0.04275, over 23817.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2329, pruned_loss=0.03602, over 4710067.77 frames. ], batch size: 179, lr: 2.15e-03, grad_scale: 32.0 2023-10-04 12:33:18,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:33:24,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:33:25,573 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 2.039e+02 2.177e+02 2.497e+02 4.705e+02, threshold=4.355e+02, percent-clipped=2.0 2023-10-04 12:33:28,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:33:29,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:33:29,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:33:31,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 12:33:31,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:33:31,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1655786.6666666667, ans=0.125 2023-10-04 12:33:31,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1655786.6666666667, ans=0.04949747468305833 2023-10-04 12:33:31,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1655786.6666666667, ans=0.125 2023-10-04 12:33:32,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 12:33:32,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:33:32,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 12:33:34,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:33:37,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:33:37,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:33:37,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:33:37,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:33:37,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 12:33:40,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:33:42,206 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 12:33:42,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:33:43,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:33:46,985 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 12:33:47,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:33:48,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:33:54,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 12:33:54,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1655853.3333333333, ans=0.125 2023-10-04 12:33:55,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:33:56,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:33:58,315 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 12:33:59,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:34:01,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 12:34:01,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:34:02,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:34:03,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:34:06,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:34:06,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:34:07,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:34:10,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 12:34:10,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:34:10,860 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 12:34:15,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:34:18,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 12:34:19,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:34:21,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:34:21,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:34:23,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:34:24,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1655986.6666666667, ans=0.0 2023-10-04 12:34:27,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:34:30,573 INFO [train.py:1046] (3/4) Epoch 47, batch 4050, loss[loss=0.1576, simple_loss=0.2489, pruned_loss=0.03314, over 24333.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2342, pruned_loss=0.03633, over 4708063.52 frames. ], batch size: 74, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:34:30,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 12:34:32,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 12:34:33,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:34:34,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:34:36,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:34:36,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:34:37,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:34:40,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:34:43,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:34:44,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 12:34:46,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:34:46,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:34:50,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:34:50,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:34:54,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 12:34:56,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 12:34:56,184 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 12:34:58,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:35:05,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 12:35:06,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:35:10,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:35:14,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:35:14,866 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.30 vs. limit=15.0 2023-10-04 12:35:15,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:35:15,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:35:19,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:35:21,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 12:35:21,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:35:25,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:35:26,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 12:35:30,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:35:35,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1656320.0, ans=0.0 2023-10-04 12:35:36,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 12:35:38,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:35:38,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:35:40,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 12:35:40,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 12:35:40,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:35:41,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:35:43,412 INFO [train.py:1046] (3/4) Epoch 47, batch 4100, loss[loss=0.149, simple_loss=0.229, pruned_loss=0.03452, over 24623.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2346, pruned_loss=0.03636, over 4716240.44 frames. ], batch size: 60, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:35:43,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:35:43,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:35:47,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 12:35:49,658 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 12:35:50,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 12:35:53,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 12:35:53,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:35:53,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:35:54,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:35:54,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:35:55,643 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 12:35:56,923 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.115e+02 2.367e+02 2.901e+02 4.348e+02, threshold=4.733e+02, percent-clipped=0.0 2023-10-04 12:35:57,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:35:58,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:35:58,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:35:58,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:36:03,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:36:05,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:36:05,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1656453.3333333333, ans=0.05 2023-10-04 12:36:06,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:36:06,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 12:36:07,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:36:07,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:36:07,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:36:07,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:36:07,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 12:36:10,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:36:14,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 12:36:14,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:36:16,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:36:16,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 12:36:18,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:36:18,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1656520.0, ans=0.1 2023-10-04 12:36:19,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:36:19,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:36:21,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 12:36:22,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:36:23,048 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:36:24,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:36:26,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 12:36:26,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:36:27,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:36:29,193 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:36:29,659 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=12.0 2023-10-04 12:36:30,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:36:35,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:36:39,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:36:39,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:36:47,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:36:47,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:36:47,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1656653.3333333333, ans=0.0 2023-10-04 12:36:51,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:36:54,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:36:56,207 INFO [train.py:1046] (3/4) Epoch 47, batch 4150, loss[loss=0.147, simple_loss=0.2201, pruned_loss=0.037, over 23501.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2347, pruned_loss=0.0368, over 4701906.66 frames. ], batch size: 285, lr: 2.15e-03, grad_scale: 4.0 2023-10-04 12:36:58,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:36:58,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1656720.0, ans=0.1 2023-10-04 12:36:59,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:37:01,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:37:01,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:37:05,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 12:37:05,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:37:05,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 12:37:06,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 12:37:06,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 12:37:06,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:37:11,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:37:11,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:37:11,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1656786.6666666667, ans=0.125 2023-10-04 12:37:15,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:37:15,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:37:16,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:37:18,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 12:37:18,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:37:20,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 12:37:24,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:37:29,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:37:29,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 12:37:30,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 12:37:31,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:37:31,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 12:37:31,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:37:32,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:37:36,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:37:36,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1656853.3333333333, ans=0.125 2023-10-04 12:37:37,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:37:40,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 12:37:42,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:37:44,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:37:46,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 12:37:46,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:37:47,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 12:37:49,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:37:51,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1656920.0, ans=0.125 2023-10-04 12:37:52,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:37:53,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:37:54,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 12:37:54,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:37:54,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 12:37:56,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 12:37:59,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 12:37:59,877 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:37:59,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:37:59,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:38:01,243 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 12:38:01,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:38:02,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 12:38:02,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:38:04,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:38:04,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 12:38:04,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 12:38:08,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:38:08,953 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.28 vs. limit=15.0 2023-10-04 12:38:09,624 INFO [train.py:1046] (3/4) Epoch 47, batch 4200, loss[loss=0.1322, simple_loss=0.1908, pruned_loss=0.03684, over 22807.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2327, pruned_loss=0.03624, over 4697759.85 frames. ], batch size: 322, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:38:09,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 12:38:11,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:38:12,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:38:13,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:38:14,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:38:14,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:38:19,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 12:38:20,000 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.65 vs. limit=15.0 2023-10-04 12:38:20,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 12:38:20,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:23,200 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.120e+02 2.356e+02 2.706e+02 4.391e+02, threshold=4.711e+02, percent-clipped=0.0 2023-10-04 12:38:23,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:38:23,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1657120.0, ans=0.2 2023-10-04 12:38:23,906 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.75 vs. limit=15.0 2023-10-04 12:38:27,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:38:28,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 12:38:29,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:38:31,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:31,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 12:38:31,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:38:34,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:34,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:38:34,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:38:36,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:38:38,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 12:38:38,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:38:42,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 12:38:43,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:38:46,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:38:46,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:38:47,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1657186.6666666667, ans=0.125 2023-10-04 12:38:50,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:38:50,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 12:38:50,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:38:51,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:38:55,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:38:56,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:39:02,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:39:04,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.92 vs. limit=22.5 2023-10-04 12:39:05,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 12:39:06,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:39:10,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:39:10,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:12,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 12:39:18,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 12:39:23,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:39:24,715 INFO [train.py:1046] (3/4) Epoch 47, batch 4250, loss[loss=0.1581, simple_loss=0.2449, pruned_loss=0.03562, over 24517.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2315, pruned_loss=0.03619, over 4685119.07 frames. ], batch size: 63, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:39:24,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 12:39:26,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:32,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:39:32,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 12:39:32,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:39:35,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:38,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:39:39,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1657453.3333333333, ans=0.125 2023-10-04 12:39:42,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:42,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:39:45,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:39:45,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:39:46,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:48,419 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:39:49,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:50,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:39:52,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:39:53,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 12:39:56,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 12:39:56,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:39:57,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:39:57,802 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:39:59,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:39:59,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:39:59,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:40:03,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:40:03,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:40:08,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:40:09,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:40:09,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 12:40:09,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:40:10,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 12:40:11,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:40:14,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:40:14,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:40:14,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:40:17,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.18 vs. limit=22.5 2023-10-04 12:40:17,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 12:40:18,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:40:20,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:40:24,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:40:25,050 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:40:27,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:40:29,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:40:31,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:40:31,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:40:33,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:40:34,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:40:34,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 12:40:34,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1657653.3333333333, ans=0.125 2023-10-04 12:40:35,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:40:39,653 INFO [train.py:1046] (3/4) Epoch 47, batch 4300, loss[loss=0.1545, simple_loss=0.2301, pruned_loss=0.03941, over 23565.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2316, pruned_loss=0.03619, over 4697212.29 frames. ], batch size: 256, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:40:39,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:40:39,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:40:39,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1657720.0, ans=0.2 2023-10-04 12:40:44,480 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.46 vs. limit=12.0 2023-10-04 12:40:45,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:40:52,816 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.021e+02 2.286e+02 2.632e+02 3.512e+02, threshold=4.572e+02, percent-clipped=0.0 2023-10-04 12:40:54,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:40:54,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 12:40:54,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:40:56,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1657786.6666666667, ans=0.125 2023-10-04 12:40:57,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:40:57,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:40:57,693 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 12:41:02,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:41:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:41:06,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 12:41:06,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:41:07,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 12:41:10,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:41:11,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:41:14,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:41:14,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:41:14,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:41:16,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:41:16,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:41:16,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 12:41:17,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 12:41:20,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:41:23,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:23,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:41:25,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:25,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:41:25,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 12:41:25,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 12:41:25,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 12:41:25,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:41:25,902 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.58 vs. limit=6.0 2023-10-04 12:41:27,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 12:41:27,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 12:41:30,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:41:31,571 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 12:41:33,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:41:35,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:41:35,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:41:37,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 12:41:37,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:41:37,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:38,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:41:38,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:41:40,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:41:40,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1657986.6666666667, ans=0.1 2023-10-04 12:41:42,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:41:44,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:41:44,885 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.44 vs. limit=15.0 2023-10-04 12:41:45,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:41:45,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:41:50,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1657986.6666666667, ans=0.2 2023-10-04 12:41:51,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 12:41:53,007 INFO [train.py:1046] (3/4) Epoch 47, batch 4350, loss[loss=0.1437, simple_loss=0.2308, pruned_loss=0.02835, over 24467.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2329, pruned_loss=0.03614, over 4718329.69 frames. ], batch size: 63, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:41:53,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 12:41:57,249 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:41:58,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:42:01,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:42:01,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:42:01,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1658053.3333333333, ans=0.125 2023-10-04 12:42:01,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1658053.3333333333, ans=0.125 2023-10-04 12:42:07,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:42:07,519 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:42:10,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:42:11,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:42:11,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:42:15,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:42:16,134 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.39 vs. limit=15.0 2023-10-04 12:42:18,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:42:20,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:42:26,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 12:42:26,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:42:28,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:31,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:34,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 12:42:38,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:42:40,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:42:40,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1658253.3333333333, ans=0.2 2023-10-04 12:42:44,385 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 12:42:45,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:42:47,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:42:47,208 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 12:42:48,600 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 12:42:48,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:42:48,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:42:49,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:42:50,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:42:51,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:42:51,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:42:54,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 12:42:54,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:54,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:42:54,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:42:54,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 12:42:56,767 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 12:42:56,771 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 12:42:56,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 12:42:59,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1658320.0, ans=0.0 2023-10-04 12:43:00,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:43:01,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:43:01,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:03,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:43:04,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1658320.0, ans=0.0 2023-10-04 12:43:05,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 12:43:05,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1658320.0, ans=0.125 2023-10-04 12:43:06,641 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 12:43:06,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:07,949 INFO [train.py:1046] (3/4) Epoch 47, batch 4400, loss[loss=0.1487, simple_loss=0.2262, pruned_loss=0.03562, over 24510.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2349, pruned_loss=0.03673, over 4709063.63 frames. ], batch size: 63, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:43:11,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:43:11,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:12,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:43:15,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 12:43:15,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 12:43:16,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 12:43:16,901 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 12:43:18,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 12:43:18,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:43:19,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1658386.6666666667, ans=0.125 2023-10-04 12:43:20,886 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.809e+02 2.049e+02 2.248e+02 2.577e+02 4.164e+02, threshold=4.496e+02, percent-clipped=0.0 2023-10-04 12:43:20,988 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 12:43:21,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.69 vs. limit=15.0 2023-10-04 12:43:22,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:22,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:22,469 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 12:43:25,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:25,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 12:43:25,813 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 12:43:29,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 12:43:30,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 12:43:30,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 12:43:31,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:33,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:43:33,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:43:34,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:43:36,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 12:43:37,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 12:43:38,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:39,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1658520.0, ans=0.0 2023-10-04 12:43:41,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:43:41,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:43:43,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:43,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:43:43,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 12:43:45,051 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 12:43:48,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:43:53,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:43:53,882 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:43:55,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 12:43:58,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:44:01,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:44:03,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1658586.6666666667, ans=0.125 2023-10-04 12:44:04,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:44:06,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 12:44:06,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:44:06,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:44:06,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:44:06,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:44:10,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 12:44:10,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1658653.3333333333, ans=0.125 2023-10-04 12:44:13,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 12:44:15,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 12:44:15,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:44:15,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 12:44:16,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:44:19,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:44:21,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1658720.0, ans=0.125 2023-10-04 12:44:22,201 INFO [train.py:1046] (3/4) Epoch 47, batch 4450, loss[loss=0.1407, simple_loss=0.2116, pruned_loss=0.0349, over 19091.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2359, pruned_loss=0.03714, over 4705600.39 frames. ], batch size: 41, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:44:22,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1658720.0, ans=0.125 2023-10-04 12:44:23,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 12:44:26,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:44:29,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:44:29,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:44:35,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:44:35,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:44:39,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:44:40,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:44:43,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:44:43,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:44:43,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 12:44:43,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:44:44,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:44:44,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:44:44,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 12:44:48,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:44:48,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1658786.6666666667, ans=0.1 2023-10-04 12:44:49,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1658786.6666666667, ans=0.125 2023-10-04 12:44:53,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:44:53,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:44:54,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1658853.3333333333, ans=0.2 2023-10-04 12:44:55,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:44:55,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:44:56,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:44:56,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1658853.3333333333, ans=0.04949747468305833 2023-10-04 12:45:00,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 12:45:01,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 12:45:01,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 12:45:01,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:45:04,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:45:05,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 12:45:07,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:45:11,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:45:13,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 12:45:13,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:45:13,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:45:13,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:45:14,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:45:16,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:45:19,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 12:45:19,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 12:45:20,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:45:23,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:45:24,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:45:26,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:45:26,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:45:29,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:45:30,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 12:45:32,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:45:36,894 INFO [train.py:1046] (3/4) Epoch 47, batch 4500, loss[loss=0.1634, simple_loss=0.2396, pruned_loss=0.04354, over 23254.00 frames. ], tot_loss[loss=0.1553, simple_loss=0.236, pruned_loss=0.03728, over 4696461.19 frames. ], batch size: 119, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:45:37,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:45:39,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 12:45:39,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 12:45:41,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:45:45,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:45:45,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:45:47,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 12:45:47,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:45:48,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:45:48,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:45:49,652 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.076e+02 2.315e+02 2.981e+02 4.706e+02, threshold=4.629e+02, percent-clipped=1.0 2023-10-04 12:45:58,804 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.59 vs. limit=22.5 2023-10-04 12:45:59,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:46:00,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:46:01,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:46:02,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:46:05,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:46:11,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:46:14,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:46:18,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:46:21,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:46:21,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 12:46:22,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:46:24,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:46:27,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:46:27,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:46:27,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1659253.3333333333, ans=0.0 2023-10-04 12:46:29,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:46:29,903 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 12:46:29,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 12:46:29,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:46:30,858 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=7.46 vs. limit=15.0 2023-10-04 12:46:34,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:46:34,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:46:37,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:46:39,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:46:39,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:46:41,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 12:46:41,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1659320.0, ans=0.0 2023-10-04 12:46:43,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 12:46:43,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 12:46:46,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1659320.0, ans=0.0 2023-10-04 12:46:48,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 12:46:49,742 INFO [train.py:1046] (3/4) Epoch 47, batch 4550, loss[loss=0.1384, simple_loss=0.2181, pruned_loss=0.02935, over 24587.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2348, pruned_loss=0.03704, over 4706086.99 frames. ], batch size: 60, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:46:49,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 12:46:51,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:46:53,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:46:55,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:46:57,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:47:03,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:47:05,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:47:05,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:47:05,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:47:05,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:08,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:47:08,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:47:12,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:47:15,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 12:47:16,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 12:47:17,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:47:18,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 12:47:21,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 12:47:21,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:47:25,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 12:47:25,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:47:30,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:30,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:30,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:47:31,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 12:47:33,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:47:36,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:36,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:47:36,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:47:36,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1659586.6666666667, ans=0.0 2023-10-04 12:47:37,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.94 vs. limit=15.0 2023-10-04 12:47:37,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 12:47:39,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 12:47:39,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:47:40,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 12:47:42,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 12:47:42,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:47:45,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:47:45,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:47:46,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:46,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:47:48,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:47:49,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 12:47:50,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.81 vs. limit=15.0 2023-10-04 12:47:50,423 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.33 vs. limit=15.0 2023-10-04 12:47:51,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:47:51,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 12:47:51,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 12:47:51,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:47:51,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 12:47:54,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:47:54,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:47:56,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:47:56,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:47:58,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 12:47:59,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:47:59,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:48:03,441 INFO [train.py:1046] (3/4) Epoch 47, batch 4600, loss[loss=0.1532, simple_loss=0.2332, pruned_loss=0.03655, over 23226.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2334, pruned_loss=0.03652, over 4699187.64 frames. ], batch size: 93, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:48:03,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:03,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:48:06,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:48:06,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:48:06,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:48:07,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 12:48:09,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:48:12,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:48:13,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:48:14,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:17,264 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.789e+02 2.101e+02 2.353e+02 2.748e+02 3.773e+02, threshold=4.707e+02, percent-clipped=0.0 2023-10-04 12:48:19,518 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.21 vs. limit=12.0 2023-10-04 12:48:21,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 12:48:21,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:25,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:27,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:48:28,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:48:33,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 12:48:33,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:48:35,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:48:39,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:41,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:48:43,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:48:46,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 12:48:48,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 12:48:52,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:48:55,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:48:58,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:48:58,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 12:48:58,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:48:59,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 12:48:59,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:48:59,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:49:01,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:49:02,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:49:02,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:49:04,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 12:49:04,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 12:49:04,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 12:49:04,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:49:06,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:49:08,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:49:08,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:49:08,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1659986.6666666667, ans=0.125 2023-10-04 12:49:10,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1659986.6666666667, ans=0.0 2023-10-04 12:49:15,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1659986.6666666667, ans=0.0 2023-10-04 12:49:18,390 INFO [train.py:1046] (3/4) Epoch 47, batch 4650, loss[loss=0.1522, simple_loss=0.2261, pruned_loss=0.03909, over 23933.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2336, pruned_loss=0.03634, over 4708159.92 frames. ], batch size: 195, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:49:18,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:49:20,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:49:21,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:49:21,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:49:21,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:49:22,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:49:23,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:49:26,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 12:49:30,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:49:33,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 12:49:33,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:49:33,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1660120.0, ans=0.125 2023-10-04 12:49:34,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 12:49:34,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:49:34,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 12:49:34,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 12:49:34,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:49:36,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:49:40,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:49:43,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:49:43,695 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 12:49:46,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:49:48,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 12:49:49,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:49:49,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:49:50,946 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 12:49:52,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:49:53,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:49:55,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1660186.6666666667, ans=0.125 2023-10-04 12:49:57,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:02,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:50:06,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:50:07,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:50:07,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:50:09,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 12:50:09,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 12:50:10,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 12:50:10,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 12:50:12,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:50:19,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:50:19,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:50:19,547 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 12:50:19,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:19,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:50:19,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:50:22,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:50:23,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:50:23,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:50:24,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:50:28,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:50:28,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:50:28,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:50:29,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 12:50:31,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 12:50:32,647 INFO [train.py:1046] (3/4) Epoch 47, batch 4700, loss[loss=0.1421, simple_loss=0.22, pruned_loss=0.03212, over 24543.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2341, pruned_loss=0.03655, over 4710403.20 frames. ], batch size: 60, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:50:32,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 12:50:40,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:42,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:50:42,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:50:43,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1660386.6666666667, ans=0.125 2023-10-04 12:50:44,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:50:44,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 12:50:46,210 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.148e+02 2.347e+02 2.759e+02 4.268e+02, threshold=4.695e+02, percent-clipped=0.0 2023-10-04 12:50:49,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 12:50:49,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 12:50:51,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:50:52,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:50:52,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1660453.3333333333, ans=0.04949747468305833 2023-10-04 12:50:53,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:50:54,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:51:02,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:51:04,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 12:51:05,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:51:13,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 12:51:14,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:51:14,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1660520.0, ans=0.125 2023-10-04 12:51:17,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:17,890 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 12:51:21,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 12:51:21,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:51:26,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:51:27,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 12:51:28,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:28,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:51:31,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:51:32,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:51:32,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 12:51:33,029 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 12:51:34,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:51:38,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:38,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:38,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 12:51:38,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1660653.3333333333, ans=0.125 2023-10-04 12:51:39,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:51:40,488 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.48 vs. limit=15.0 2023-10-04 12:51:42,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 12:51:46,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:51:47,646 INFO [train.py:1046] (3/4) Epoch 47, batch 4750, loss[loss=0.1352, simple_loss=0.2105, pruned_loss=0.02998, over 24291.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2346, pruned_loss=0.03652, over 4713261.67 frames. ], batch size: 56, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:51:47,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:51:52,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:51:52,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:51:54,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 12:51:54,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:51:57,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 12:51:58,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:51:58,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:51:58,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:52:05,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 12:52:08,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:52:11,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 12:52:12,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:52:15,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:52:15,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:52:15,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:52:17,092 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 12:52:17,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 12:52:21,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 12:52:23,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:52:25,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:52:26,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:52:26,717 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 12:52:26,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:52:28,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:52:31,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 12:52:34,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 12:52:34,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 12:52:35,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:52:36,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:52:36,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:52:38,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 12:52:38,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 12:52:42,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 12:52:44,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:52:47,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:52:47,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 12:52:47,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:52:49,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:52:49,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1660986.6666666667, ans=0.125 2023-10-04 12:52:51,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 12:52:53,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:52:53,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 12:52:58,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:52:58,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 12:52:58,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 12:53:00,035 INFO [train.py:1046] (3/4) Epoch 47, batch 4800, loss[loss=0.1399, simple_loss=0.2183, pruned_loss=0.03069, over 24584.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2347, pruned_loss=0.03642, over 4728174.69 frames. ], batch size: 60, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:53:00,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 12:53:01,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:53:01,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:53:04,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 12:53:07,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:09,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:14,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 12:53:15,289 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.757e+02 2.075e+02 2.450e+02 3.076e+02 6.025e+02, threshold=4.900e+02, percent-clipped=3.0 2023-10-04 12:53:15,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:53:15,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:17,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 12:53:17,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:53:19,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:53:20,104 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.35 vs. limit=15.0 2023-10-04 12:53:20,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 12:53:20,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1661120.0, ans=0.1 2023-10-04 12:53:24,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:53:25,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1661120.0, ans=0.0 2023-10-04 12:53:26,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:53:26,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 12:53:26,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1661120.0, ans=0.125 2023-10-04 12:53:27,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:53:27,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 12:53:27,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:28,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:53:31,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:53:33,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:33,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=1661186.6666666667, ans=0.05 2023-10-04 12:53:35,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:53:35,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:53:37,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 12:53:37,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:37,875 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.21 vs. limit=6.0 2023-10-04 12:53:38,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 12:53:38,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 12:53:39,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1661186.6666666667, ans=0.125 2023-10-04 12:53:41,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:53:42,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:53:43,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:53:43,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:53:43,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:53:44,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:53:44,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:53:48,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:53:48,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1661253.3333333333, ans=0.1 2023-10-04 12:53:49,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1661253.3333333333, ans=0.125 2023-10-04 12:53:51,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:53:51,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.52 vs. limit=12.0 2023-10-04 12:53:54,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:53:55,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1661253.3333333333, ans=0.0 2023-10-04 12:53:58,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 12:53:58,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:53:59,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:53:59,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:54:01,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:54:04,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:54:04,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.34 vs. limit=15.0 2023-10-04 12:54:05,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:54:05,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:05,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:54:05,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 12:54:06,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:54:09,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:54:11,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:11,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:54:13,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 12:54:13,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 12:54:14,693 INFO [train.py:1046] (3/4) Epoch 47, batch 4850, loss[loss=0.1502, simple_loss=0.2265, pruned_loss=0.03698, over 23497.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2353, pruned_loss=0.03683, over 4726232.28 frames. ], batch size: 134, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:54:14,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:54:14,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:54:14,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:54:14,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:17,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:54:17,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1661386.6666666667, ans=0.1 2023-10-04 12:54:26,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 12:54:26,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:54:29,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.17 vs. limit=15.0 2023-10-04 12:54:31,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:54:32,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 12:54:32,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:54:36,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:54:37,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:54:39,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:54:39,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 12:54:40,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1661453.3333333333, ans=0.125 2023-10-04 12:54:42,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:54:44,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:54:44,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 12:54:45,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 12:54:45,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 12:54:49,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 12:54:49,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:54:50,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1661520.0, ans=0.125 2023-10-04 12:54:55,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:54:55,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 12:54:56,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 12:54:57,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:55:03,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:55:04,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 12:55:04,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:55:04,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 12:55:06,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:55:06,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 12:55:06,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:55:07,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 12:55:07,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:55:07,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:55:09,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 12:55:17,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:55:19,560 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.00 vs. limit=10.0 2023-10-04 12:55:21,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.85 vs. limit=15.0 2023-10-04 12:55:21,193 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.44 vs. limit=5.0 2023-10-04 12:55:23,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:55:23,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:55:28,922 INFO [train.py:1046] (3/4) Epoch 47, batch 4900, loss[loss=0.1659, simple_loss=0.2473, pruned_loss=0.04226, over 23198.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2346, pruned_loss=0.03677, over 4730479.53 frames. ], batch size: 105, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:55:28,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 12:55:28,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:55:33,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:55:34,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:55:34,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:55:36,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1661720.0, ans=0.1 2023-10-04 12:55:38,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 12:55:43,497 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.716e+02 2.085e+02 2.441e+02 2.898e+02 5.040e+02, threshold=4.881e+02, percent-clipped=1.0 2023-10-04 12:55:45,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 12:55:47,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 12:55:49,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 12:55:49,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:55:49,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:55:49,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:55:49,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:55:49,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 12:55:50,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 12:55:55,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 12:55:55,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:55:57,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 12:55:57,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 12:55:59,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:56:00,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:56:01,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:01,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 12:56:04,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:56:05,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:56:05,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 12:56:05,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 12:56:09,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 12:56:11,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 12:56:11,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:56:11,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:56:13,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:56:14,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 12:56:14,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:56:14,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 12:56:17,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:19,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:56:20,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:56:25,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 12:56:25,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:56:28,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 12:56:28,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 12:56:32,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:56:34,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:56:35,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 12:56:35,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:56:35,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:56:38,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:41,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:56:41,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 12:56:41,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:56:41,838 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.32 vs. limit=15.0 2023-10-04 12:56:42,880 INFO [train.py:1046] (3/4) Epoch 47, batch 4950, loss[loss=0.1543, simple_loss=0.2355, pruned_loss=0.03657, over 23712.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2341, pruned_loss=0.03648, over 4738861.27 frames. ], batch size: 149, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 12:56:42,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 12:56:44,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 12:56:47,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:56:47,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 12:56:50,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 12:56:50,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 12:56:50,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 12:56:51,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 12:56:51,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:56:51,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:56:53,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 12:56:53,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:56:56,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:56:56,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 12:56:57,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:56:59,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:57:00,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:01,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:57:04,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 12:57:08,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:10,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1662186.6666666667, ans=0.125 2023-10-04 12:57:11,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 12:57:13,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:13,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:16,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 12:57:16,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 12:57:17,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 12:57:19,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:21,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 12:57:21,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:57:21,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:57:21,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:57:22,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 12:57:26,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:57:27,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 12:57:29,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 12:57:31,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:57:31,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:33,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 12:57:33,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:57:34,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 12:57:35,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1662253.3333333333, ans=0.0 2023-10-04 12:57:38,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:57:40,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 12:57:40,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 12:57:40,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:57:40,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 12:57:41,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 12:57:43,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:57:44,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 12:57:44,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:57:45,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 12:57:51,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:57:56,425 INFO [train.py:1046] (3/4) Epoch 47, batch 5000, loss[loss=0.1565, simple_loss=0.2472, pruned_loss=0.03292, over 24347.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2337, pruned_loss=0.03627, over 4740020.30 frames. ], batch size: 74, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 12:57:56,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 12:57:56,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 12:58:04,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:58:04,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:58:05,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 12:58:06,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 12:58:09,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:58:10,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 12:58:10,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 12:58:10,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 12:58:12,123 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.008e+02 2.119e+02 2.497e+02 3.372e+02, threshold=4.238e+02, percent-clipped=0.0 2023-10-04 12:58:13,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 12:58:13,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:58:14,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:58:14,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 12:58:14,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:58:14,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:58:17,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 12:58:17,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 12:58:17,722 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.73 vs. limit=15.0 2023-10-04 12:58:18,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 12:58:18,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 12:58:18,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 12:58:19,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:19,737 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:58:19,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 12:58:21,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 12:58:22,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 12:58:22,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:58:24,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:26,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 12:58:26,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:58:27,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:29,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:58:29,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 12:58:32,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 12:58:33,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 12:58:34,549 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.14 vs. limit=15.0 2023-10-04 12:58:34,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 12:58:37,690 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 12:58:41,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 12:58:43,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 12:58:43,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:58:45,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 12:58:45,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 12:58:45,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:58:45,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:58:47,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 12:58:49,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:58:51,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 12:58:53,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:58:58,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 12:58:58,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1662653.3333333333, ans=0.125 2023-10-04 12:59:01,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:07,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1662653.3333333333, ans=0.125 2023-10-04 12:59:09,861 INFO [train.py:1046] (3/4) Epoch 47, batch 5050, loss[loss=0.1498, simple_loss=0.2333, pruned_loss=0.03317, over 24380.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2339, pruned_loss=0.03632, over 4726315.91 frames. ], batch size: 61, lr: 2.15e-03, grad_scale: 4.0 2023-10-04 12:59:11,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 12:59:12,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:12,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 12:59:12,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:59:12,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 12:59:14,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 12:59:14,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:15,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1662720.0, ans=0.0 2023-10-04 12:59:18,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 12:59:18,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 12:59:20,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 12:59:21,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 12:59:22,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 12:59:23,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 12:59:26,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:59:26,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 12:59:29,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 12:59:29,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 12:59:30,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 12:59:39,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 12:59:40,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 12:59:40,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 12:59:41,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 12:59:42,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 12:59:43,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:59:43,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 12:59:45,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 12:59:45,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 12:59:46,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 12:59:46,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:59:49,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 12:59:52,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 12:59:53,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 12:59:55,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 12:59:56,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 12:59:58,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 12:59:58,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:00:00,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:00:00,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:00:00,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1662920.0, ans=0.125 2023-10-04 13:00:03,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:00:03,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1662920.0, ans=0.125 2023-10-04 13:00:03,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1662920.0, ans=0.0 2023-10-04 13:00:06,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:00:06,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:06,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:00:06,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:00:06,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 13:00:07,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:00:09,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:00:13,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:00:13,281 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 13:00:13,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:00:14,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:00:14,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:14,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1662986.6666666667, ans=0.0 2023-10-04 13:00:15,839 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 13:00:17,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:00:17,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 13:00:17,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:18,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1662986.6666666667, ans=0.0 2023-10-04 13:00:22,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:00:22,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:22,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 13:00:23,424 INFO [train.py:1046] (3/4) Epoch 47, batch 5100, loss[loss=0.1576, simple_loss=0.2444, pruned_loss=0.03536, over 24454.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2342, pruned_loss=0.03634, over 4720478.60 frames. ], batch size: 77, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 13:00:23,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 13:00:24,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:00:24,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:00:24,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:00:28,045 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 13:00:29,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:00:33,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 13:00:33,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 13:00:34,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:00:35,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.01 vs. limit=10.0 2023-10-04 13:00:35,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:00:36,775 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.69 vs. limit=15.0 2023-10-04 13:00:38,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:00:38,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 13:00:40,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 13:00:41,329 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 2.103e+02 2.372e+02 2.934e+02 4.830e+02, threshold=4.743e+02, percent-clipped=2.0 2023-10-04 13:00:41,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1663120.0, ans=0.125 2023-10-04 13:00:44,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:00:44,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:00:48,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:00:49,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1663120.0, ans=0.0 2023-10-04 13:00:53,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 13:00:53,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:00:54,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:00:54,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 13:00:57,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:00,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:00,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 13:01:03,191 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 13:01:03,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:03,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 13:01:03,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 13:01:07,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:01:07,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1663253.3333333333, ans=0.0 2023-10-04 13:01:15,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:01:18,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 13:01:18,876 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 13:01:18,891 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 13:01:20,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 13:01:20,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:01:23,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 13:01:25,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1663320.0, ans=0.1 2023-10-04 13:01:27,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 13:01:30,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:01:30,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:01:31,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 13:01:34,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:01:34,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 13:01:37,697 INFO [train.py:1046] (3/4) Epoch 47, batch 5150, loss[loss=0.2109, simple_loss=0.2755, pruned_loss=0.07318, over 19460.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2349, pruned_loss=0.03666, over 4720490.93 frames. ], batch size: 388, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 13:01:39,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:01:39,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:01:39,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:01:41,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:01:41,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:01:42,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:01:43,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 13:01:43,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 13:01:43,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 13:01:45,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:01:45,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 13:01:46,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:01:46,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 13:01:48,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:01:48,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:01:54,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:01:55,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 13:01:55,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:01:55,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:01:58,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:01:58,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:01:58,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:01:59,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:01:59,935 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:02:00,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1663453.3333333333, ans=0.0 2023-10-04 13:02:01,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 13:02:03,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:02:03,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:02:04,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 13:02:04,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1663453.3333333333, ans=0.125 2023-10-04 13:02:05,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.74 vs. limit=15.0 2023-10-04 13:02:06,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 13:02:08,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:02:12,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:02:14,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 13:02:15,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:02:22,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1663586.6666666667, ans=0.125 2023-10-04 13:02:23,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:02:23,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:02:27,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:02:27,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:02:31,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 13:02:34,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:02:34,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:02:36,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:02:39,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:02:39,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:02:41,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 13:02:45,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:02:46,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 13:02:51,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:02:51,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:02:51,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:02:51,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:02:51,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:02:51,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:02:52,696 INFO [train.py:1046] (3/4) Epoch 47, batch 5200, loss[loss=0.1585, simple_loss=0.2507, pruned_loss=0.03317, over 24634.00 frames. ], tot_loss[loss=0.154, simple_loss=0.235, pruned_loss=0.03646, over 4716534.06 frames. ], batch size: 68, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 13:02:53,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1663720.0, ans=0.125 2023-10-04 13:02:53,662 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.77 vs. limit=6.0 2023-10-04 13:02:55,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:02:56,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:02:59,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:04,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 13:03:04,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:03:05,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:03:07,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:09,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:03:09,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:03:11,067 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.051e+02 2.238e+02 2.543e+02 5.592e+02, threshold=4.477e+02, percent-clipped=1.0 2023-10-04 13:03:12,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 13:03:15,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:03:16,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:03:18,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 13:03:20,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:03:20,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:03:20,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1663786.6666666667, ans=0.125 2023-10-04 13:03:21,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 13:03:21,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 13:03:23,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 13:03:24,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:03:24,311 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 13:03:24,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:03:25,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:03:25,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:03:26,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 13:03:26,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:03:28,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:30,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 13:03:31,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 13:03:31,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 13:03:36,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 13:03:37,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:03:43,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:03:44,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:03:44,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 13:03:46,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:03:46,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:03:46,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:03:46,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:03:46,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1663920.0, ans=0.125 2023-10-04 13:03:51,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:03:52,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:03:55,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:03:55,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:03:55,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:03:59,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1663986.6666666667, ans=0.07 2023-10-04 13:04:01,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:04:02,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 13:04:02,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:04:04,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:04:05,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:04:06,774 INFO [train.py:1046] (3/4) Epoch 47, batch 5250, loss[loss=0.157, simple_loss=0.2518, pruned_loss=0.03114, over 24652.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2351, pruned_loss=0.03657, over 4716286.59 frames. ], batch size: 73, lr: 2.15e-03, grad_scale: 16.0 2023-10-04 13:04:06,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:04:06,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:04:10,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:04:12,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:04:13,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:04:16,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:04:20,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:04:22,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:04:25,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:04:25,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:04:27,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 13:04:27,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:04:29,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:04:35,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1664186.6666666667, ans=0.95 2023-10-04 13:05:15,767 INFO [train.py:1046] (3/4) Epoch 47, batch 5300, loss[loss=0.1201, simple_loss=0.1733, pruned_loss=0.03348, over 19373.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2335, pruned_loss=0.03636, over 4706737.83 frames. ], batch size: 388, lr: 2.15e-03, grad_scale: 8.0 2023-10-04 13:05:20,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1664386.6666666667, ans=0.0 2023-10-04 13:05:25,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.18 vs. limit=15.0 2023-10-04 13:05:28,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1664453.3333333333, ans=0.1 2023-10-04 13:05:29,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:05:29,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 13:05:29,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 13:05:29,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:05:30,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:30,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:30,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:05:30,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:05:30,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:05:30,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:05:30,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:05:30,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:05:30,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 13:05:31,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 13:05:31,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 13:05:31,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:05:31,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 13:05:31,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 13:05:31,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:05:31,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:05:31,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:05:31,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:05:31,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:05:32,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:05:32,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:05:32,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:32,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:05:32,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:05:32,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:05:32,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:05:32,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:05:33,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 13:05:33,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:05:33,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:05:33,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 13:05:33,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 13:05:33,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:05:33,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:05:33,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 13:05:34,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 13:05:34,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:05:34,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:05:34,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:05:34,733 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 13:05:34,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 13:05:34,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:05:34,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:05:35,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 13:05:35,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 13:05:35,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 13:05:35,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:05:36,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1664466.6666666667, ans=0.125 2023-10-04 13:05:41,672 INFO [train.py:1046] (3/4) Epoch 48, batch 0, loss[loss=0.1513, simple_loss=0.2471, pruned_loss=0.02778, over 24328.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2471, pruned_loss=0.02778, over 24328.00 frames. ], batch size: 74, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:05:41,673 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 13:05:54,831 INFO [train.py:1078] (3/4) Epoch 48, validation: loss=0.3604, simple_loss=0.2801, pruned_loss=0.2204, over 1125622.00 frames. 2023-10-04 13:05:54,831 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 13:05:56,146 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.805e+02 2.072e+02 2.267e+02 2.671e+02 6.295e+02, threshold=4.535e+02, percent-clipped=1.0 2023-10-04 13:05:56,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 13:05:56,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:05:59,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:06:05,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:05,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:06:05,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:06,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 13:06:07,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 13:06:09,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:10,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1664533.3333333333, ans=0.0 2023-10-04 13:06:11,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:15,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:06:17,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:17,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:06:17,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:06:18,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 13:06:21,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:06:27,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:06:28,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:30,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 13:06:34,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:06:34,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:06:34,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1664600.0, ans=0.125 2023-10-04 13:06:35,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:06:40,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:06:43,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:06:48,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 13:06:52,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 13:06:52,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:06:52,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:06:54,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:06:54,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:06:56,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 13:06:57,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:06:57,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:07:02,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:07:04,843 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 13:07:06,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:07:09,447 INFO [train.py:1046] (3/4) Epoch 48, batch 50, loss[loss=0.1466, simple_loss=0.2317, pruned_loss=0.03081, over 24488.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2348, pruned_loss=0.0356, over 1064072.78 frames. ], batch size: 66, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:07:11,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:07:12,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:07:12,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 13:07:14,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:07:14,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:07:16,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:07:18,239 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:07:18,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1664800.0, ans=0.0 2023-10-04 13:07:20,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:07:22,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 13:07:22,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:07:30,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:07:32,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 13:07:33,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 13:07:35,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:07:37,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:07:37,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:07:37,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:07:37,919 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.20 vs. limit=15.0 2023-10-04 13:07:38,174 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.81 vs. limit=6.0 2023-10-04 13:07:38,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:07:38,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 13:07:38,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:07:42,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1664933.3333333333, ans=0.2 2023-10-04 13:07:44,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1664933.3333333333, ans=0.1 2023-10-04 13:07:46,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:07:48,194 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:07:48,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:07:50,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 13:07:51,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:07:52,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:07:52,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 13:07:54,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:07:55,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 13:08:04,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:08:04,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:08:04,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:08:04,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1665000.0, ans=0.125 2023-10-04 13:08:05,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:08:05,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:08:08,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 13:08:08,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 13:08:10,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:08:11,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:08:12,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.out_whiten.whitening_limit, batch_count=1665066.6666666667, ans=8.0 2023-10-04 13:08:12,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:08:12,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:08:13,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 13:08:14,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 13:08:16,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 13:08:18,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:18,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:08:19,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 13:08:19,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 13:08:21,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:21,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:08:24,075 INFO [train.py:1046] (3/4) Epoch 48, batch 100, loss[loss=0.145, simple_loss=0.2272, pruned_loss=0.03141, over 24473.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2355, pruned_loss=0.03587, over 1867605.68 frames. ], batch size: 63, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:08:24,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:08:24,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:08:25,442 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.052e+02 2.272e+02 2.677e+02 5.287e+02, threshold=4.544e+02, percent-clipped=2.0 2023-10-04 13:08:25,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:08:29,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:08:32,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:08:34,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 13:08:34,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:08:38,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:08:38,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:08:38,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:08:38,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:08:38,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:08:40,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 13:08:41,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:08:41,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:42,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:08:42,770 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:08:46,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 13:08:46,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1665200.0, ans=0.2 2023-10-04 13:08:47,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:08:48,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:08:49,347 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.58 vs. limit=10.0 2023-10-04 13:08:50,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:08:52,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:08:52,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1665266.6666666667, ans=0.1 2023-10-04 13:08:57,923 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 13:08:57,946 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 13:09:00,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:00,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:09:03,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:09:04,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:09:06,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:10,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:11,258 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:09:12,437 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 13:09:13,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 13:09:16,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:09:18,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:09:18,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1665333.3333333333, ans=0.125 2023-10-04 13:09:19,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:24,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:26,181 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.80 vs. limit=6.0 2023-10-04 13:09:27,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:09:28,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:09:30,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:31,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:09:33,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:33,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:09:33,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:09:33,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1665400.0, ans=0.0 2023-10-04 13:09:34,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 13:09:34,448 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 13:09:34,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:35,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:09:35,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:35,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:36,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1665466.6666666667, ans=0.125 2023-10-04 13:09:37,760 INFO [train.py:1046] (3/4) Epoch 48, batch 150, loss[loss=0.1527, simple_loss=0.2435, pruned_loss=0.03095, over 24647.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2355, pruned_loss=0.03651, over 2488373.76 frames. ], batch size: 68, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:09:37,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 13:09:37,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:09:37,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:09:37,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:39,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:09:39,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:39,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:09:40,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:09:43,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:09:46,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:09:46,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:09:46,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:49,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:09:49,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:51,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:09:52,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:09:55,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 13:09:55,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 13:09:55,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 13:10:00,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:10:00,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:10:01,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:10:03,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:10:03,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:10:03,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:10:03,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:10:04,626 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 13:10:06,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:10:12,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:10:13,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1665600.0, ans=0.125 2023-10-04 13:10:16,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:10:16,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 13:10:19,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:10:19,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:10:19,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:10:22,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1665666.6666666667, ans=0.125 2023-10-04 13:10:23,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:10:23,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:10:25,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:10:26,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:10:28,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 13:10:34,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:10:35,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:10:35,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:10:35,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:10:38,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:10:38,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 13:10:41,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:10:43,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:10:44,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:10:46,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:10:47,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 13:10:47,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:10:47,370 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 13:10:50,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1665800.0, ans=0.125 2023-10-04 13:10:51,287 INFO [train.py:1046] (3/4) Epoch 48, batch 200, loss[loss=0.1409, simple_loss=0.2217, pruned_loss=0.03003, over 24584.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2357, pruned_loss=0.03664, over 2995855.57 frames. ], batch size: 60, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:10:51,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:10:54,670 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.085e+02 2.349e+02 2.813e+02 4.148e+02, threshold=4.699e+02, percent-clipped=0.0 2023-10-04 13:10:54,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:10:54,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:10:57,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 13:10:59,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:10:59,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:02,517 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 13:11:03,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:11:05,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:06,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:11:09,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:11:10,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:11:10,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:30,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:11:30,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:11:31,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:11:31,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:11:31,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 13:11:31,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:11:35,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:11:36,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:11:36,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:11:37,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:11:39,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 13:11:39,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:11:40,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:11:43,884 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:11:49,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:11:56,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:11:56,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:12:01,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:04,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 13:12:05,742 INFO [train.py:1046] (3/4) Epoch 48, batch 250, loss[loss=0.158, simple_loss=0.2482, pruned_loss=0.03393, over 24681.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2358, pruned_loss=0.03637, over 3378638.41 frames. ], batch size: 73, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:12:05,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:12:05,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:12:05,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:12:07,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:12:07,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 13:12:09,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:12:09,235 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 13:12:10,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:13,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:12:13,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:15,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:12:17,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:12:17,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:12:19,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:12:20,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:12:32,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:12:33,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1666200.0, ans=0.125 2023-10-04 13:12:34,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:12:35,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:12:42,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:12:42,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:12:44,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:12:44,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:12:46,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:12:46,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:12:46,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:12:49,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:12:51,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 13:12:51,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:12:53,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:12:53,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:12:53,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:12:54,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:12:55,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:12:55,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:12:57,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:12:59,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:12:59,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:13:00,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1666333.3333333333, ans=0.0 2023-10-04 13:13:01,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:13:06,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:13:11,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:13:14,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:13:16,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:13:16,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1666400.0, ans=0.1 2023-10-04 13:13:18,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 13:13:20,237 INFO [train.py:1046] (3/4) Epoch 48, batch 300, loss[loss=0.1686, simple_loss=0.2383, pruned_loss=0.0494, over 23871.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2345, pruned_loss=0.03608, over 3683091.75 frames. ], batch size: 179, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:13:20,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:13:20,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:13:22,912 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.014e+02 2.190e+02 2.558e+02 4.207e+02, threshold=4.380e+02, percent-clipped=0.0 2023-10-04 13:13:23,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 13:13:23,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:13:23,678 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.25 vs. limit=6.0 2023-10-04 13:13:24,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:13:24,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 13:13:27,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1666466.6666666667, ans=0.0 2023-10-04 13:13:29,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:13:29,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:13:34,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:13:34,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 13:13:36,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:13:37,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:13:37,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 13:13:37,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:13:39,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=1666533.3333333333, ans=0.0 2023-10-04 13:13:42,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:13:46,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:13:46,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 13:13:49,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 13:13:49,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:13:52,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:13:55,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:13:55,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 13:13:55,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:13:55,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1666600.0, ans=0.0 2023-10-04 13:13:56,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:13:58,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:13:58,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:14:02,794 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 13:14:02,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 13:14:02,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:14:07,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:09,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 13:14:09,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:14:14,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:14:17,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:14:17,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 13:14:20,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:20,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:14:22,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:23,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:14:24,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 13:14:24,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:14:24,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:14:25,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 13:14:28,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:14:28,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:30,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:14:30,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:14:31,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:34,147 INFO [train.py:1046] (3/4) Epoch 48, batch 350, loss[loss=0.1487, simple_loss=0.2305, pruned_loss=0.03344, over 24631.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.232, pruned_loss=0.03558, over 3898431.91 frames. ], batch size: 68, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:14:35,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:14:36,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 13:14:38,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:44,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:14:47,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:14:47,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:50,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 13:14:51,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:14:51,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 13:14:55,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:14:55,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 13:14:56,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:14:58,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 13:14:58,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1666866.6666666667, ans=0.1 2023-10-04 13:14:59,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:15:00,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:15:02,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:15:03,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:03,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:03,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:15:03,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:15:05,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:15:06,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:15:06,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:15:13,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:15:14,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:15:14,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:15:15,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:15:20,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 13:15:20,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:15:23,438 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.41 vs. limit=15.0 2023-10-04 13:15:24,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:15:24,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:15:24,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:15:25,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 13:15:29,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:29,204 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 13:15:30,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 13:15:30,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:33,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:15:33,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 13:15:34,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:39,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:15:39,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:41,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:41,123 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:15:43,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:15:46,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:15:47,971 INFO [train.py:1046] (3/4) Epoch 48, batch 400, loss[loss=0.1419, simple_loss=0.2253, pruned_loss=0.02925, over 24649.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2315, pruned_loss=0.03527, over 4079540.44 frames. ], batch size: 60, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:15:48,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1667133.3333333333, ans=0.2 2023-10-04 13:15:49,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:15:49,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 13:15:49,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:15:50,746 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.047e+02 2.274e+02 2.611e+02 3.617e+02, threshold=4.549e+02, percent-clipped=0.0 2023-10-04 13:15:50,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:15:52,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:15:53,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:15:56,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:15:57,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:15:59,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 13:16:01,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 13:16:01,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:16:01,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1667200.0, ans=0.125 2023-10-04 13:16:02,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 13:16:03,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:16:06,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:16:06,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:16:06,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 13:16:08,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:16:08,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:16:08,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:16:08,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:16:13,150 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 13:16:13,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 13:16:17,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:16:18,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:16:19,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 13:16:20,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 13:16:24,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:16:27,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:16:32,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1667333.3333333333, ans=0.125 2023-10-04 13:16:32,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1667333.3333333333, ans=0.0 2023-10-04 13:16:33,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 13:16:33,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1667333.3333333333, ans=0.0 2023-10-04 13:16:36,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:16:37,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 13:16:39,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:16:40,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:16:40,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 13:16:45,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:16:47,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:16:48,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:16:51,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:16:52,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 13:16:54,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:16:55,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 13:16:56,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:16:56,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:16:58,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 13:16:59,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:16:59,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:16:59,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:17:01,174 INFO [train.py:1046] (3/4) Epoch 48, batch 450, loss[loss=0.1574, simple_loss=0.2374, pruned_loss=0.0387, over 23794.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2324, pruned_loss=0.03539, over 4224706.25 frames. ], batch size: 212, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:17:01,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 13:17:01,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:17:01,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:17:03,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:17:03,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 13:17:04,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:17:06,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:17:07,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:17:18,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:17:18,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:17:20,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 13:17:21,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 13:17:23,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:17:24,992 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.34 vs. limit=15.0 2023-10-04 13:17:25,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:17:27,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:17:29,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:17:31,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:17:34,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 13:17:35,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 13:17:38,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 13:17:38,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:17:38,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1667600.0, ans=0.125 2023-10-04 13:17:38,890 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.28 vs. limit=15.0 2023-10-04 13:17:39,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:17:41,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:17:43,580 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 13:17:43,588 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 13:17:44,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:17:46,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:17:47,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 13:17:51,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:17:51,089 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:17:52,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 13:17:52,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1667666.6666666667, ans=0.125 2023-10-04 13:17:53,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 13:17:55,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:17:56,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:17:56,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:17:59,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 13:18:03,090 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.78 vs. limit=15.0 2023-10-04 13:18:03,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:18:03,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 13:18:05,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 13:18:05,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1667733.3333333333, ans=10.0 2023-10-04 13:18:06,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:18:11,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:18:14,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:18:15,893 INFO [train.py:1046] (3/4) Epoch 48, batch 500, loss[loss=0.1577, simple_loss=0.2501, pruned_loss=0.03268, over 24650.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2326, pruned_loss=0.03548, over 4342365.25 frames. ], batch size: 73, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:18:15,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:18:15,973 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 13:18:18,920 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 1.967e+02 2.163e+02 2.442e+02 3.421e+02, threshold=4.326e+02, percent-clipped=0.0 2023-10-04 13:18:19,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:18:20,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:18:20,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:18:20,488 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 13:18:21,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 13:18:21,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:18:25,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 13:18:27,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 13:18:29,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:18:29,542 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:18:31,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:18:31,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:18:33,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:18:36,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1667866.6666666667, ans=0.05 2023-10-04 13:18:44,237 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.18 vs. limit=15.0 2023-10-04 13:18:44,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:18:44,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:18:44,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:18:46,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:18:46,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 13:18:46,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:18:47,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1667933.3333333333, ans=0.125 2023-10-04 13:18:49,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.31 vs. limit=15.0 2023-10-04 13:18:49,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:18:51,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:18:51,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:18:51,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:18:52,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 13:18:56,678 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 13:18:58,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:19:00,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:01,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:01,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:01,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:19:03,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1668000.0, ans=0.1 2023-10-04 13:19:04,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 13:19:07,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:19:08,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:13,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:19:14,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:19:17,053 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:19:20,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:19:20,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1668066.6666666667, ans=0.125 2023-10-04 13:19:22,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 13:19:22,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:22,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:19:24,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1668066.6666666667, ans=0.125 2023-10-04 13:19:25,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 13:19:25,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1668066.6666666667, ans=0.05 2023-10-04 13:19:26,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:19:28,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:28,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1668133.3333333333, ans=0.125 2023-10-04 13:19:29,991 INFO [train.py:1046] (3/4) Epoch 48, batch 550, loss[loss=0.1751, simple_loss=0.2439, pruned_loss=0.05317, over 23557.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2343, pruned_loss=0.03626, over 4420531.57 frames. ], batch size: 256, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:19:34,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 13:19:36,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 13:19:36,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:19:36,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 13:19:37,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:19:37,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:19:38,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:39,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:39,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:19:42,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:19:44,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:19:44,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 13:19:44,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:19:49,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:19:50,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:52,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:19:53,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:19:55,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 13:19:56,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 13:19:58,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:20:02,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:20:02,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:20:05,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:20:05,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1668266.6666666667, ans=0.125 2023-10-04 13:20:06,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:08,203 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 13:20:08,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:20:09,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:20:12,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:20:14,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:20:14,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:20:16,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:17,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 13:20:20,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 13:20:20,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:20:20,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:20:21,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:20:21,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:20:23,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:20:24,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:20:26,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:20:27,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:28,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 13:20:29,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:20:30,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:20:32,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:20:32,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:20:33,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:20:33,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 13:20:40,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 13:20:43,297 INFO [train.py:1046] (3/4) Epoch 48, batch 600, loss[loss=0.142, simple_loss=0.2233, pruned_loss=0.03033, over 24303.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2343, pruned_loss=0.03601, over 4508180.77 frames. ], batch size: 61, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:20:43,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 13:20:45,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:20:45,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:20:45,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:20:46,919 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 2.073e+02 2.337e+02 2.691e+02 3.660e+02, threshold=4.674e+02, percent-clipped=0.0 2023-10-04 13:20:53,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:20:55,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:20:57,262 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 13:20:59,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:21:01,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:21:03,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:21:04,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 13:21:04,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:21:10,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 13:21:12,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:21:12,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:21:13,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1668600.0, ans=0.125 2023-10-04 13:21:14,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:21:19,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:21:19,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:21:20,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:21:22,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1668600.0, ans=0.125 2023-10-04 13:21:23,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=1668600.0, ans=0.125 2023-10-04 13:21:29,181 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:21:32,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:21:32,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:21:32,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:21:38,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 13:21:43,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:21:43,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:21:45,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 13:21:47,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:21:48,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 13:21:49,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:21:49,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:21:50,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1668733.3333333333, ans=0.2 2023-10-04 13:21:56,554 INFO [train.py:1046] (3/4) Epoch 48, batch 650, loss[loss=0.1387, simple_loss=0.2236, pruned_loss=0.02684, over 24494.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2325, pruned_loss=0.03619, over 4526404.05 frames. ], batch size: 63, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:21:56,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 13:21:58,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:21:59,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:22:00,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:22:02,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:06,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 13:22:06,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:22:11,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:22:11,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:22:16,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:22:19,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 13:22:21,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:22:21,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:22:23,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:22:23,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 13:22:26,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:22:28,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:28,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:22:29,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:30,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:22:34,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:22:34,280 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 13:22:34,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:22:34,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:22:38,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:38,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:22:39,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:22:40,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:22:41,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 13:22:43,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:22:43,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:22:43,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:22:43,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:22:45,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:22:46,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 13:22:49,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 13:22:49,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:49,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:22:50,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:22:50,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:22:52,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:22:56,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:22:56,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:22:58,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:23:00,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:23:00,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:23:01,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:23:09,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:23:09,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:23:09,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:23:10,817 INFO [train.py:1046] (3/4) Epoch 48, batch 700, loss[loss=0.1499, simple_loss=0.2361, pruned_loss=0.03187, over 24463.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2319, pruned_loss=0.03581, over 4568911.18 frames. ], batch size: 63, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:23:10,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:23:16,004 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.014e+02 2.298e+02 2.689e+02 4.568e+02, threshold=4.597e+02, percent-clipped=0.0 2023-10-04 13:23:16,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 13:23:16,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 13:23:17,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 13:23:19,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:23:21,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:23:21,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1669133.3333333333, ans=0.125 2023-10-04 13:23:22,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 13:23:27,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:23:29,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:23:32,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:23:32,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:23:32,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:23:34,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:23:39,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 13:23:39,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:23:40,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 13:23:44,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 13:23:47,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:23:47,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:23:49,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:23:54,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:23:54,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 13:23:58,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:23:58,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:23:59,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 13:24:02,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:24:04,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:24:07,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1669333.3333333333, ans=0.0 2023-10-04 13:24:08,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:24:11,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:24:11,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 13:24:14,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 13:24:14,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 13:24:16,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1669400.0, ans=0.0 2023-10-04 13:24:19,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:24:21,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:24:21,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:24:24,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:24:24,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 13:24:25,874 INFO [train.py:1046] (3/4) Epoch 48, batch 750, loss[loss=0.141, simple_loss=0.2249, pruned_loss=0.02855, over 24503.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.232, pruned_loss=0.03576, over 4597732.58 frames. ], batch size: 63, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:24:27,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 13:24:27,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 13:24:27,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 13:24:27,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1669466.6666666667, ans=0.125 2023-10-04 13:24:28,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 13:24:28,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 13:24:29,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1669466.6666666667, ans=0.125 2023-10-04 13:24:30,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:24:30,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 13:24:32,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:24:32,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:24:34,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:24:35,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:24:35,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:24:37,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:24:39,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1669533.3333333333, ans=0.125 2023-10-04 13:24:40,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:24:40,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:24:41,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:24:43,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:24:43,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1669533.3333333333, ans=0.2 2023-10-04 13:24:45,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:24:46,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 13:24:47,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:24:49,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:24:52,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:24:52,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:24:53,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 13:24:53,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:24:56,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 13:24:56,491 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 13:24:57,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 13:24:57,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:24:58,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1669600.0, ans=0.0 2023-10-04 13:24:59,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 13:25:00,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:25:07,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:25:07,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:07,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:25:10,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:25:12,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:12,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 13:25:12,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:25:13,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 13:25:15,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:25:17,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:25:17,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 13:25:19,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:24,232 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.51 vs. limit=15.0 2023-10-04 13:25:24,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:25:26,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:25:27,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:25:29,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:25:33,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 13:25:33,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:25:34,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:25:34,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1669733.3333333333, ans=0.125 2023-10-04 13:25:35,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:25:37,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:25:38,589 INFO [train.py:1046] (3/4) Epoch 48, batch 800, loss[loss=0.1579, simple_loss=0.2352, pruned_loss=0.04027, over 23788.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2327, pruned_loss=0.03593, over 4628792.88 frames. ], batch size: 212, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:25:38,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:38,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:25:39,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1669800.0, ans=0.2 2023-10-04 13:25:43,439 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.959e+02 2.276e+02 2.649e+02 3.901e+02, threshold=4.552e+02, percent-clipped=0.0 2023-10-04 13:25:44,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:25:44,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:48,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:25:48,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:25:48,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:48,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:25:50,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:25:54,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:25:55,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:25:57,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 13:25:58,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:25:59,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:25:59,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:25:59,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:25:59,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 13:26:01,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:26:01,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 13:26:03,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:26:06,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:26:08,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:26:08,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:26:11,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:26:11,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:26:14,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:26:15,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:26:15,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 13:26:17,395 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 13:26:17,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 13:26:17,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:26:18,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:26:18,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:26:18,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:26:24,810 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 13:26:24,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 13:26:26,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:26:29,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:26:33,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:26:37,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:26:37,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 13:26:37,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:26:40,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 13:26:46,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:26:48,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:26:48,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1670066.6666666667, ans=0.1 2023-10-04 13:26:49,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 13:26:49,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:26:51,788 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:26:53,140 INFO [train.py:1046] (3/4) Epoch 48, batch 850, loss[loss=0.1582, simple_loss=0.2418, pruned_loss=0.03727, over 23411.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2338, pruned_loss=0.03623, over 4645275.64 frames. ], batch size: 93, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:26:53,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 13:26:53,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:26:53,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:26:55,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:26:57,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:26:58,812 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:27:00,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 13:27:00,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 13:27:00,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 13:27:02,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:27:02,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:27:04,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:04,815 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.59 vs. limit=15.0 2023-10-04 13:27:05,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:27:05,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:27:11,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:27:11,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:27:11,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 13:27:13,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 13:27:19,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:27:20,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 13:27:23,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 13:27:23,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 13:27:24,613 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.16 vs. limit=15.0 2023-10-04 13:27:27,436 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 13:27:27,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:27:27,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:27:27,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:27:30,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:31,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:32,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 13:27:35,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:27:35,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:27:36,976 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:27:38,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:27:39,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:27:41,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 13:27:41,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 13:27:45,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:27:45,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:27:45,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:27:45,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:27:46,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:27:50,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:27:50,692 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.91 vs. limit=15.0 2023-10-04 13:27:52,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:27:52,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:27:54,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:27:55,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:28:03,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:28:06,356 INFO [train.py:1046] (3/4) Epoch 48, batch 900, loss[loss=0.1424, simple_loss=0.2248, pruned_loss=0.02999, over 20607.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2346, pruned_loss=0.03675, over 4658778.11 frames. ], batch size: 45, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:28:06,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:28:06,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 13:28:06,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:28:06,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:28:09,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 13:28:10,516 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 2.013e+02 2.239e+02 2.502e+02 3.512e+02, threshold=4.478e+02, percent-clipped=0.0 2023-10-04 13:28:10,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1670466.6666666667, ans=0.125 2023-10-04 13:28:14,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:28:15,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1670466.6666666667, ans=0.2 2023-10-04 13:28:17,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:28:18,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 13:28:22,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:28:22,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 13:28:22,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 13:28:23,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:28:23,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:28:23,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:28:25,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:28:30,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1670533.3333333333, ans=0.125 2023-10-04 13:28:33,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1670533.3333333333, ans=0.125 2023-10-04 13:28:37,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:28:37,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:28:37,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:28:38,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1670600.0, ans=0.125 2023-10-04 13:28:40,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:28:43,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 13:28:45,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:28:47,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1670600.0, ans=0.0 2023-10-04 13:28:48,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:28:49,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:28:49,357 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 13:28:50,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 13:28:58,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:28:58,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:29:00,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:29:03,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1670666.6666666667, ans=0.125 2023-10-04 13:29:06,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:06,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:29:07,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 13:29:07,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:29:08,038 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.54 vs. limit=15.0 2023-10-04 13:29:10,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 13:29:11,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:29:11,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:13,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:29:13,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:29:17,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 13:29:17,194 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 13:29:18,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 13:29:20,562 INFO [train.py:1046] (3/4) Epoch 48, batch 950, loss[loss=0.1589, simple_loss=0.2337, pruned_loss=0.04199, over 24442.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2352, pruned_loss=0.0372, over 4665225.76 frames. ], batch size: 58, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:29:20,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 13:29:22,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:24,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 13:29:30,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:29:34,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:29:34,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:29:35,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 13:29:38,452 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 13:29:41,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:29:41,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:29:42,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:29:42,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:29:42,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 13:29:45,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:29:45,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:29:46,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 13:29:46,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:29:51,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:29:51,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:29:51,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:29:52,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 13:29:54,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 13:29:55,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:29:57,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:30:05,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:30:05,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:30:07,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 13:30:09,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 13:30:09,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:30:09,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:30:09,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:30:09,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:30:13,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 13:30:14,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:30:17,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:30:17,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:30:17,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 13:30:17,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:30:17,632 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:30:18,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 13:30:22,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:30:22,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=1671066.6666666667, ans=0.0 2023-10-04 13:30:25,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:30:25,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1671066.6666666667, ans=0.0 2023-10-04 13:30:30,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:30:30,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 13:30:32,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 13:30:32,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1671066.6666666667, ans=0.125 2023-10-04 13:30:34,972 INFO [train.py:1046] (3/4) Epoch 48, batch 1000, loss[loss=0.172, simple_loss=0.2403, pruned_loss=0.05186, over 23756.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2341, pruned_loss=0.037, over 4666847.57 frames. ], batch size: 164, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:30:35,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:30:37,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 13:30:39,097 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.794e+02 2.109e+02 2.410e+02 2.800e+02 4.729e+02, threshold=4.820e+02, percent-clipped=1.0 2023-10-04 13:30:39,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:30:43,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:30:45,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1671133.3333333333, ans=0.125 2023-10-04 13:30:46,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 13:30:46,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 13:30:50,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:30:50,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:30:52,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:30:56,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 13:30:59,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 13:30:59,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 13:31:01,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:31:03,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 13:31:06,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 13:31:06,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 13:31:07,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:31:07,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:14,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:31:14,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:31:15,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:15,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:31:15,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 13:31:15,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:31:19,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:31:19,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:31:20,438 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 13:31:23,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 13:31:24,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 13:31:27,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 13:31:27,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:31:33,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:33,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:31:34,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:35,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:31:37,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 13:31:39,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:31:39,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 13:31:39,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 13:31:41,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:31:41,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:31:43,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1671400.0, ans=0.125 2023-10-04 13:31:44,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:31:46,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:31:48,222 INFO [train.py:1046] (3/4) Epoch 48, batch 1050, loss[loss=0.1453, simple_loss=0.2275, pruned_loss=0.03154, over 24334.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2326, pruned_loss=0.03664, over 4678864.35 frames. ], batch size: 61, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:31:48,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:31:51,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:31:53,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:31:55,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:31:57,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:31:58,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:32:01,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:32:03,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:32:05,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:32:06,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:32:07,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:32:07,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:32:08,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 13:32:09,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:32:09,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 13:32:10,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:32:10,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 13:32:10,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:32:16,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:32:17,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:32:17,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:32:21,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 13:32:21,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 13:32:22,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:32:22,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1671600.0, ans=0.125 2023-10-04 13:32:23,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 13:32:27,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 13:32:29,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:32:32,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 13:32:34,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 13:32:34,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:32:34,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:32:37,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1671666.6666666667, ans=0.0 2023-10-04 13:32:38,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:32:40,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 13:32:41,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 13:32:41,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 13:32:41,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:32:42,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:32:43,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 13:32:45,214 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.13 vs. limit=15.0 2023-10-04 13:32:47,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:32:49,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:32:49,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:32:49,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:32:49,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:32:53,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:32:53,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 13:32:54,550 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.64 vs. limit=6.0 2023-10-04 13:32:55,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:32:55,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 13:32:55,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 13:32:56,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:32:56,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1671733.3333333333, ans=0.0 2023-10-04 13:32:58,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1671733.3333333333, ans=0.2 2023-10-04 13:33:00,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:33:02,720 INFO [train.py:1046] (3/4) Epoch 48, batch 1100, loss[loss=0.1404, simple_loss=0.2244, pruned_loss=0.02821, over 24305.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2326, pruned_loss=0.03629, over 4692957.69 frames. ], batch size: 61, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:33:05,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:33:07,951 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.798e+02 2.096e+02 2.413e+02 2.876e+02 5.398e+02, threshold=4.826e+02, percent-clipped=2.0 2023-10-04 13:33:10,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:33:12,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:33:13,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:33:14,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 13:33:16,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:33:19,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 13:33:21,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:33:21,895 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.73 vs. limit=6.0 2023-10-04 13:33:22,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=1671866.6666666667, ans=0.95 2023-10-04 13:33:24,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:33:24,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 13:33:25,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 13:33:27,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:33:27,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:33:28,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:33:32,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:33:35,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:33:38,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 13:33:39,451 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 13:33:39,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:33:42,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:33:43,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:33:43,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:33:44,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 13:33:46,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:33:46,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:33:46,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:33:47,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:33:47,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 13:33:55,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:33:55,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 13:33:56,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:33:57,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1672000.0, ans=0.0 2023-10-04 13:34:01,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:34:05,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 13:34:05,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 13:34:07,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:34:11,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:34:11,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:34:12,007 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.18 vs. limit=10.0 2023-10-04 13:34:12,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 13:34:12,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:34:12,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:34:14,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 13:34:14,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:34:14,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 13:34:16,810 INFO [train.py:1046] (3/4) Epoch 48, batch 1150, loss[loss=0.153, simple_loss=0.2308, pruned_loss=0.03766, over 24446.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2337, pruned_loss=0.0366, over 4700471.80 frames. ], batch size: 58, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:34:16,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:34:16,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:34:18,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:34:21,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:34:24,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:34:25,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:34:26,154 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.88 vs. limit=15.0 2023-10-04 13:34:26,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:34:26,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 13:34:26,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:34:28,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1672133.3333333333, ans=0.125 2023-10-04 13:34:29,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 13:34:31,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:34:31,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:34:37,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 13:34:39,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:34:40,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1672200.0, ans=0.125 2023-10-04 13:34:42,764 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.37 vs. limit=15.0 2023-10-04 13:34:43,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:34:43,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:34:44,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 13:34:44,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:34:44,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:34:48,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 13:34:49,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:34:51,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:35:01,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:35:06,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:35:06,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 13:35:08,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:08,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:15,784 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 13:35:17,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:20,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1672400.0, ans=0.125 2023-10-04 13:35:24,319 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 13:35:27,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:35:27,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:35:27,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:35:29,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:35:30,499 INFO [train.py:1046] (3/4) Epoch 48, batch 1200, loss[loss=0.1553, simple_loss=0.2347, pruned_loss=0.03789, over 23669.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2341, pruned_loss=0.03661, over 4710787.02 frames. ], batch size: 232, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:35:31,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:35:36,968 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.737e+02 1.971e+02 2.130e+02 2.381e+02 3.707e+02, threshold=4.260e+02, percent-clipped=0.0 2023-10-04 13:35:37,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:35:37,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:35:38,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:35:38,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:35:38,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:35:41,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:35:43,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:35:44,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:35:44,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:35:47,309 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 13:35:51,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 13:35:54,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:35:55,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:35:58,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:36:01,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:36:01,560 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 13:36:01,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:36:08,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 13:36:08,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:36:08,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 13:36:10,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:36:13,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 13:36:16,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 13:36:16,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:36:17,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:36:17,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:36:19,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:36:20,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:36:20,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:36:20,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:36:21,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 13:36:21,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:36:23,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:36:23,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:36:25,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:36:25,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:36:26,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1672666.6666666667, ans=0.125 2023-10-04 13:36:28,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:36:29,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1672733.3333333333, ans=0.0 2023-10-04 13:36:32,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:36:35,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 13:36:39,987 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 13:36:41,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:36:42,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:36:44,130 INFO [train.py:1046] (3/4) Epoch 48, batch 1250, loss[loss=0.1532, simple_loss=0.2408, pruned_loss=0.03282, over 24566.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2347, pruned_loss=0.03683, over 4718772.43 frames. ], batch size: 71, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:36:44,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:36:46,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:36:47,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 13:36:50,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:36:51,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:36:51,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 13:36:53,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:36:55,551 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.15 vs. limit=10.0 2023-10-04 13:36:56,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:36:59,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 13:37:00,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:37:01,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:37:01,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:37:02,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1672866.6666666667, ans=0.125 2023-10-04 13:37:03,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:37:06,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 13:37:07,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:37:07,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:37:09,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:37:10,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:13,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:37:15,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:37:19,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1672933.3333333333, ans=0.125 2023-10-04 13:37:19,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1672933.3333333333, ans=0.1 2023-10-04 13:37:20,118 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.74 vs. limit=10.0 2023-10-04 13:37:21,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 13:37:21,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:37:22,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:37:22,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1672933.3333333333, ans=0.0 2023-10-04 13:37:23,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 13:37:25,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:37:25,695 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 13:37:25,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:25,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:25,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1672933.3333333333, ans=0.125 2023-10-04 13:37:28,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:37:28,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1673000.0, ans=0.125 2023-10-04 13:37:28,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1673000.0, ans=0.125 2023-10-04 13:37:32,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:37:32,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:37:34,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 13:37:34,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 13:37:35,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 13:37:38,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:37:40,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 13:37:40,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:37:44,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 13:37:44,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:37:45,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 13:37:45,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 13:37:46,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.97 vs. limit=15.0 2023-10-04 13:37:47,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:37:47,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 13:37:47,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:37:50,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 13:37:52,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:37:53,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:37:55,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:37:56,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:37:57,698 INFO [train.py:1046] (3/4) Epoch 48, batch 1300, loss[loss=0.1303, simple_loss=0.2089, pruned_loss=0.02584, over 24429.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2347, pruned_loss=0.03683, over 4713721.63 frames. ], batch size: 58, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:38:00,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:38:01,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 13:38:03,133 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.065e+02 2.223e+02 2.420e+02 4.502e+02, threshold=4.446e+02, percent-clipped=1.0 2023-10-04 13:38:04,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:38:06,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 13:38:07,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:38:09,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1673133.3333333333, ans=0.1 2023-10-04 13:38:10,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:38:12,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:38:12,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 13:38:16,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:38:17,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:38:18,464 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.72 vs. limit=15.0 2023-10-04 13:38:19,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 13:38:21,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1673200.0, ans=0.125 2023-10-04 13:38:22,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:38:22,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1673200.0, ans=0.0 2023-10-04 13:38:25,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:38:26,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:38:28,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:38:28,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:38:29,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:38:29,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 13:38:31,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 13:38:32,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1673266.6666666667, ans=0.2 2023-10-04 13:38:35,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:38:35,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:38:35,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1673266.6666666667, ans=0.125 2023-10-04 13:38:36,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 13:38:37,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 13:38:38,609 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.37 vs. limit=15.0 2023-10-04 13:38:39,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:38:41,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:38:41,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 13:38:43,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:38:43,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 13:38:43,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1673333.3333333333, ans=0.125 2023-10-04 13:38:44,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:38:48,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1673333.3333333333, ans=0.0 2023-10-04 13:38:49,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:38:49,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:38:51,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1673333.3333333333, ans=0.125 2023-10-04 13:38:52,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 13:38:52,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 13:38:55,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 13:38:59,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:39:01,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 13:39:02,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:39:04,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=1673400.0, ans=15.0 2023-10-04 13:39:09,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 13:39:11,724 INFO [train.py:1046] (3/4) Epoch 48, batch 1350, loss[loss=0.1287, simple_loss=0.194, pruned_loss=0.03168, over 23407.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2332, pruned_loss=0.03678, over 4713625.98 frames. ], batch size: 285, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:39:11,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:39:14,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:39:19,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:39:19,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:39:20,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:39:20,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:39:25,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:39:25,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1673533.3333333333, ans=0.0 2023-10-04 13:39:25,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1673533.3333333333, ans=0.07 2023-10-04 13:39:26,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 13:39:28,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:39:29,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:39:30,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 13:39:32,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:39:33,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:39:34,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 13:39:36,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 13:39:37,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 13:39:39,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:39:39,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 13:39:51,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:39:54,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1673600.0, ans=0.1 2023-10-04 13:40:00,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:40:00,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:40:00,739 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 13:40:04,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:40:05,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 13:40:05,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 13:40:06,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:40:08,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:40:09,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 13:40:12,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:40:17,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 13:40:18,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 13:40:25,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 13:40:25,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:40:26,582 INFO [train.py:1046] (3/4) Epoch 48, batch 1400, loss[loss=0.1528, simple_loss=0.2414, pruned_loss=0.03206, over 24323.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2318, pruned_loss=0.03611, over 4705894.83 frames. ], batch size: 77, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:40:29,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:40:30,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:40:32,028 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.732e+02 2.089e+02 2.315e+02 2.656e+02 4.133e+02, threshold=4.629e+02, percent-clipped=0.0 2023-10-04 13:40:34,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 13:40:36,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 13:40:38,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1673800.0, ans=0.125 2023-10-04 13:40:45,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:40:47,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:40:49,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:40:49,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 13:40:53,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.97 vs. limit=15.0 2023-10-04 13:40:53,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:40:55,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 13:41:03,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:04,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:09,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 13:41:09,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:41:10,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:41:10,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:41:12,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:41:12,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=1674000.0, ans=0.125 2023-10-04 13:41:13,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:41:13,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:41:15,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:41:16,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 13:41:16,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:41:20,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:23,470 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.88 vs. limit=6.0 2023-10-04 13:41:24,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:41:25,077 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=15.0 2023-10-04 13:41:26,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1674066.6666666667, ans=0.1 2023-10-04 13:41:31,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 13:41:33,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 13:41:33,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:41:34,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 13:41:34,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:41:34,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1674066.6666666667, ans=0.0 2023-10-04 13:41:37,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:41:37,994 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.87 vs. limit=15.0 2023-10-04 13:41:39,733 INFO [train.py:1046] (3/4) Epoch 48, batch 1450, loss[loss=0.1421, simple_loss=0.219, pruned_loss=0.03262, over 23586.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2312, pruned_loss=0.03575, over 4707495.49 frames. ], batch size: 149, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:41:41,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:41:43,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:41:43,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:43,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 13:41:46,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:41:46,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:41:49,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:41:49,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 13:41:50,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:41:52,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 13:41:53,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:41:55,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:41:55,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 13:41:55,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:41:57,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:41:57,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 13:41:57,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:41:58,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:42:00,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:42:02,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:42:05,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:42:05,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:42:08,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:42:08,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:42:09,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:42:09,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:42:11,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:42:11,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:42:14,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 13:42:14,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1674266.6666666667, ans=0.07 2023-10-04 13:42:17,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:42:21,973 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 13:42:23,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:42:25,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:42:27,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:42:28,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 13:42:31,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1674333.3333333333, ans=0.125 2023-10-04 13:42:32,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:42:33,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 13:42:35,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 13:42:35,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:42:38,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:42:38,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:42:41,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 13:42:42,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 13:42:44,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 13:42:45,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:42:45,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:42:45,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1674400.0, ans=0.2 2023-10-04 13:42:53,274 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.87 vs. limit=15.0 2023-10-04 13:42:55,125 INFO [train.py:1046] (3/4) Epoch 48, batch 1500, loss[loss=0.1547, simple_loss=0.2472, pruned_loss=0.03109, over 24460.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.232, pruned_loss=0.03609, over 4706324.98 frames. ], batch size: 69, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:42:58,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 13:42:58,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:42:58,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:42:59,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:42:59,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:43:01,095 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.007e+02 2.219e+02 2.655e+02 4.541e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-04 13:43:01,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:43:01,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 13:43:02,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:43:02,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:43:02,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:43:04,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:43:05,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:43:05,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:43:07,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1674466.6666666667, ans=0.2 2023-10-04 13:43:11,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:43:11,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 13:43:13,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:43:13,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:43:14,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:43:16,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1674533.3333333333, ans=0.1 2023-10-04 13:43:17,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 13:43:20,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1674533.3333333333, ans=0.0 2023-10-04 13:43:21,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 13:43:23,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:43:25,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 13:43:26,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 13:43:28,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:43:28,709 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:43:29,171 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.39 vs. limit=15.0 2023-10-04 13:43:30,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:43:30,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 13:43:31,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:43:31,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:43:32,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 13:43:32,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:43:38,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:43:38,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 13:43:43,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 13:43:43,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1674666.6666666667, ans=0.125 2023-10-04 13:43:44,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:43:47,573 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 13:43:48,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:43:48,899 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 13:43:50,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:43:50,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:43:52,706 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 13:43:54,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:43:55,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 13:43:58,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:44:02,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:44:02,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:44:02,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:44:02,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:44:04,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:44:06,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 13:44:06,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 13:44:07,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:44:08,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 13:44:08,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 13:44:09,772 INFO [train.py:1046] (3/4) Epoch 48, batch 1550, loss[loss=0.1321, simple_loss=0.2105, pruned_loss=0.02684, over 24354.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2329, pruned_loss=0.03662, over 4703467.46 frames. ], batch size: 56, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:44:11,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:44:12,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:44:12,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:44:12,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:44:14,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:44:15,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:44:16,162 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-10-04 13:44:18,753 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 13:44:20,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:44:20,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:44:20,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 13:44:23,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:44:23,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 13:44:23,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1674866.6666666667, ans=0.125 2023-10-04 13:44:24,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:44:26,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 13:44:29,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 13:44:29,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 13:44:29,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:44:29,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:44:33,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:44:35,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 13:44:35,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 13:44:35,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1674866.6666666667, ans=0.125 2023-10-04 13:44:43,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:44:43,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1674933.3333333333, ans=0.125 2023-10-04 13:44:47,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:44:47,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 13:44:47,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:44:48,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 13:44:55,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 13:44:57,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:45:00,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:45:01,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:45:03,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:45:03,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 13:45:03,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:45:03,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1675000.0, ans=0.125 2023-10-04 13:45:04,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:45:06,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:45:06,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1675000.0, ans=10.0 2023-10-04 13:45:07,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 13:45:07,365 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 13:45:09,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:45:13,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 13:45:17,352 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.21 vs. limit=6.0 2023-10-04 13:45:18,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:45:20,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:45:20,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 13:45:22,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:45:22,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:45:22,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:45:22,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:45:23,605 INFO [train.py:1046] (3/4) Epoch 48, batch 1600, loss[loss=0.2101, simple_loss=0.2758, pruned_loss=0.07223, over 19339.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2337, pruned_loss=0.03701, over 4687847.97 frames. ], batch size: 388, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:45:23,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:45:25,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1675133.3333333333, ans=0.125 2023-10-04 13:45:27,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:45:27,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 13:45:27,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1675133.3333333333, ans=0.125 2023-10-04 13:45:28,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 13:45:30,110 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.052e+02 2.355e+02 2.599e+02 3.468e+02, threshold=4.711e+02, percent-clipped=0.0 2023-10-04 13:45:30,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 13:45:32,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:45:33,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1675133.3333333333, ans=0.2 2023-10-04 13:45:34,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 13:45:35,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:45:37,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:45:41,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:45:44,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1675200.0, ans=0.125 2023-10-04 13:45:46,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 13:45:48,234 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.09 vs. limit=15.0 2023-10-04 13:45:48,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:45:48,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 13:45:48,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:45:50,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 13:45:56,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 13:46:01,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1675266.6666666667, ans=0.1 2023-10-04 13:46:05,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:46:05,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 13:46:06,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:46:06,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:46:06,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:46:09,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 13:46:13,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 13:46:16,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:46:17,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:17,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:17,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:46:20,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:46:20,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:46:21,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:46:29,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.whiten.whitening_limit, batch_count=1675400.0, ans=12.0 2023-10-04 13:46:29,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:29,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:46:31,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 13:46:31,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:46:33,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 13:46:34,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1675400.0, ans=0.125 2023-10-04 13:46:36,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:46:36,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=1675466.6666666667, ans=0.1 2023-10-04 13:46:37,342 INFO [train.py:1046] (3/4) Epoch 48, batch 1650, loss[loss=0.1499, simple_loss=0.2341, pruned_loss=0.0329, over 24485.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2347, pruned_loss=0.0372, over 4697764.64 frames. ], batch size: 63, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:46:38,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:46:38,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:46:40,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 13:46:40,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 13:46:40,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 13:46:41,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 13:46:42,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=12.03 vs. limit=15.0 2023-10-04 13:46:44,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:46:44,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:46:46,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:46:46,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:46:47,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:46:47,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1675466.6666666667, ans=0.125 2023-10-04 13:46:49,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 13:46:51,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:46:51,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:46:51,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:46:51,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:46:51,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 13:46:51,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 13:46:52,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1675533.3333333333, ans=0.125 2023-10-04 13:46:58,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:46:59,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 13:47:01,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1675533.3333333333, ans=0.0 2023-10-04 13:47:04,924 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:47:08,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 13:47:08,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:11,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 13:47:16,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:47:19,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:47:19,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:47:20,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:47:23,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:47:23,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:24,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:47:24,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:26,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:47:26,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:47:26,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:47:27,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:47:32,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:47:32,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 13:47:35,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:47:35,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 13:47:36,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 13:47:36,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 13:47:36,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:47:38,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:47:38,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:47:39,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:47:39,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 13:47:41,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1675733.3333333333, ans=0.125 2023-10-04 13:47:43,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:47:45,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:47:45,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:47:48,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 13:47:51,548 INFO [train.py:1046] (3/4) Epoch 48, batch 1700, loss[loss=0.152, simple_loss=0.2404, pruned_loss=0.03181, over 24655.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2342, pruned_loss=0.03745, over 4679747.66 frames. ], batch size: 68, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:47:52,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:47:52,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:47:52,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 13:47:54,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:47:54,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:47:54,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:47:57,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:47:57,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:47:58,926 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.221e+02 2.607e+02 3.070e+02 5.494e+02, threshold=5.214e+02, percent-clipped=5.0 2023-10-04 13:47:59,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 13:48:02,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:48:09,047 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.60 vs. limit=22.5 2023-10-04 13:48:09,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:48:12,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:48:16,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:48:18,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:48:18,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:48:18,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:48:20,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1675933.3333333333, ans=0.125 2023-10-04 13:48:21,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 13:48:22,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:48:23,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:24,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:48:25,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 13:48:28,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 13:48:28,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 13:48:28,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1675933.3333333333, ans=0.1 2023-10-04 13:48:30,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:31,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 13:48:31,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:48:39,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:48:40,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:48:40,834 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:48:41,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:48:43,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 13:48:43,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 13:48:43,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:48:46,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:46,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 13:48:46,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:48:46,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:48:46,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:48:46,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:48:46,679 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.67 vs. limit=10.0 2023-10-04 13:48:48,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:48:48,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:48:48,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1676000.0, ans=0.125 2023-10-04 13:48:50,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:48:50,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:48:50,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:48:55,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:48:55,462 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 13:48:58,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:48:59,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:49:01,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 13:49:05,483 INFO [train.py:1046] (3/4) Epoch 48, batch 1750, loss[loss=0.1611, simple_loss=0.2511, pruned_loss=0.03558, over 24320.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2333, pruned_loss=0.0372, over 4687270.19 frames. ], batch size: 74, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:49:05,850 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:49:07,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:10,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:49:10,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 13:49:11,346 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.24 vs. limit=5.0 2023-10-04 13:49:11,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 13:49:11,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:49:13,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:49:13,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:17,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 13:49:19,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1676200.0, ans=0.1 2023-10-04 13:49:20,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:49:21,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 13:49:21,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:49:23,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:49:27,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 13:49:28,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 13:49:28,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=1676200.0, ans=10.0 2023-10-04 13:49:29,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:49:31,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 13:49:37,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1676266.6666666667, ans=0.0 2023-10-04 13:49:38,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:49:38,982 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.29 vs. limit=15.0 2023-10-04 13:49:40,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1676266.6666666667, ans=0.07 2023-10-04 13:49:41,864 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.63 vs. limit=15.0 2023-10-04 13:49:42,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:49:42,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:49:45,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:45,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:49:48,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:49:49,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:49:51,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:49:51,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:49:51,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 13:49:54,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:49:54,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1676333.3333333333, ans=0.125 2023-10-04 13:49:59,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 13:49:59,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:50:00,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:50:00,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:50:02,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1676333.3333333333, ans=0.2 2023-10-04 13:50:02,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1676333.3333333333, ans=0.1 2023-10-04 13:50:05,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:50:06,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 13:50:08,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:50:09,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:50:11,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1676400.0, ans=0.125 2023-10-04 13:50:12,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:50:14,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:50:15,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:50:16,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 13:50:16,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:50:18,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 13:50:18,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:20,041 INFO [train.py:1046] (3/4) Epoch 48, batch 1800, loss[loss=0.1502, simple_loss=0.2314, pruned_loss=0.03454, over 23372.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2323, pruned_loss=0.03688, over 4688934.63 frames. ], batch size: 105, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:50:20,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 13:50:20,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:50:20,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:50:22,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:50:24,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:50:25,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 13:50:27,377 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.032e+02 2.223e+02 2.665e+02 4.084e+02, threshold=4.447e+02, percent-clipped=0.0 2023-10-04 13:50:28,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:50:30,101 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.28 vs. limit=22.5 2023-10-04 13:50:30,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 13:50:32,401 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:50:36,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:50:36,852 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:50:37,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:37,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:41,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:50:42,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:50:42,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 13:50:42,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1676533.3333333333, ans=0.125 2023-10-04 13:50:44,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:50:47,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:50:47,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1676533.3333333333, ans=0.125 2023-10-04 13:50:51,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 13:50:53,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 13:50:53,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 13:50:54,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:50:55,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:50:55,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:50:57,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:51:03,222 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 13:51:04,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:51:06,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:51:08,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 13:51:08,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 13:51:08,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 13:51:10,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:51:10,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:51:15,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 13:51:19,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1676733.3333333333, ans=0.1 2023-10-04 13:51:20,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:51:21,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 13:51:23,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:51:23,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:51:23,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:51:23,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 13:51:26,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:51:28,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:51:29,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 13:51:29,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:51:33,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:51:33,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:51:33,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:51:34,532 INFO [train.py:1046] (3/4) Epoch 48, batch 1850, loss[loss=0.1605, simple_loss=0.2385, pruned_loss=0.04124, over 22693.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2332, pruned_loss=0.03678, over 4692938.25 frames. ], batch size: 322, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:51:34,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:51:34,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:51:37,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:51:37,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:51:37,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=1676800.0, ans=0.125 2023-10-04 13:51:38,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:51:40,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:51:45,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:51:45,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 13:51:49,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 13:51:51,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 13:51:52,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1676866.6666666667, ans=0.1 2023-10-04 13:51:55,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:51:56,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 13:51:56,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 13:52:07,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:52:08,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 13:52:09,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:52:11,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:52:14,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 13:52:16,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:16,136 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:52:17,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:52:18,019 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.34 vs. limit=15.0 2023-10-04 13:52:19,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 13:52:20,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:52:23,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:52:24,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:24,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 13:52:24,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:52:26,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:52:28,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:52:28,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1677000.0, ans=0.04949747468305833 2023-10-04 13:52:31,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 13:52:31,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=1677000.0, ans=15.0 2023-10-04 13:52:32,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:52:36,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:52:36,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 13:52:36,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 13:52:36,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 13:52:38,891 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 13:52:40,314 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 13:52:42,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:52:42,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:52:42,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:52:42,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:42,272 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 13:52:43,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:52:43,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:43,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:52:45,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:52:46,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:52:46,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 13:52:46,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1677066.6666666667, ans=0.125 2023-10-04 13:52:48,909 INFO [train.py:1046] (3/4) Epoch 48, batch 1900, loss[loss=0.1607, simple_loss=0.2331, pruned_loss=0.04412, over 22747.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2341, pruned_loss=0.03679, over 4709603.22 frames. ], batch size: 322, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:52:49,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:52:49,026 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 13:52:49,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 13:52:50,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:52:55,964 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.090e+02 2.355e+02 2.808e+02 4.439e+02, threshold=4.709e+02, percent-clipped=0.0 2023-10-04 13:52:56,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:52:59,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 13:52:59,261 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 13:52:59,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 13:53:01,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1677133.3333333333, ans=0.125 2023-10-04 13:53:02,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 13:53:02,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:53:02,661 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 13:53:03,989 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 13:53:07,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 13:53:08,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:53:13,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 13:53:16,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 13:53:22,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.80 vs. limit=22.5 2023-10-04 13:53:24,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 13:53:27,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 13:53:27,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:53:27,406 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 13:53:27,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 13:53:29,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 13:53:29,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 13:53:29,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:53:33,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 13:53:35,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:53:38,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:53:38,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 13:53:39,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:53:44,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 13:53:46,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:53:50,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 13:53:50,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:53:50,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:53:51,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:53:53,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 13:53:53,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 13:53:53,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:53:57,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:53:57,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:53:58,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:53:58,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:54:00,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 13:54:01,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:54:03,589 INFO [train.py:1046] (3/4) Epoch 48, batch 1950, loss[loss=0.1518, simple_loss=0.2352, pruned_loss=0.03416, over 24469.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2346, pruned_loss=0.03708, over 4702494.61 frames. ], batch size: 63, lr: 2.12e-03, grad_scale: 16.0 2023-10-04 13:54:05,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:54:06,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:54:08,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:08,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:54:09,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 13:54:11,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:54:13,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:14,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:15,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:54:17,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:54:17,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:17,985 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.73 vs. limit=12.0 2023-10-04 13:54:18,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:54:20,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:54:20,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:54:21,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 13:54:21,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:22,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:25,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:54:25,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:54:25,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 13:54:25,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 13:54:27,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:54:27,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:54:27,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:54:31,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:54:34,073 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:54:35,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:54:38,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 13:54:41,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:54:41,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:54:43,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 13:54:43,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:54:47,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:54:48,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:54:48,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:54:56,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:54:58,381 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:55:01,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:55:04,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:55:07,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:55:07,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:55:09,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 13:55:09,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 13:55:10,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:55:12,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 13:55:16,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:55:18,192 INFO [train.py:1046] (3/4) Epoch 48, batch 2000, loss[loss=0.1422, simple_loss=0.2343, pruned_loss=0.02504, over 24649.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2351, pruned_loss=0.03699, over 4714348.05 frames. ], batch size: 68, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:55:19,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:55:21,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:55:21,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1677800.0, ans=0.125 2023-10-04 13:55:22,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:55:22,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 13:55:22,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1677800.0, ans=0.0 2023-10-04 13:55:25,198 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.085e+02 2.271e+02 2.591e+02 3.651e+02, threshold=4.543e+02, percent-clipped=0.0 2023-10-04 13:55:25,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:55:26,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1677800.0, ans=0.125 2023-10-04 13:55:29,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 13:55:29,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 13:55:32,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:55:34,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 13:55:36,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 13:55:36,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:55:38,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:55:40,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 13:55:41,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:43,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:43,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:43,736 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.25 vs. limit=22.5 2023-10-04 13:55:44,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 13:55:44,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 13:55:46,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 13:55:48,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:55:49,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:55:51,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 13:55:51,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:55:51,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:55:53,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:55:53,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 13:55:56,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 13:55:56,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:55:56,555 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:00,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:00,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:56:00,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:56:02,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:56:02,740 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.19 vs. limit=22.5 2023-10-04 13:56:03,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:56:05,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:05,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:56:05,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:05,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1678000.0, ans=0.1 2023-10-04 13:56:06,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:09,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:56:10,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 13:56:16,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:56:16,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:20,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:20,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:56:24,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:26,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:56:26,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:27,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 13:56:27,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 13:56:28,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:30,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:31,604 INFO [train.py:1046] (3/4) Epoch 48, batch 2050, loss[loss=0.1625, simple_loss=0.246, pruned_loss=0.03954, over 23336.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2338, pruned_loss=0.03643, over 4708826.04 frames. ], batch size: 93, lr: 2.12e-03, grad_scale: 32.0 2023-10-04 13:56:33,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:56:34,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:40,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:56:42,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:56:43,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:56:43,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:56:45,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 13:56:45,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:56:47,619 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.97 vs. limit=15.0 2023-10-04 13:56:48,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:56:48,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:56:57,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:56:57,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:56:59,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 13:57:01,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:57:02,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 13:57:02,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:57:04,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1678266.6666666667, ans=0.125 2023-10-04 13:57:05,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:57:05,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1678266.6666666667, ans=0.125 2023-10-04 13:57:08,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:57:09,409 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.23 vs. limit=22.5 2023-10-04 13:57:10,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 13:57:10,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:57:11,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:57:13,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:57:13,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 13:57:18,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:57:20,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 13:57:20,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1678333.3333333333, ans=0.1 2023-10-04 13:57:21,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 13:57:23,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:57:26,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1678333.3333333333, ans=0.125 2023-10-04 13:57:27,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:57:33,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:57:34,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 13:57:37,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1678400.0, ans=0.1 2023-10-04 13:57:38,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:57:40,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:57:42,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 13:57:42,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-10-04 13:57:43,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 13:57:43,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1678466.6666666667, ans=0.2 2023-10-04 13:57:45,299 INFO [train.py:1046] (3/4) Epoch 48, batch 2100, loss[loss=0.1654, simple_loss=0.2503, pruned_loss=0.04021, over 23450.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2334, pruned_loss=0.03592, over 4730147.25 frames. ], batch size: 119, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:57:46,957 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 13:57:46,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:57:48,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:57:48,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=1678466.6666666667, ans=0.0 2023-10-04 13:57:49,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:57:49,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 13:57:49,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 13:57:51,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 13:57:52,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 13:57:55,617 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.079e+02 2.318e+02 2.598e+02 4.333e+02, threshold=4.637e+02, percent-clipped=0.0 2023-10-04 13:57:55,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 13:57:55,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:57:58,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:57:59,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:57:59,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 13:58:00,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1678533.3333333333, ans=0.125 2023-10-04 13:58:00,507 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.48 vs. limit=10.0 2023-10-04 13:58:01,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 13:58:01,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 13:58:01,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 13:58:02,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:02,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:58:02,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 13:58:03,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 13:58:08,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 13:58:08,297 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 13:58:11,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1678533.3333333333, ans=0.125 2023-10-04 13:58:12,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:58:12,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:58:12,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1678600.0, ans=0.1 2023-10-04 13:58:17,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:58:17,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 13:58:17,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:58:17,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 13:58:20,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 13:58:20,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:20,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 13:58:21,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 13:58:21,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 13:58:23,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:58:25,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1678600.0, ans=0.125 2023-10-04 13:58:26,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:58:27,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:58:29,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 13:58:30,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:32,162 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:58:32,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 13:58:32,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:32,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:58:33,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:33,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 13:58:36,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 13:58:36,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 13:58:39,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 13:58:39,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=1678666.6666666667, ans=0.2 2023-10-04 13:58:41,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 13:58:42,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 13:58:47,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:49,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 13:58:50,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:58:50,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1678733.3333333333, ans=0.0 2023-10-04 13:58:51,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:58:51,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 13:58:51,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:58:52,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:58:52,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 13:58:52,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 13:58:52,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:58:52,937 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 13:58:55,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 13:58:56,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 13:58:56,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:58:59,264 INFO [train.py:1046] (3/4) Epoch 48, batch 2150, loss[loss=0.1456, simple_loss=0.2254, pruned_loss=0.03293, over 23507.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2315, pruned_loss=0.03569, over 4706681.42 frames. ], batch size: 134, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 13:59:00,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:00,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 13:59:00,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 13:59:00,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 13:59:04,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 13:59:07,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:59:09,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:09,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1678800.0, ans=0.1 2023-10-04 13:59:10,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 13:59:10,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:12,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 13:59:16,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:16,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 13:59:16,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 13:59:16,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1678866.6666666667, ans=0.0 2023-10-04 13:59:19,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:20,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.19 vs. limit=15.0 2023-10-04 13:59:21,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 13:59:21,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1678866.6666666667, ans=0.125 2023-10-04 13:59:25,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:59:26,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 13:59:27,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:27,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:59:27,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:28,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 13:59:29,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:59:29,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 13:59:29,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 13:59:30,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 13:59:33,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 13:59:34,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:34,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:59:36,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 13:59:36,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 13:59:38,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 13:59:38,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 13:59:42,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 13:59:42,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 13:59:42,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 13:59:45,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:59:45,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:45,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1679000.0, ans=0.0 2023-10-04 13:59:46,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 13:59:47,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 13:59:47,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1679000.0, ans=0.0 2023-10-04 13:59:48,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:48,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 13:59:49,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 13:59:51,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 13:59:51,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 13:59:51,912 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 13:59:51,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:51,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 13:59:54,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 13:59:54,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 13:59:54,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 13:59:54,573 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 13:59:54,573 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 13:59:55,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 13:59:58,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 13:59:58,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 13:59:58,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 13:59:59,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:01,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:00:02,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:00:02,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:11,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:00:12,994 INFO [train.py:1046] (3/4) Epoch 48, batch 2200, loss[loss=0.1701, simple_loss=0.24, pruned_loss=0.05014, over 23702.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2322, pruned_loss=0.03572, over 4715109.22 frames. ], batch size: 232, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:00:13,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 14:00:16,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:00:18,360 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.15 vs. limit=15.0 2023-10-04 14:00:19,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1679133.3333333333, ans=0.0 2023-10-04 14:00:19,465 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:00:22,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:23,614 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.040e+02 2.222e+02 2.604e+02 4.042e+02, threshold=4.443e+02, percent-clipped=0.0 2023-10-04 14:00:23,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:00:24,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:00:25,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:00:27,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:00:27,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:00:27,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 14:00:33,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 14:00:36,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:00:41,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 14:00:42,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:44,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:00:45,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:00:47,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:00:49,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 14:00:52,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:00:52,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:00:52,180 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 14:00:57,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:00:58,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:01:01,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:01:02,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:04,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 14:01:04,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:01:05,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 14:01:06,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:06,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:01:06,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:09,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:01:09,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:01:09,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:01:10,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:01:12,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 14:01:12,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:01:12,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1679400.0, ans=0.125 2023-10-04 14:01:15,116 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:01:18,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 14:01:18,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:01:21,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:01:23,035 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 14:01:26,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:01:26,451 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 14:01:27,701 INFO [train.py:1046] (3/4) Epoch 48, batch 2250, loss[loss=0.1472, simple_loss=0.2282, pruned_loss=0.03313, over 23145.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.233, pruned_loss=0.03614, over 4712288.59 frames. ], batch size: 105, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:01:27,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:01:29,110 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 14:01:30,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:01:30,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:01:32,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:01:32,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1679466.6666666667, ans=0.0 2023-10-04 14:01:33,546 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 14:01:35,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1679466.6666666667, ans=0.125 2023-10-04 14:01:36,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:01:37,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:01:44,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:01:46,109 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:01:47,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1679533.3333333333, ans=0.125 2023-10-04 14:01:48,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:01:48,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:01:49,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1679533.3333333333, ans=0.0 2023-10-04 14:01:50,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:01:53,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 14:01:53,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:01:53,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:01:56,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 14:01:58,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:01:58,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:01:59,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:02:04,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:02:04,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:02:04,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 14:02:06,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 14:02:08,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:02:09,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:02:13,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:02:15,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:02:15,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1679666.6666666667, ans=0.125 2023-10-04 14:02:16,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:02:16,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:02:19,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:02:19,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:02:23,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1679666.6666666667, ans=0.0 2023-10-04 14:02:24,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:02:27,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:02:29,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1679733.3333333333, ans=0.125 2023-10-04 14:02:33,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:02:33,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:02:33,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:02:37,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:02:39,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:02:39,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 14:02:40,098 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.09 vs. limit=15.0 2023-10-04 14:02:40,417 INFO [train.py:1046] (3/4) Epoch 48, batch 2300, loss[loss=0.1528, simple_loss=0.2388, pruned_loss=0.03344, over 24445.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2336, pruned_loss=0.03615, over 4715370.61 frames. ], batch size: 66, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:02:40,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:02:40,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:02:43,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 14:02:44,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:02:44,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:02:50,762 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.746e+02 2.018e+02 2.211e+02 2.601e+02 3.731e+02, threshold=4.421e+02, percent-clipped=0.0 2023-10-04 14:02:50,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:02:50,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:02:51,734 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.30 vs. limit=22.5 2023-10-04 14:02:52,799 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 14:02:54,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:03:02,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:03:02,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:03:02,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:03:02,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:03:02,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 14:03:03,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:03:03,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1679866.6666666667, ans=0.1 2023-10-04 14:03:06,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:03:07,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:03:09,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:03:11,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:03:14,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:03:18,223 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.65 vs. limit=15.0 2023-10-04 14:03:19,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:03:20,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:03:23,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:03:29,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:03:32,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:03:33,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:03:34,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:03:34,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 14:03:37,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:03:39,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:03:39,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:03:39,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:03:39,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:03:40,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 14:03:40,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:03:41,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 14:03:41,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:03:41,978 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:03:42,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 14:03:46,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:03:51,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:03:53,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:03:53,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:03:53,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:03:55,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:03:55,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:03:55,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:03:57,266 INFO [train.py:1046] (3/4) Epoch 48, batch 2350, loss[loss=0.1917, simple_loss=0.2668, pruned_loss=0.05834, over 19785.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2343, pruned_loss=0.03652, over 4704417.90 frames. ], batch size: 388, lr: 2.12e-03, grad_scale: 8.0 2023-10-04 14:03:57,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 14:04:04,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:04:04,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 14:04:09,394 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:04:10,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 14:04:10,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1680200.0, ans=0.125 2023-10-04 14:04:11,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:04:15,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:04:15,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:04:15,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:04:15,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:04:16,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1680200.0, ans=0.0 2023-10-04 14:04:19,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 14:04:22,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:04:26,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 14:04:28,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:04:30,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1680266.6666666667, ans=0.0 2023-10-04 14:04:31,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:04:31,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:04:34,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:04:36,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 14:04:36,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:04:37,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:04:38,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:04:38,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:04:41,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:04:43,894 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.61 vs. limit=22.5 2023-10-04 14:04:44,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 14:04:44,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:04:45,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:04:45,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:04:47,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 14:04:48,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:04:49,690 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.75 vs. limit=15.0 2023-10-04 14:04:50,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 14:04:50,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:04:53,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1680333.3333333333, ans=0.125 2023-10-04 14:04:55,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 14:04:59,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 14:05:00,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:05:00,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:05:00,851 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 14:05:01,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 14:05:04,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 14:05:06,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:05:08,508 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:05:08,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1680400.0, ans=0.1 2023-10-04 14:05:11,133 INFO [train.py:1046] (3/4) Epoch 48, batch 2400, loss[loss=0.1553, simple_loss=0.2398, pruned_loss=0.03539, over 24348.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2339, pruned_loss=0.03638, over 4698994.54 frames. ], batch size: 77, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:05:12,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:05:13,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:05:15,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 14:05:15,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 14:05:22,854 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.080e+02 2.454e+02 2.862e+02 5.375e+02, threshold=4.908e+02, percent-clipped=3.0 2023-10-04 14:05:24,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:05:24,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:05:27,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 14:05:27,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:05:27,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:05:27,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 14:05:35,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:05:36,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 14:05:39,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:05:43,535 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 14:05:46,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:05:47,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:05:54,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:05:54,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 14:05:55,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:06:00,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1680666.6666666667, ans=0.0 2023-10-04 14:06:03,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:06:05,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:06:06,631 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.90 vs. limit=22.5 2023-10-04 14:06:09,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:10,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:06:10,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:06:10,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:06:10,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:06:10,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:06:10,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:06:13,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:06:13,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:06:14,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 14:06:14,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 14:06:17,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:06:17,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:06:18,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 14:06:18,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 14:06:20,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 14:06:20,255 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 14:06:20,348 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 14:06:22,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:06:24,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:06:24,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:06:25,330 INFO [train.py:1046] (3/4) Epoch 48, batch 2450, loss[loss=0.1406, simple_loss=0.2182, pruned_loss=0.03148, over 24428.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2331, pruned_loss=0.03603, over 4696171.31 frames. ], batch size: 58, lr: 2.11e-03, grad_scale: 4.0 2023-10-04 14:06:25,424 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 14:06:26,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:06:26,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:06:29,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:06:29,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:06:32,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:33,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:06:35,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 14:06:41,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:06:41,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:44,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:06:44,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:06:44,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:06:45,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 14:06:49,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.41 vs. limit=15.0 2023-10-04 14:06:49,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:06:51,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:06:53,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:06:56,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:06:56,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:06:58,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:06:58,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:06:59,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 14:07:00,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:07:01,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1680933.3333333333, ans=0.125 2023-10-04 14:07:06,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:07:08,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:07:08,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:07:08,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:07:10,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:07:11,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:07:12,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 14:07:15,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:07:15,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:07:18,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:07:18,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:07:23,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:07:23,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 14:07:24,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:07:24,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:07:24,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 14:07:26,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:07:26,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:07:30,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:07:32,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:07:32,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:07:36,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 14:07:38,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:07:39,858 INFO [train.py:1046] (3/4) Epoch 48, batch 2500, loss[loss=0.1492, simple_loss=0.2219, pruned_loss=0.03829, over 22858.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2333, pruned_loss=0.03597, over 4690429.03 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:07:44,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:07:51,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:07:52,952 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.028e+02 2.220e+02 2.568e+02 3.726e+02, threshold=4.440e+02, percent-clipped=0.0 2023-10-04 14:07:53,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:07:54,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:07:54,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 14:08:00,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:08:01,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:08:01,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 14:08:02,037 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:08:03,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:08:03,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 14:08:05,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:06,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:08:06,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 14:08:06,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:08,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 14:08:08,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:08:13,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:08:13,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:08:16,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:08:16,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 14:08:17,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:08:18,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:22,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:08:25,621 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:08:27,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:08:33,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:08:34,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 14:08:34,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:08:36,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:08:39,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:08:39,998 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:08:41,451 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 14:08:41,452 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 14:08:41,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 14:08:42,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:08:45,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 14:08:47,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 14:08:47,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:08:48,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 14:08:51,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 14:08:54,283 INFO [train.py:1046] (3/4) Epoch 48, batch 2550, loss[loss=0.1535, simple_loss=0.2351, pruned_loss=0.03592, over 24474.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2335, pruned_loss=0.03601, over 4696849.06 frames. ], batch size: 63, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:08:54,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:08:57,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:08:58,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:08:58,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1681466.6666666667, ans=0.125 2023-10-04 14:09:00,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:09:00,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 14:09:01,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:09:05,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 14:09:07,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:09:08,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:10,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:09:10,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 14:09:11,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:09:12,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:09:12,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:09:14,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:09:15,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 14:09:15,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:09:15,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:15,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 14:09:21,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1681533.3333333333, ans=0.1 2023-10-04 14:09:25,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:09:30,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:09:30,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:31,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:09:33,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:09:33,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.11 vs. limit=22.5 2023-10-04 14:09:40,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:09:42,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:09:42,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:09:42,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:09:43,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 14:09:44,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:09:45,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1681666.6666666667, ans=0.125 2023-10-04 14:09:48,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:09:48,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:53,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:09:53,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 14:09:53,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:09:53,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:09:55,478 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 14:09:57,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1681733.3333333333, ans=0.1 2023-10-04 14:09:58,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:10:00,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:05,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:10:07,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:08,549 INFO [train.py:1046] (3/4) Epoch 48, batch 2600, loss[loss=0.1708, simple_loss=0.2397, pruned_loss=0.05095, over 23787.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2343, pruned_loss=0.03633, over 4704068.08 frames. ], batch size: 164, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:10:08,737 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 14:10:12,873 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 14:10:12,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:10:12,934 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 14:10:13,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1681800.0, ans=0.1 2023-10-04 14:10:14,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 14:10:14,904 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 14:10:17,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:10:17,703 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 14:10:19,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 14:10:19,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1681800.0, ans=0.125 2023-10-04 14:10:20,565 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 14:10:21,866 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.037e+02 2.298e+02 2.610e+02 5.474e+02, threshold=4.596e+02, percent-clipped=1.0 2023-10-04 14:10:22,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:10:24,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 14:10:26,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 14:10:27,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:10:27,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 14:10:27,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1681866.6666666667, ans=0.0 2023-10-04 14:10:31,269 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 14:10:31,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 14:10:36,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:10:36,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:36,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:10:36,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 14:10:38,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:10:45,724 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 14:10:50,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:10:50,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:10:51,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 14:10:52,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:10:52,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:10:52,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1682000.0, ans=0.125 2023-10-04 14:10:53,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 14:10:54,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:10:54,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:10:58,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:11:00,440 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.57 vs. limit=15.0 2023-10-04 14:11:02,545 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 14:11:02,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:11:02,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:11:08,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:11:08,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:11:08,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 14:11:09,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:11:10,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:11:11,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:11:17,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 14:11:18,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:11:20,157 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:11:22,763 INFO [train.py:1046] (3/4) Epoch 48, batch 2650, loss[loss=0.1482, simple_loss=0.229, pruned_loss=0.03373, over 24507.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2347, pruned_loss=0.03657, over 4708346.32 frames. ], batch size: 63, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:11:23,366 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.97 vs. limit=6.0 2023-10-04 14:11:24,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 14:11:25,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:11:25,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:11:28,636 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 14:11:28,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:11:30,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:11:33,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:11:33,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:11:36,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:11:37,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 14:11:37,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:11:37,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:11:39,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 14:11:40,533 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 14:11:41,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:11:45,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 14:11:45,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:11:45,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 14:11:49,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:11:49,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:11:49,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:11:49,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:11:55,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 14:11:55,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 14:11:56,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:12:02,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 14:12:02,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:12:04,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:04,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:12:04,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:12:05,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:12:07,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:12:08,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:12:10,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:12:11,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:12:11,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:12:13,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:13,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:12:14,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:16,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:12:16,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:12:20,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:20,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:12:20,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:20,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 14:12:26,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:12:27,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:28,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:30,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:30,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:12:32,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:34,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:12:36,126 INFO [train.py:1046] (3/4) Epoch 48, batch 2700, loss[loss=0.1424, simple_loss=0.2169, pruned_loss=0.03402, over 23641.00 frames. ], tot_loss[loss=0.1545, simple_loss=0.2351, pruned_loss=0.037, over 4714716.79 frames. ], batch size: 256, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:12:36,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 14:12:39,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:12:41,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 14:12:43,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:12:43,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:44,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:12:46,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:12:46,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:12:47,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:12:47,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 14:12:47,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 14:12:48,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:12:49,270 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.735e+02 2.027e+02 2.284e+02 2.628e+02 4.660e+02, threshold=4.569e+02, percent-clipped=1.0 2023-10-04 14:12:49,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:12:49,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:12:51,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:12:53,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:12:55,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 14:12:55,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:12:59,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:12:59,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:13:00,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1682533.3333333333, ans=0.0 2023-10-04 14:13:06,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:13:06,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:13:07,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:13:07,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:13:09,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:13:12,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:13:12,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:13:12,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:13:12,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1682600.0, ans=0.1 2023-10-04 14:13:15,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:13:15,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:13:20,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1682666.6666666667, ans=0.0 2023-10-04 14:13:23,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1682666.6666666667, ans=0.0 2023-10-04 14:13:25,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:13:26,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:13:30,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:13:30,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:13:34,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:13:35,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:13:35,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:13:35,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1682733.3333333333, ans=0.125 2023-10-04 14:13:37,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:13:38,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:13:38,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:13:41,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:13:41,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:13:41,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:13:44,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.57 vs. limit=12.0 2023-10-04 14:13:45,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 14:13:45,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:13:48,965 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.04 vs. limit=15.0 2023-10-04 14:13:50,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:13:50,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 14:13:50,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 14:13:51,620 INFO [train.py:1046] (3/4) Epoch 48, batch 2750, loss[loss=0.1405, simple_loss=0.2034, pruned_loss=0.03882, over 23420.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2351, pruned_loss=0.03666, over 4718973.36 frames. ], batch size: 285, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:13:51,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:13:54,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:13:55,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:13:57,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:13:57,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:13:57,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:13:58,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=1682800.0, ans=0.2 2023-10-04 14:13:59,074 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.85 vs. limit=15.0 2023-10-04 14:14:01,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:14:03,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:14:03,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:14:03,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:03,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 14:14:03,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:14:03,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:14:08,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 14:14:09,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:14:10,732 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:10,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:14:12,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:14:13,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:14:13,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:14:13,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1682866.6666666667, ans=0.09899494936611666 2023-10-04 14:14:14,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:14:15,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:14:19,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:14:19,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:14:21,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:14:21,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:22,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:14:23,162 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.56 vs. limit=15.0 2023-10-04 14:14:28,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:14:30,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:14:30,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:14:36,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:14:36,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:14:36,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:14:36,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1683000.0, ans=0.125 2023-10-04 14:14:43,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:14:43,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:14:43,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 14:14:44,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1683000.0, ans=10.0 2023-10-04 14:14:46,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:14:48,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1683000.0, ans=0.0 2023-10-04 14:14:49,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 14:14:54,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 14:14:55,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:14:57,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 14:14:57,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:14:59,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:14:59,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 14:15:01,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:15:05,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 14:15:06,641 INFO [train.py:1046] (3/4) Epoch 48, batch 2800, loss[loss=0.1472, simple_loss=0.2342, pruned_loss=0.03008, over 24446.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2328, pruned_loss=0.03623, over 4701500.11 frames. ], batch size: 63, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:15:06,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:15:06,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:15:06,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1683133.3333333333, ans=0.125 2023-10-04 14:15:08,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 14:15:08,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:15:08,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:15:09,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:15:10,856 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 14:15:10,856 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 14:15:15,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:15:16,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:15:16,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:15:19,682 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.049e+02 2.249e+02 2.706e+02 5.185e+02, threshold=4.498e+02, percent-clipped=5.0 2023-10-04 14:15:20,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:15:21,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 14:15:24,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 14:15:24,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 14:15:27,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:15:27,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:15:27,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:15:30,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:15:30,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:15:30,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:15:32,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:15:41,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:15:42,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:15:45,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:15:47,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:15:47,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:15:48,597 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:15:50,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1683333.3333333333, ans=0.1 2023-10-04 14:15:51,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:15:52,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 14:15:53,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:15:53,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=1683333.3333333333, ans=22.5 2023-10-04 14:15:54,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:15:54,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:15:57,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:15:57,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:15:57,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1683333.3333333333, ans=0.1 2023-10-04 14:16:00,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:16:02,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:16:03,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:16:03,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:16:04,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:16:06,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:16:06,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:16:06,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 14:16:06,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:09,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:16:09,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:10,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 14:16:12,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:16:12,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:16:12,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:16:13,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 14:16:19,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:16:19,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:16:19,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:16:19,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1683466.6666666667, ans=0.125 2023-10-04 14:16:20,888 INFO [train.py:1046] (3/4) Epoch 48, batch 2850, loss[loss=0.1575, simple_loss=0.2502, pruned_loss=0.03242, over 24639.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2318, pruned_loss=0.03571, over 4703448.87 frames. ], batch size: 73, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:16:22,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:16:25,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:16:26,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:16:26,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:16:28,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:16:28,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:16:28,613 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:16:29,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:16:31,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 14:16:35,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 14:16:35,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:16:38,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 14:16:38,686 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:16:39,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:42,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 14:16:43,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 14:16:43,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:16:57,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:16:58,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:16:58,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:17:00,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:17:00,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:17:00,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:17:01,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:17:02,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 14:17:06,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:17:06,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:17:08,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:17:08,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:10,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:17:10,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:17:11,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1683666.6666666667, ans=0.1 2023-10-04 14:17:12,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:12,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1683666.6666666667, ans=0.0 2023-10-04 14:17:13,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:17:15,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:17:15,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:17,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:21,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:17:21,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1683733.3333333333, ans=0.2 2023-10-04 14:17:24,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:17:24,766 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.10 vs. limit=22.5 2023-10-04 14:17:26,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 14:17:26,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 14:17:27,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:17:28,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:17:28,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 14:17:28,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:17:30,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:17:30,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:17:30,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:17:30,303 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 14:17:31,498 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 14:17:31,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:17:32,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:34,202 INFO [train.py:1046] (3/4) Epoch 48, batch 2900, loss[loss=0.1531, simple_loss=0.2257, pruned_loss=0.0402, over 23436.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2323, pruned_loss=0.03575, over 4701985.84 frames. ], batch size: 285, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:17:35,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:17:35,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:17:37,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:17:37,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1683800.0, ans=0.125 2023-10-04 14:17:39,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 14:17:41,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:42,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 14:17:43,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 14:17:43,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:17:43,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:17:47,418 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.112e+02 2.348e+02 2.859e+02 4.205e+02, threshold=4.696e+02, percent-clipped=0.0 2023-10-04 14:17:47,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:17:47,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:17:48,224 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.75 vs. limit=15.0 2023-10-04 14:17:50,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:17:50,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1683866.6666666667, ans=0.125 2023-10-04 14:17:52,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:17:55,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:17:55,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 14:17:55,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:17:56,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:17:59,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 14:17:59,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 14:18:02,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:18:02,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 14:18:03,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:18:05,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:18:05,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 14:18:08,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:18:09,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:18:13,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:18:15,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1683933.3333333333, ans=0.125 2023-10-04 14:18:16,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:18:18,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 14:18:18,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 14:18:18,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:18:22,236 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.62 vs. limit=22.5 2023-10-04 14:18:23,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:18:24,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 14:18:26,075 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.50 vs. limit=22.5 2023-10-04 14:18:26,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:18:29,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:18:37,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:18:37,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:18:39,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 14:18:42,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:18:42,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 14:18:43,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:18:43,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:18:47,734 INFO [train.py:1046] (3/4) Epoch 48, batch 2950, loss[loss=0.1462, simple_loss=0.2289, pruned_loss=0.03179, over 24607.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.233, pruned_loss=0.03572, over 4714442.92 frames. ], batch size: 60, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:18:51,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:18:52,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 14:18:52,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:18:52,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:18:54,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:18:55,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:18:55,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 14:18:57,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 14:18:57,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:18:57,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:19:02,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1684200.0, ans=0.125 2023-10-04 14:19:04,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:19:05,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:19:08,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:19:09,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:19:12,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:19:12,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:19:15,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:19:16,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:19:16,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:19:19,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 14:19:23,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 14:19:24,270 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 14:19:24,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:19:25,828 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 14:19:27,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 14:19:27,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:19:27,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:19:27,788 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 14:19:27,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:19:30,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 14:19:31,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:19:31,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:19:34,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:19:37,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:19:37,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:19:37,315 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 14:19:37,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:19:37,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 14:19:42,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:19:43,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1684333.3333333333, ans=0.125 2023-10-04 14:19:43,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=1684333.3333333333, ans=0.125 2023-10-04 14:19:44,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:19:44,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 14:19:44,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:19:47,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 14:19:48,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1684400.0, ans=0.1 2023-10-04 14:19:51,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:19:51,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:19:51,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:19:54,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:19:54,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 14:19:56,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:19:56,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:19:56,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:19:58,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:19:58,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:19:59,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:20:01,169 INFO [train.py:1046] (3/4) Epoch 48, batch 3000, loss[loss=0.1499, simple_loss=0.2404, pruned_loss=0.02968, over 24521.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2338, pruned_loss=0.036, over 4712221.54 frames. ], batch size: 71, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:20:01,170 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 14:20:13,596 INFO [train.py:1078] (3/4) Epoch 48, validation: loss=0.3623, simple_loss=0.2785, pruned_loss=0.223, over 1125622.00 frames. 2023-10-04 14:20:13,596 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 14:20:13,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:20:13,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 14:20:13,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:20:16,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:20:17,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:20:18,583 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.28 vs. limit=15.0 2023-10-04 14:20:21,027 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 14:20:21,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 14:20:22,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:20:22,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:20:24,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 14:20:24,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:20:27,083 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.063e+02 2.332e+02 2.863e+02 4.745e+02, threshold=4.665e+02, percent-clipped=1.0 2023-10-04 14:20:30,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:20:40,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:20:47,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 14:20:49,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:20:50,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1684600.0, ans=0.125 2023-10-04 14:20:51,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:20:51,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:20:51,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:20:52,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:20:53,919 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 14:20:54,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 14:20:55,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:20:57,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:20:59,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:21:00,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:21:00,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:00,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:21:03,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1684666.6666666667, ans=0.2 2023-10-04 14:21:04,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:21:04,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:21:04,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:21:06,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:21:10,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 14:21:11,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:21:11,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:11,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:21:15,205 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=12.0 2023-10-04 14:21:15,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:15,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:17,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 14:21:17,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 14:21:18,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:21:18,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 14:21:18,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:21:21,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 14:21:24,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1684733.3333333333, ans=0.07 2023-10-04 14:21:25,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:21:25,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:21:25,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 14:21:27,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 14:21:27,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:21:27,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:21:29,212 INFO [train.py:1046] (3/4) Epoch 48, batch 3050, loss[loss=0.1355, simple_loss=0.2202, pruned_loss=0.02538, over 24316.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2352, pruned_loss=0.0364, over 4715843.61 frames. ], batch size: 61, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:21:30,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:21:30,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:21:30,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:30,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:21:32,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 14:21:33,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:21:35,478 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.82 vs. limit=15.0 2023-10-04 14:21:36,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:21:36,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:21:39,458 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:42,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 14:21:45,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 14:21:46,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 14:21:46,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:21:51,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:21:55,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:21:55,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:21:55,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:21:55,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=1684866.6666666667, ans=0.2 2023-10-04 14:22:00,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:22:00,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:22:01,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:22:01,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:22:01,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:22:02,352 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.88 vs. limit=6.0 2023-10-04 14:22:04,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:22:05,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:08,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:22:08,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 14:22:08,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:22:08,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:22:12,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:22:13,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:22:13,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:22:14,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:17,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:22:17,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:23,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:25,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:22:25,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:22:27,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:22:29,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:22:30,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:22:30,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 14:22:31,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:22:31,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:34,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 14:22:35,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:41,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:22:43,327 INFO [train.py:1046] (3/4) Epoch 48, batch 3100, loss[loss=0.1327, simple_loss=0.2081, pruned_loss=0.02868, over 24364.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2338, pruned_loss=0.03613, over 4707951.69 frames. ], batch size: 56, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:22:43,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:22:44,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:22:46,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 14:22:49,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 14:22:50,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 14:22:50,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:22:53,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:22:53,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:22:58,167 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.071e+02 2.302e+02 2.680e+02 4.838e+02, threshold=4.605e+02, percent-clipped=1.0 2023-10-04 14:22:58,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 14:23:02,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:23:06,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 14:23:11,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:23:11,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:12,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:23:12,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:23:14,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 14:23:17,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:23:17,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 14:23:17,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:23:18,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:23:18,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 14:23:20,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:23:21,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:23:23,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 14:23:23,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 14:23:24,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:25,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:23:28,747 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:23:28,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:28,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:23:31,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:23:31,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:23:33,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:23:33,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:23:33,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:33,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 14:23:38,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:23:39,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 14:23:41,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:23:42,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 14:23:44,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:23:44,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:44,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 14:23:56,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 14:23:56,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1685466.6666666667, ans=0.2 2023-10-04 14:23:57,943 INFO [train.py:1046] (3/4) Epoch 48, batch 3150, loss[loss=0.1657, simple_loss=0.2553, pruned_loss=0.03803, over 24617.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2325, pruned_loss=0.03605, over 4701519.98 frames. ], batch size: 68, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:23:59,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:23:59,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:23:59,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1685466.6666666667, ans=0.04949747468305833 2023-10-04 14:24:00,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:24:00,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:24:02,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 14:24:03,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:24:04,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 14:24:06,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 14:24:08,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:24:09,787 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 14:24:12,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 14:24:12,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:24:12,783 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 14:24:12,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1685533.3333333333, ans=0.07 2023-10-04 14:24:14,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 14:24:14,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 14:24:14,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 14:24:16,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 14:24:16,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:24:16,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:24:17,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:24:18,318 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.54 vs. limit=15.0 2023-10-04 14:24:18,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 14:24:20,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:24:20,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:24:20,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:24:23,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:24:29,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 14:24:29,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:24:32,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:24:33,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:24:33,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 14:24:36,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 14:24:36,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:24:37,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:24:37,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:24:39,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:24:39,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:24:39,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:24:40,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:24:40,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 14:24:42,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:24:42,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:24:42,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:24:42,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:24:43,494 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 14:24:45,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:24:45,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1685666.6666666667, ans=10.0 2023-10-04 14:24:46,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 14:24:46,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:24:48,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 14:24:49,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 14:24:49,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:24:50,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:24:50,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 14:24:52,303 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 14:24:52,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:24:56,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:24:57,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:24:57,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:24:57,818 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:25:02,991 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:25:03,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:25:04,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 14:25:09,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:25:09,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 14:25:10,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1685800.0, ans=0.125 2023-10-04 14:25:12,335 INFO [train.py:1046] (3/4) Epoch 48, batch 3200, loss[loss=0.1502, simple_loss=0.222, pruned_loss=0.03922, over 23750.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2314, pruned_loss=0.03609, over 4698915.96 frames. ], batch size: 164, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:25:12,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:25:13,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:25:13,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 14:25:16,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:25:21,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:25:21,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.94 vs. limit=10.0 2023-10-04 14:25:25,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:25:27,012 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.180e+02 2.542e+02 3.306e+02 4.972e+02, threshold=5.085e+02, percent-clipped=5.0 2023-10-04 14:25:30,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1685866.6666666667, ans=0.1 2023-10-04 14:25:33,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:25:41,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 14:25:43,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:25:46,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 14:25:47,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:25:52,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:25:52,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:25:54,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:25:57,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 14:25:58,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 14:26:00,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 14:26:00,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1686000.0, ans=0.0 2023-10-04 14:26:04,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 14:26:06,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:26:13,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1686066.6666666667, ans=0.2 2023-10-04 14:26:14,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:26:14,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:26:14,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:26:14,346 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 14:26:14,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:26:19,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:26:20,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 14:26:20,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 14:26:21,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 14:26:23,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 14:26:25,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:26:26,711 INFO [train.py:1046] (3/4) Epoch 48, batch 3250, loss[loss=0.1538, simple_loss=0.2332, pruned_loss=0.03722, over 23822.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2315, pruned_loss=0.03583, over 4707686.88 frames. ], batch size: 179, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:26:26,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:26:26,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1686133.3333333333, ans=0.125 2023-10-04 14:26:28,103 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 14:26:28,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:26:28,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:28,251 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 14:26:31,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1686133.3333333333, ans=0.2 2023-10-04 14:26:32,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:26:35,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:26:40,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:26:41,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 14:26:43,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:26:43,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:26:43,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:26:44,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:26:44,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:26:47,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:47,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:26:47,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:26:49,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:49,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:49,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:26:49,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1686200.0, ans=0.1 2023-10-04 14:26:52,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:26:53,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:26:56,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:26:56,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:26:57,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:26:59,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:26:59,860 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:27:04,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 14:27:05,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:27:05,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:27:06,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:27:06,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:27:12,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:27:18,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1686333.3333333333, ans=0.2 2023-10-04 14:27:21,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:27:21,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:27:21,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 14:27:21,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:27:21,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 14:27:22,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:27:24,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 14:27:24,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 14:27:26,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:27:27,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:27:27,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:27:27,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 14:27:29,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:27:30,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1686400.0, ans=0.2 2023-10-04 14:27:33,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:27:33,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:27:36,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 14:27:36,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:27:39,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:27:39,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 14:27:40,901 INFO [train.py:1046] (3/4) Epoch 48, batch 3300, loss[loss=0.1547, simple_loss=0.2415, pruned_loss=0.03393, over 24635.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2331, pruned_loss=0.03608, over 4720672.94 frames. ], batch size: 68, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:27:41,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.44 vs. limit=22.5 2023-10-04 14:27:43,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:27:43,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 14:27:45,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 14:27:46,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 14:27:47,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:27:50,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:27:51,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:27:51,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:27:52,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:27:53,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:27:56,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:27:57,694 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.042e+02 2.235e+02 2.474e+02 3.621e+02, threshold=4.470e+02, percent-clipped=0.0 2023-10-04 14:27:57,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:28:02,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 14:28:02,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:28:02,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:28:02,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1686533.3333333333, ans=0.125 2023-10-04 14:28:04,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:05,638 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 14:28:05,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:28:05,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:28:07,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:28:07,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:28:08,382 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 14:28:08,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1686533.3333333333, ans=0.125 2023-10-04 14:28:08,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1686533.3333333333, ans=0.0 2023-10-04 14:28:11,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:28:11,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:28:13,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:13,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 14:28:14,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 14:28:16,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:17,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:28:18,902 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 14:28:20,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 14:28:21,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:28:23,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 14:28:26,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:28:28,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 14:28:28,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:28:31,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:28:31,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:28:31,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:28:32,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:28:33,518 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.94 vs. limit=22.5 2023-10-04 14:28:34,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:28:34,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:36,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:28:37,538 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 14:28:37,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 14:28:39,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:28:40,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:28:40,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:28:41,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:28:41,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:28:43,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:28:45,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:28:45,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:28:46,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:28:47,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:28:50,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 14:28:50,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:28:51,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:28:53,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:28:53,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:28:54,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:28:56,577 INFO [train.py:1046] (3/4) Epoch 48, batch 3350, loss[loss=0.1566, simple_loss=0.2309, pruned_loss=0.04118, over 23826.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2339, pruned_loss=0.03603, over 4728210.13 frames. ], batch size: 195, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:28:56,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:28:56,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:28:59,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:29:00,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:02,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:29:05,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:06,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1686800.0, ans=0.0 2023-10-04 14:29:07,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:29:08,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:29:08,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:29:10,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 14:29:11,661 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 14:29:12,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:29:17,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 14:29:17,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 14:29:18,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:29:18,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:29:20,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:20,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 14:29:20,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:20,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:29:23,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:25,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:25,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:26,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:29:28,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:29:29,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=1686933.3333333333, ans=0.0 2023-10-04 14:29:32,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:32,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:29:37,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:29:38,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:29:40,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:40,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:41,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:43,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 14:29:43,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1687000.0, ans=0.125 2023-10-04 14:29:44,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:29:44,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 14:29:44,586 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:29:44,843 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:29:46,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 14:29:46,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:29:47,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:29:54,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:29:56,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 14:29:56,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:29:57,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:29:59,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:30:05,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:30:06,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 14:30:08,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:30:08,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:30:10,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:30:11,861 INFO [train.py:1046] (3/4) Epoch 48, batch 3400, loss[loss=0.1503, simple_loss=0.2159, pruned_loss=0.0424, over 23528.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2351, pruned_loss=0.03635, over 4732065.21 frames. ], batch size: 256, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:30:11,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 14:30:11,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:30:11,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 14:30:14,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:30:14,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:30:16,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:30:16,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:30:16,202 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 14:30:20,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1687133.3333333333, ans=0.125 2023-10-04 14:30:22,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 14:30:22,258 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 14:30:22,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:30:23,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1687133.3333333333, ans=0.1 2023-10-04 14:30:26,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:30:26,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:30:27,670 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.768e+02 2.082e+02 2.351e+02 2.856e+02 4.234e+02, threshold=4.702e+02, percent-clipped=0.0 2023-10-04 14:30:27,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:30:29,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:30:34,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:30:36,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 14:30:39,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:30:43,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:30:43,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:30:43,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 14:30:47,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:30:51,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 14:30:57,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:30:57,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:30:57,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 14:30:57,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:30:58,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:30:59,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1687333.3333333333, ans=0.125 2023-10-04 14:31:00,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:31:00,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:31:02,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:31:03,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1687333.3333333333, ans=0.125 2023-10-04 14:31:06,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:31:06,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:31:09,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1687400.0, ans=0.125 2023-10-04 14:31:11,882 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:31:13,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 14:31:18,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:31:23,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 14:31:25,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1687466.6666666667, ans=0.125 2023-10-04 14:31:26,013 INFO [train.py:1046] (3/4) Epoch 48, batch 3450, loss[loss=0.1593, simple_loss=0.2359, pruned_loss=0.04137, over 23904.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.235, pruned_loss=0.0363, over 4730454.35 frames. ], batch size: 195, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:31:27,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 14:31:27,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:31:28,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:31:28,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 14:31:30,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:31:33,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:31:38,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:31:40,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:31:40,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:31:40,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:31:43,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:31:49,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1687533.3333333333, ans=0.2 2023-10-04 14:31:50,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 14:31:56,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 14:31:56,597 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:31:56,643 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:31:57,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:32:04,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 14:32:04,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:32:08,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:32:08,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:32:09,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:32:11,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:32:13,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 14:32:13,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=1687666.6666666667, ans=0.125 2023-10-04 14:32:14,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:32:16,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:32:18,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:32:20,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 14:32:22,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:32:28,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:32:30,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:32:32,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:32:36,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:32:37,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:32:37,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:32:39,008 INFO [train.py:1046] (3/4) Epoch 48, batch 3500, loss[loss=0.1565, simple_loss=0.248, pruned_loss=0.03254, over 24311.00 frames. ], tot_loss[loss=0.153, simple_loss=0.234, pruned_loss=0.03603, over 4732013.97 frames. ], batch size: 74, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:32:39,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:32:43,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:32:46,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:32:46,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 14:32:49,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:32:52,316 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 14:32:53,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:32:53,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 14:32:54,997 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.040e+02 2.265e+02 2.652e+02 4.123e+02, threshold=4.530e+02, percent-clipped=0.0 2023-10-04 14:32:58,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:32:59,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:32:59,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:32:59,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:33:00,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:33:00,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.17 vs. limit=22.5 2023-10-04 14:33:01,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:02,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:33:02,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 14:33:05,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:05,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 14:33:07,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:33:07,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.13 vs. limit=15.0 2023-10-04 14:33:09,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1687933.3333333333, ans=0.2 2023-10-04 14:33:12,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:12,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 14:33:12,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:33:15,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:33:15,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:33:16,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:18,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:33:18,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:33:19,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 14:33:20,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 14:33:20,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 14:33:20,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:33:23,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:25,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:33:25,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:33:27,452 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.02 vs. limit=15.0 2023-10-04 14:33:28,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:33:30,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:33:33,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1688000.0, ans=0.0 2023-10-04 14:33:34,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:33:35,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1688000.0, ans=0.0 2023-10-04 14:33:36,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 14:33:36,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 14:33:36,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:33:36,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1688000.0, ans=0.1 2023-10-04 14:33:37,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:33:39,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:33:41,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:44,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 14:33:44,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:33:44,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1688066.6666666667, ans=0.125 2023-10-04 14:33:46,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:33:46,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 14:33:48,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.26 vs. limit=22.5 2023-10-04 14:33:49,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 14:33:52,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:33:53,601 INFO [train.py:1046] (3/4) Epoch 48, batch 3550, loss[loss=0.1415, simple_loss=0.2223, pruned_loss=0.03037, over 24571.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2319, pruned_loss=0.03583, over 4724623.04 frames. ], batch size: 60, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:33:53,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:33:53,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:33:53,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:33:56,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:33:57,527 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.35 vs. limit=5.0 2023-10-04 14:34:05,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:34:08,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 14:34:10,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:34:10,896 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:34:12,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:34:13,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:34:13,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:34:13,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:34:18,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:34:19,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:34:19,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:34:19,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:34:20,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:34:22,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1688266.6666666667, ans=0.125 2023-10-04 14:34:25,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:34:25,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:34:28,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:34:28,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:34:29,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:34:29,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 14:34:29,437 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:34:30,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:34:32,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 14:34:36,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:34:38,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:34:40,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:34:42,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 14:34:43,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:34:44,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 14:34:44,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:34:47,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:34:47,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:34:50,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 14:34:50,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1688333.3333333333, ans=0.2 2023-10-04 14:34:51,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:34:55,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:34:56,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 14:34:57,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:35:00,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:35:01,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 14:35:06,889 INFO [train.py:1046] (3/4) Epoch 48, batch 3600, loss[loss=0.1623, simple_loss=0.2533, pruned_loss=0.03563, over 24351.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2315, pruned_loss=0.03551, over 4724178.89 frames. ], batch size: 74, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:35:10,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 14:35:10,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:35:10,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:35:14,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:35:14,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:35:16,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:35:18,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:35:19,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:20,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:35:20,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.85 vs. limit=15.0 2023-10-04 14:35:21,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:35:23,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:23,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 14:35:25,592 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 2.134e+02 2.513e+02 3.119e+02 5.278e+02, threshold=5.026e+02, percent-clipped=3.0 2023-10-04 14:35:25,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:35:27,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:29,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:35:32,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:35:32,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:35:34,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:35:34,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 14:35:34,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:35:36,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:35:36,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:35:40,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:35:41,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:35:42,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:35:43,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 14:35:49,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:35:50,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:35:52,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 14:35:56,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:35:58,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1688666.6666666667, ans=0.125 2023-10-04 14:35:59,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:36:02,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:36:03,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1688666.6666666667, ans=0.125 2023-10-04 14:36:06,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:36:06,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:36:06,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 14:36:08,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 14:36:10,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 14:36:11,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:36:13,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:36:14,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 14:36:14,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:36:14,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:36:14,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:36:15,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 14:36:15,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 14:36:18,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:36:18,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 14:36:21,508 INFO [train.py:1046] (3/4) Epoch 48, batch 3650, loss[loss=0.1628, simple_loss=0.2383, pruned_loss=0.04367, over 23554.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2321, pruned_loss=0.03562, over 4723510.59 frames. ], batch size: 256, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:36:22,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 14:36:24,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:36:24,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1688800.0, ans=0.1 2023-10-04 14:36:28,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 14:36:29,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 14:36:34,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1688866.6666666667, ans=0.0 2023-10-04 14:36:35,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:36:35,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:36:35,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:36:41,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 14:36:41,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:36:42,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 14:36:43,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:36:44,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:36:44,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 14:36:46,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:36:46,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:36:46,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:36:49,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:36:50,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 14:36:52,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 14:36:53,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:36:55,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 14:36:57,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:36:57,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:37:03,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:37:04,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:37:05,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:37:06,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:37:06,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:37:09,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:37:12,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:37:13,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:14,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:37:16,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:37:17,506 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:37:17,550 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:37:21,793 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 14:37:25,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:37:25,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:37:27,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:37:27,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:37:28,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:37:29,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:31,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 14:37:31,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:37:34,170 INFO [train.py:1046] (3/4) Epoch 48, batch 3700, loss[loss=0.1622, simple_loss=0.2489, pruned_loss=0.03776, over 24100.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2332, pruned_loss=0.03602, over 4720615.29 frames. ], batch size: 86, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:37:34,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:37:35,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:37:36,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:37:39,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:39,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 14:37:39,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:37:41,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:37:42,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:37:46,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:37:51,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:37:51,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:37:52,598 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 2.073e+02 2.286e+02 2.591e+02 3.912e+02, threshold=4.572e+02, percent-clipped=0.0 2023-10-04 14:37:52,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:37:52,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:37:53,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:37:55,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:37:56,957 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 14:38:03,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:38:04,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:38:05,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:38:05,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 14:38:05,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:38:10,137 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.61 vs. limit=15.0 2023-10-04 14:38:10,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:12,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 14:38:12,497 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:38:13,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:13,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:38:17,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:17,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:38:19,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 14:38:24,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:38:24,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 14:38:24,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:38:24,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 14:38:29,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:38:31,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:38:33,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:38:33,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 14:38:35,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:38:35,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:38:36,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:38:36,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:38:39,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:38:39,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1689400.0, ans=0.2 2023-10-04 14:38:40,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 14:38:41,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 14:38:43,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:38:43,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:38:45,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:38:45,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:38:45,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1689400.0, ans=0.1 2023-10-04 14:38:46,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:38:48,520 INFO [train.py:1046] (3/4) Epoch 48, batch 3750, loss[loss=0.1423, simple_loss=0.225, pruned_loss=0.02975, over 24308.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2346, pruned_loss=0.03646, over 4719454.73 frames. ], batch size: 61, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:38:49,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:38:49,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:38:51,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 14:38:54,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 14:38:56,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 14:38:56,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 14:38:57,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:38:58,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:39:00,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:39:01,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:39:03,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1689533.3333333333, ans=0.04949747468305833 2023-10-04 14:39:04,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:39:07,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 14:39:08,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:39:10,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:39:12,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:39:12,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 14:39:12,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:39:16,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:39:16,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:39:19,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 14:39:22,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 14:39:23,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1689600.0, ans=0.0 2023-10-04 14:39:24,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:39:24,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:39:27,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:39:31,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:39:32,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 14:39:35,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 14:39:36,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:39:40,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:39:42,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:39:46,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:39:49,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 14:39:49,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1689733.3333333333, ans=0.1 2023-10-04 14:39:51,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:39:54,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:39:55,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:39:56,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:40:00,941 INFO [train.py:1046] (3/4) Epoch 48, batch 3800, loss[loss=0.1373, simple_loss=0.1995, pruned_loss=0.03759, over 22724.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2342, pruned_loss=0.03618, over 4728538.64 frames. ], batch size: 322, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:40:01,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1689800.0, ans=0.0 2023-10-04 14:40:01,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1689800.0, ans=10.0 2023-10-04 14:40:03,714 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:40:07,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:40:07,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 14:40:08,451 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.35 vs. limit=15.0 2023-10-04 14:40:09,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 14:40:10,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:40:12,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:40:12,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:40:13,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.54 vs. limit=15.0 2023-10-04 14:40:14,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 14:40:14,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:40:16,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:40:17,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:40:17,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:40:17,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:40:18,860 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.051e+02 2.338e+02 2.944e+02 4.276e+02, threshold=4.676e+02, percent-clipped=0.0 2023-10-04 14:40:20,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 14:40:24,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 14:40:25,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1689866.6666666667, ans=0.125 2023-10-04 14:40:26,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:40:27,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:40:29,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:40:30,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:40:30,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:40:30,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:40:32,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:40:33,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:40:38,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1689933.3333333333, ans=0.125 2023-10-04 14:40:39,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:40:39,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 14:40:40,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:40:47,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:40:53,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:40:55,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 14:40:58,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 14:40:58,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:41:00,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:41:01,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:03,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1690066.6666666667, ans=0.125 2023-10-04 14:41:04,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 14:41:07,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 14:41:08,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 14:41:08,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:10,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:41:14,183 INFO [train.py:1046] (3/4) Epoch 48, batch 3850, loss[loss=0.1486, simple_loss=0.2268, pruned_loss=0.03522, over 21407.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2332, pruned_loss=0.03581, over 4715332.43 frames. ], batch size: 47, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:41:14,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:41:14,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:41:19,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:41:19,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 14:41:21,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:41:21,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:22,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1690133.3333333333, ans=0.125 2023-10-04 14:41:22,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1690133.3333333333, ans=0.1 2023-10-04 14:41:25,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:41:27,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:41:29,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 14:41:29,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 14:41:35,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1690200.0, ans=0.0 2023-10-04 14:41:36,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:38,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:41:39,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:41:39,954 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:41:41,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:41:44,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:45,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:41:46,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:41:46,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:41:46,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:41:50,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:41:51,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:51,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:41:52,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 14:41:52,957 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 14:41:53,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:41:53,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:56,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:41:56,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:41:56,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 14:41:56,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1690266.6666666667, ans=0.1 2023-10-04 14:41:59,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 14:42:00,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:01,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 14:42:03,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 14:42:05,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.69 vs. limit=22.5 2023-10-04 14:42:08,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:08,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:42:13,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:15,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 14:42:18,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 14:42:21,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:21,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:24,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:42:24,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 14:42:24,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:24,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1690400.0, ans=0.025 2023-10-04 14:42:25,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:25,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:42:25,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 14:42:25,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1690400.0, ans=0.125 2023-10-04 14:42:26,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:42:28,217 INFO [train.py:1046] (3/4) Epoch 48, batch 3900, loss[loss=0.1442, simple_loss=0.2297, pruned_loss=0.02938, over 24659.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2323, pruned_loss=0.03565, over 4711334.82 frames. ], batch size: 65, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:42:28,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 14:42:30,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:30,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:31,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:42:31,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:33,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:42:34,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:42:34,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:42:35,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:42:35,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 14:42:35,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:39,667 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.73 vs. limit=12.0 2023-10-04 14:42:40,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:42:41,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:42:41,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:42:42,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:42:44,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:42:44,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:44,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1690533.3333333333, ans=0.125 2023-10-04 14:42:45,739 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.750e+02 2.072e+02 2.246e+02 2.580e+02 4.191e+02, threshold=4.491e+02, percent-clipped=0.0 2023-10-04 14:42:47,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:42:47,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 14:42:47,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:42:49,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 14:42:50,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:42:51,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 14:42:53,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 14:42:57,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:42:57,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:42:57,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:42:59,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:42:59,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1690600.0, ans=0.125 2023-10-04 14:43:03,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:43:03,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1690600.0, ans=0.09899494936611666 2023-10-04 14:43:04,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:43:06,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:43:06,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:43:07,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:43:12,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1690666.6666666667, ans=0.125 2023-10-04 14:43:13,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:43:13,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:43:20,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 14:43:22,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:43:22,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.20 vs. limit=10.0 2023-10-04 14:43:31,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:43:34,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:43:34,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 14:43:35,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 14:43:35,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 14:43:35,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 14:43:38,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:43:38,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 14:43:40,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1690800.0, ans=0.0 2023-10-04 14:43:41,089 INFO [train.py:1046] (3/4) Epoch 48, batch 3950, loss[loss=0.1354, simple_loss=0.2171, pruned_loss=0.02683, over 24336.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2325, pruned_loss=0.03562, over 4696999.16 frames. ], batch size: 61, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:43:45,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:43:45,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 14:43:47,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:43:49,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:43:50,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:43:52,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1690800.0, ans=0.125 2023-10-04 14:43:57,846 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 14:43:59,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:43:59,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 14:43:59,264 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 14:43:59,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:44:02,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:44:02,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:44:03,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:44:05,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 14:44:08,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:44:08,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1690866.6666666667, ans=0.0 2023-10-04 14:44:09,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:44:09,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:44:09,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:44:11,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 14:44:21,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:44:21,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:44:26,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 14:44:30,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 14:44:30,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 14:44:31,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:44:31,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:44:31,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1691000.0, ans=0.125 2023-10-04 14:44:39,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:44:40,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:44:40,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:44:40,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:44:40,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 14:44:45,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:44:46,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:44:50,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 14:44:55,220 INFO [train.py:1046] (3/4) Epoch 48, batch 4000, loss[loss=0.1394, simple_loss=0.2174, pruned_loss=0.0307, over 23437.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2332, pruned_loss=0.03596, over 4694815.06 frames. ], batch size: 134, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:44:59,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:45:05,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:45:09,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:45:10,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:45:10,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:45:11,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 14:45:11,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:45:11,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 14:45:11,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:45:12,667 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 2.159e+02 2.640e+02 3.092e+02 4.998e+02, threshold=5.279e+02, percent-clipped=1.0 2023-10-04 14:45:12,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 14:45:15,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:45:17,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:45:17,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:45:17,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:45:17,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:45:18,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 14:45:19,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:45:21,091 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 14:45:22,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:45:23,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:45:25,117 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 14:45:26,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:45:26,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:45:31,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 14:45:31,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:45:33,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:45:35,200 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 14:45:36,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:45:37,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 14:45:37,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:45:39,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:45:39,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:45:42,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:45:42,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1691333.3333333333, ans=0.125 2023-10-04 14:45:43,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:45:43,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:45:46,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 14:45:46,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:45:48,406 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 14:45:48,933 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.75 vs. limit=15.0 2023-10-04 14:45:52,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:45:55,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 14:45:58,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:45:59,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:46:00,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:46:02,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:46:05,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:46:06,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 14:46:06,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 14:46:08,363 INFO [train.py:1046] (3/4) Epoch 48, batch 4050, loss[loss=0.1767, simple_loss=0.2496, pruned_loss=0.05189, over 19251.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2341, pruned_loss=0.03606, over 4690901.19 frames. ], batch size: 388, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:46:08,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:46:09,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:46:09,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:46:11,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:46:12,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:46:16,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:46:20,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:46:20,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 14:46:22,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:46:24,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:46:27,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:46:28,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:46:30,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 14:46:31,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 14:46:32,737 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 14:46:34,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:46:39,473 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.25 vs. limit=15.0 2023-10-04 14:46:40,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 14:46:41,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:46:43,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:46:45,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:46:46,748 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:46:46,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:46:50,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:46:53,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 14:46:53,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:46:54,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:46:56,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 14:46:57,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1691666.6666666667, ans=0.125 2023-10-04 14:47:00,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:47:00,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1691666.6666666667, ans=0.05 2023-10-04 14:47:07,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 14:47:08,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:47:08,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:47:10,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 14:47:10,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 14:47:10,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:47:11,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:47:13,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:13,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:47:18,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 14:47:18,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 14:47:21,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 14:47:21,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 14:47:21,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:47:23,033 INFO [train.py:1046] (3/4) Epoch 48, batch 4100, loss[loss=0.2028, simple_loss=0.2746, pruned_loss=0.06543, over 19830.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2348, pruned_loss=0.03651, over 4699417.87 frames. ], batch size: 388, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:47:23,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:23,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:23,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:47:23,242 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 14:47:26,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:47:27,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:47:27,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:47:28,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:47:33,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:47:33,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:47:33,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:47:33,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 14:47:36,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:36,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:47:36,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:47:36,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:47:36,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 14:47:40,386 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.712e+02 2.089e+02 2.278e+02 2.511e+02 3.603e+02, threshold=4.556e+02, percent-clipped=0.0 2023-10-04 14:47:40,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:47:42,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 14:47:43,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:47:47,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:47:47,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 14:47:47,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:47:48,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:47:48,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:47:50,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=1691866.6666666667, ans=0.2 2023-10-04 14:47:51,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 14:47:52,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:47:54,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:47:56,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 14:47:56,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:47:58,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:48:00,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:48:05,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:48:10,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:48:11,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:48:17,045 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.05 vs. limit=12.0 2023-10-04 14:48:20,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:48:20,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:48:23,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:48:26,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:48:29,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1692066.6666666667, ans=0.125 2023-10-04 14:48:30,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:48:31,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:48:31,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:48:31,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:48:34,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 14:48:35,908 INFO [train.py:1046] (3/4) Epoch 48, batch 4150, loss[loss=0.1528, simple_loss=0.2361, pruned_loss=0.03477, over 24652.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2348, pruned_loss=0.03632, over 4716177.57 frames. ], batch size: 65, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:48:35,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:48:36,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 14:48:37,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 14:48:37,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 14:48:39,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:48:43,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:48:43,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:48:48,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:48:49,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:48:49,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:48:50,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 14:48:50,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:48:52,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 14:48:56,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:48:59,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1692200.0, ans=0.0 2023-10-04 14:49:00,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:49:01,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 14:49:03,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 14:49:03,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:49:03,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1692200.0, ans=0.0 2023-10-04 14:49:04,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 14:49:04,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:49:04,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:49:08,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:08,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1692266.6666666667, ans=0.125 2023-10-04 14:49:10,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:49:13,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 14:49:15,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:49:15,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1692266.6666666667, ans=0.125 2023-10-04 14:49:17,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:49:17,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 14:49:17,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:49:19,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 14:49:20,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1692333.3333333333, ans=0.2 2023-10-04 14:49:22,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:49:22,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:49:23,761 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.77 vs. limit=22.5 2023-10-04 14:49:24,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:25,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 14:49:25,551 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:49:25,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 14:49:28,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 14:49:31,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 14:49:31,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:31,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 14:49:31,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 14:49:31,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 14:49:31,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:49:32,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 14:49:32,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:49:33,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:49:33,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 14:49:34,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 14:49:41,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:49:44,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 14:49:45,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:49:48,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:49:50,112 INFO [train.py:1046] (3/4) Epoch 48, batch 4200, loss[loss=0.1459, simple_loss=0.2183, pruned_loss=0.03674, over 23557.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2335, pruned_loss=0.03587, over 4717315.86 frames. ], batch size: 134, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:49:50,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:49:50,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:49:50,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:49:50,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1692466.6666666667, ans=0.2 2023-10-04 14:49:53,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 14:49:55,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 14:49:56,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:49:58,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:50:01,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:50:03,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 14:50:06,818 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:50:06,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:50:06,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 14:50:06,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:50:08,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:50:08,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:50:08,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 14:50:09,687 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.019e+02 2.306e+02 2.735e+02 4.885e+02, threshold=4.613e+02, percent-clipped=1.0 2023-10-04 14:50:09,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:50:11,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 14:50:11,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:50:13,697 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.52 vs. limit=15.0 2023-10-04 14:50:16,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 14:50:17,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:50:20,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:50:21,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:50:22,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:50:22,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 14:50:22,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:50:24,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:50:28,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 14:50:31,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:50:39,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:50:41,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1692666.6666666667, ans=0.2 2023-10-04 14:50:42,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 14:50:46,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:50:52,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 14:50:53,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:50:54,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 14:50:56,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1692733.3333333333, ans=0.2 2023-10-04 14:51:01,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 14:51:02,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1692800.0, ans=0.1 2023-10-04 14:51:03,759 INFO [train.py:1046] (3/4) Epoch 48, batch 4250, loss[loss=0.159, simple_loss=0.2325, pruned_loss=0.0428, over 23846.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2319, pruned_loss=0.03552, over 4706158.58 frames. ], batch size: 195, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:51:03,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 14:51:03,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 14:51:06,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:10,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 14:51:11,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1692800.0, ans=0.125 2023-10-04 14:51:12,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 14:51:12,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:51:13,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:18,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:51:21,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:21,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:24,415 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:51:24,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:51:25,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:25,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:27,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:29,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:51:32,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:51:33,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 14:51:37,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 14:51:37,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:37,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:51:38,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:51:39,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:51:39,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:40,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:51:42,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 14:51:43,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 14:51:47,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:51:48,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:51:50,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 14:51:50,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:51:50,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1693000.0, ans=0.125 2023-10-04 14:51:51,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 14:51:52,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:51:53,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1693000.0, ans=0.0 2023-10-04 14:51:54,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:51:55,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:51:55,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:51:56,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1693000.0, ans=0.2 2023-10-04 14:51:59,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 14:52:01,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 14:52:02,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:52:05,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:52:07,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1693066.6666666667, ans=0.125 2023-10-04 14:52:09,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:52:10,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:52:12,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:52:13,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:52:15,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:52:16,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:52:16,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 14:52:17,685 INFO [train.py:1046] (3/4) Epoch 48, batch 4300, loss[loss=0.1544, simple_loss=0.2334, pruned_loss=0.03774, over 24676.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2321, pruned_loss=0.03556, over 4706806.12 frames. ], batch size: 65, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:52:17,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:52:20,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1693133.3333333333, ans=0.2 2023-10-04 14:52:21,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:52:22,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:52:23,407 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.54 vs. limit=22.5 2023-10-04 14:52:27,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:52:30,698 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 14:52:32,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1693200.0, ans=0.0 2023-10-04 14:52:33,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:52:33,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 14:52:35,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:52:35,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:52:35,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff3.min_abs, batch_count=1693200.0, ans=0.2 2023-10-04 14:52:36,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 14:52:36,718 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 14:52:37,909 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.751e+02 2.092e+02 2.342e+02 2.811e+02 4.039e+02, threshold=4.683e+02, percent-clipped=0.0 2023-10-04 14:52:40,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 14:52:42,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:52:44,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 14:52:44,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:52:46,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 14:52:47,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1693266.6666666667, ans=0.1 2023-10-04 14:52:49,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 14:52:49,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:52:52,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 14:52:52,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:52:53,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:52:53,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:52:56,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:52:56,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 14:52:58,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 14:52:59,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:53:03,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:03,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 14:53:03,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:03,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:53:03,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 14:53:03,146 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 14:53:03,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 14:53:03,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1693333.3333333333, ans=0.125 2023-10-04 14:53:04,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:53:05,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 14:53:05,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 14:53:09,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:53:10,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1693333.3333333333, ans=0.0 2023-10-04 14:53:11,415 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 14:53:11,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:53:12,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:12,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:53:14,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1693333.3333333333, ans=0.125 2023-10-04 14:53:16,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 14:53:16,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 14:53:16,240 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:17,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:53:17,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:53:17,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:53:20,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1693400.0, ans=0.125 2023-10-04 14:53:21,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:53:24,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:25,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:53:25,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:53:30,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 14:53:30,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 14:53:32,113 INFO [train.py:1046] (3/4) Epoch 48, batch 4350, loss[loss=0.161, simple_loss=0.2346, pruned_loss=0.04368, over 23771.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2327, pruned_loss=0.03564, over 4711782.97 frames. ], batch size: 195, lr: 2.11e-03, grad_scale: 8.0 2023-10-04 14:53:32,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.59 vs. limit=15.0 2023-10-04 14:53:34,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:53:37,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:40,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 14:53:40,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:53:41,333 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.95 vs. limit=15.0 2023-10-04 14:53:44,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 14:53:49,150 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:53:50,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:53:50,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:53:53,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 14:53:54,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:53:57,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:54:03,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 14:54:05,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:54:07,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:10,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:12,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 14:54:15,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:54:16,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:54:19,793 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 14:54:22,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:54:22,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 14:54:24,361 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 14:54:25,666 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 14:54:25,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:54:25,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:54:26,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:54:28,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:54:29,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:54:29,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:54:31,210 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 14:54:31,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:31,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:54:33,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:33,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 14:54:35,282 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 14:54:35,287 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 14:54:35,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 14:54:39,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:54:39,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:54:39,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:54:40,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:54:41,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 14:54:43,421 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 14:54:43,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:44,706 INFO [train.py:1046] (3/4) Epoch 48, batch 4400, loss[loss=0.1667, simple_loss=0.251, pruned_loss=0.04114, over 23319.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2337, pruned_loss=0.03585, over 4723141.16 frames. ], batch size: 105, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:54:46,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:54:46,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:47,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:54:47,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1693800.0, ans=0.2 2023-10-04 14:54:49,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1693800.0, ans=0.2 2023-10-04 14:54:50,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 14:54:50,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 14:54:51,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 14:54:51,753 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 14:54:53,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 14:54:53,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:54:56,302 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 14:54:58,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:54:59,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:54:59,101 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 14:55:03,067 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.088e+02 2.355e+02 2.772e+02 4.860e+02, threshold=4.710e+02, percent-clipped=1.0 2023-10-04 14:55:03,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:55:03,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 14:55:05,062 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 14:55:05,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1693866.6666666667, ans=0.125 2023-10-04 14:55:06,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1693866.6666666667, ans=0.125 2023-10-04 14:55:07,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1693866.6666666667, ans=0.2 2023-10-04 14:55:08,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 14:55:08,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 14:55:09,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 14:55:09,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:55:10,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:55:10,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1693866.6666666667, ans=0.0 2023-10-04 14:55:11,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:55:11,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:55:11,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1693866.6666666667, ans=0.125 2023-10-04 14:55:12,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 14:55:12,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 14:55:14,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:55:17,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 14:55:17,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:55:19,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:55:19,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:55:19,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 14:55:21,109 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 14:55:24,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1693933.3333333333, ans=0.025 2023-10-04 14:55:25,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:55:29,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1694000.0, ans=0.125 2023-10-04 14:55:31,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:55:33,076 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 14:55:36,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:55:39,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:55:41,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1694000.0, ans=0.125 2023-10-04 14:55:43,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 14:55:43,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 14:55:44,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 14:55:44,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:55:44,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 14:55:45,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:55:48,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 14:55:49,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1694066.6666666667, ans=0.125 2023-10-04 14:55:50,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 14:55:51,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 14:55:51,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:55:51,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 14:55:52,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 14:55:56,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:55:57,901 INFO [train.py:1046] (3/4) Epoch 48, batch 4450, loss[loss=0.1409, simple_loss=0.222, pruned_loss=0.02989, over 24339.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2347, pruned_loss=0.03625, over 4722264.10 frames. ], batch size: 56, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:55:58,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 14:56:02,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:56:04,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:05,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 14:56:12,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:56:12,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:56:14,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:16,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:56:17,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:56:17,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:56:19,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 14:56:19,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:56:19,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:20,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:56:20,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 14:56:23,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 14:56:27,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:28,753 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:30,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:56:30,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:56:32,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:56:37,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 14:56:38,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 14:56:38,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 14:56:38,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 14:56:40,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:56:41,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 14:56:47,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 14:56:50,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:50,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 14:56:51,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:56:51,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:56:51,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 14:56:51,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:56:54,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:56:56,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 14:56:56,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 14:56:58,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 14:57:01,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:57:01,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:57:02,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:57:02,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1694400.0, ans=0.05 2023-10-04 14:57:04,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 14:57:05,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 14:57:09,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 14:57:09,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:57:11,982 INFO [train.py:1046] (3/4) Epoch 48, batch 4500, loss[loss=0.151, simple_loss=0.2287, pruned_loss=0.03667, over 23783.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.235, pruned_loss=0.03707, over 4712679.93 frames. ], batch size: 135, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:57:14,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:57:15,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 14:57:15,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 14:57:15,994 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.63 vs. limit=22.5 2023-10-04 14:57:18,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:57:22,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:57:22,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:57:23,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 14:57:23,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:57:25,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:57:25,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:57:28,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1694533.3333333333, ans=0.0 2023-10-04 14:57:30,302 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.084e+02 2.253e+02 2.565e+02 3.939e+02, threshold=4.505e+02, percent-clipped=0.0 2023-10-04 14:57:38,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:57:38,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:57:41,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:57:41,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 14:57:42,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 14:57:49,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 14:57:52,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 14:57:57,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:57:59,776 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 14:57:59,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 14:58:01,127 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:01,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:58:03,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:58:03,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:58:03,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1694666.6666666667, ans=0.0 2023-10-04 14:58:05,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:58:06,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 14:58:06,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 14:58:06,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:08,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1694666.6666666667, ans=0.1 2023-10-04 14:58:08,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1694666.6666666667, ans=0.0 2023-10-04 14:58:09,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 14:58:10,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 14:58:12,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:16,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 14:58:16,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 14:58:18,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 14:58:19,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 14:58:19,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 14:58:21,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1694733.3333333333, ans=0.125 2023-10-04 14:58:23,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 14:58:25,458 INFO [train.py:1046] (3/4) Epoch 48, batch 4550, loss[loss=0.1348, simple_loss=0.1909, pruned_loss=0.03936, over 19348.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2336, pruned_loss=0.03672, over 4717258.14 frames. ], batch size: 389, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:58:26,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 14:58:28,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:58:31,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:58:31,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:58:34,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:58:36,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1694800.0, ans=0.125 2023-10-04 14:58:37,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:58:39,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 14:58:40,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:58:40,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:58:40,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:58:41,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1694866.6666666667, ans=0.1 2023-10-04 14:58:45,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:58:45,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 14:58:49,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:58:52,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 14:58:52,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 14:58:52,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 14:58:55,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 14:58:57,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 14:58:59,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:59:00,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 14:59:02,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 14:59:06,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:07,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:07,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 14:59:08,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 14:59:11,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:59:13,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:13,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 14:59:14,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:59:16,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 14:59:17,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 14:59:17,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 14:59:17,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 14:59:20,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 14:59:20,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 14:59:22,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:59:23,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:59:23,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:23,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 14:59:24,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 14:59:25,244 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.44 vs. limit=15.0 2023-10-04 14:59:26,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 14:59:27,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 14:59:27,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 14:59:27,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 14:59:27,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 14:59:27,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 14:59:30,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 14:59:30,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 14:59:31,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 14:59:32,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 14:59:33,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 14:59:35,828 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 14:59:37,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 14:59:38,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:59:40,353 INFO [train.py:1046] (3/4) Epoch 48, batch 4600, loss[loss=0.1552, simple_loss=0.2323, pruned_loss=0.03903, over 23379.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2319, pruned_loss=0.03625, over 4712020.37 frames. ], batch size: 93, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 14:59:40,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 14:59:42,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1695133.3333333333, ans=0.1 2023-10-04 14:59:43,193 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 14:59:43,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 14:59:44,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:59:45,833 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 14:59:45,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 14:59:50,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 14:59:51,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 14:59:52,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 14:59:53,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1695200.0, ans=0.125 2023-10-04 14:59:58,282 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.165e+02 2.492e+02 2.948e+02 4.714e+02, threshold=4.983e+02, percent-clipped=2.0 2023-10-04 14:59:58,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 14:59:59,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:02,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:06,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:00:06,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:00:10,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 15:00:10,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:00:13,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:00:16,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:16,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:00:19,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:00:19,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1695266.6666666667, ans=0.2 2023-10-04 15:00:21,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 15:00:24,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:00:24,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1695333.3333333333, ans=0.125 2023-10-04 15:00:28,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:28,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:00:31,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:31,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 15:00:31,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:33,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 15:00:33,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:33,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:00:33,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:00:35,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:00:36,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:00:37,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 15:00:37,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 15:00:37,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 15:00:37,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:00:39,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:00:39,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:00:40,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:00:43,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.94 vs. limit=22.5 2023-10-04 15:00:50,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.38 vs. limit=15.0 2023-10-04 15:00:51,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:00:52,550 INFO [train.py:1046] (3/4) Epoch 48, batch 4650, loss[loss=0.1538, simple_loss=0.2422, pruned_loss=0.03272, over 24470.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2318, pruned_loss=0.03624, over 4710941.94 frames. ], batch size: 69, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 15:00:54,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:00:54,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:00:55,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:00:55,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:00:55,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:00:56,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:01:00,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 15:01:02,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:01:05,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 15:01:05,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:01:07,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 15:01:07,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:01:07,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 15:01:08,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 15:01:08,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:08,755 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=4.86 vs. limit=15.0 2023-10-04 15:01:09,783 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:01:11,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:01:12,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:01:12,536 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 15:01:15,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:01:16,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 15:01:19,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:19,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:01:20,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 15:01:20,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:01:22,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1695600.0, ans=0.5 2023-10-04 15:01:23,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:01:26,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:01:32,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:35,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:01:37,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:01:37,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:01:42,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 15:01:42,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 15:01:42,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 15:01:42,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 15:01:43,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:01:50,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:01:50,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:01:50,422 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 15:01:51,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:01:52,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:01:53,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:01:54,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:01:57,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:01:57,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:01:57,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:02:00,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:02:00,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:02:00,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:02:00,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1695733.3333333333, ans=0.0 2023-10-04 15:02:01,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 15:02:01,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:02:03,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 15:02:03,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=1695733.3333333333, ans=0.2 2023-10-04 15:02:06,559 INFO [train.py:1046] (3/4) Epoch 48, batch 4700, loss[loss=0.1553, simple_loss=0.2335, pruned_loss=0.03861, over 23565.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2324, pruned_loss=0.03621, over 4717949.98 frames. ], batch size: 256, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 15:02:13,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:02:14,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:02:15,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:02:17,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:02:18,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:02:24,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 15:02:24,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 15:02:25,310 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.030e+02 2.257e+02 2.587e+02 3.872e+02, threshold=4.514e+02, percent-clipped=0.0 2023-10-04 15:02:26,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:02:28,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:02:28,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:02:31,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:02:37,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:02:38,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 15:02:40,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:02:43,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1695933.3333333333, ans=0.0 2023-10-04 15:02:45,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 15:02:46,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:02:49,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:02:51,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 15:02:54,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:02:55,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1696000.0, ans=0.125 2023-10-04 15:02:56,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1696000.0, ans=0.1 2023-10-04 15:02:59,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:03:00,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 15:03:02,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:02,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:03:04,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:03:05,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:03:05,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 15:03:07,091 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 15:03:08,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:03:09,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:09,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:09,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 15:03:11,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:03:15,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 15:03:18,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:03:20,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:03:21,226 INFO [train.py:1046] (3/4) Epoch 48, batch 4750, loss[loss=0.178, simple_loss=0.2599, pruned_loss=0.04805, over 24013.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2331, pruned_loss=0.03634, over 4695441.21 frames. ], batch size: 80, lr: 2.11e-03, grad_scale: 16.0 2023-10-04 15:03:25,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:03:25,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:03:26,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 15:03:26,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:03:31,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 15:03:32,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:03:33,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:03:33,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:03:38,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 15:03:41,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:03:43,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 15:03:44,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:03:49,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:03:49,358 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:03:49,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:03:49,452 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 15:03:49,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 15:03:56,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 15:03:59,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:04:02,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:04,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:04:04,884 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 15:04:04,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:04:06,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:04:08,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1696333.3333333333, ans=0.04949747468305833 2023-10-04 15:04:09,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:04:10,338 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.64 vs. limit=15.0 2023-10-04 15:04:10,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 15:04:12,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 15:04:13,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:04:13,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:04:13,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:04:15,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 15:04:15,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 15:04:18,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 15:04:20,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:04:24,199 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:04:24,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 15:04:25,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:04:27,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:04:28,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:04:30,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:31,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:04:33,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:04:33,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 15:04:34,294 INFO [train.py:1046] (3/4) Epoch 48, batch 4800, loss[loss=0.1557, simple_loss=0.2313, pruned_loss=0.04006, over 23610.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2341, pruned_loss=0.03653, over 4710070.39 frames. ], batch size: 120, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:04:34,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 15:04:37,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 15:04:38,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:04:38,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:04:40,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 15:04:44,584 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:44,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:04:49,283 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:04:50,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:04:51,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:04:51,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 15:04:51,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:04:53,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:04:55,216 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.147e+02 2.492e+02 2.796e+02 5.306e+02, threshold=4.985e+02, percent-clipped=1.0 2023-10-04 15:04:55,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:04:59,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:00,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1696533.3333333333, ans=0.125 2023-10-04 15:05:01,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:01,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:05:04,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:04,214 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 15:05:04,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:05:04,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:05:06,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:08,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:05:10,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:05:10,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:05:12,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 15:05:12,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:15,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 15:05:15,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 15:05:15,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1696600.0, ans=0.125 2023-10-04 15:05:15,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1696600.0, ans=0.125 2023-10-04 15:05:17,453 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:17,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:05:17,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:05:17,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:05:17,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:05:20,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:05:21,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:05:23,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:05:26,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:28,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:05:32,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 15:05:33,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:05:33,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:33,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:05:35,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:05:38,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:05:40,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:05:40,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:40,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:05:42,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:05:42,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:05:46,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:05:46,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:46,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:05:48,163 INFO [train.py:1046] (3/4) Epoch 48, batch 4850, loss[loss=0.1253, simple_loss=0.2014, pruned_loss=0.02463, over 24303.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2349, pruned_loss=0.03687, over 4705847.67 frames. ], batch size: 56, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:05:48,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 15:05:48,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1696800.0, ans=0.125 2023-10-04 15:05:51,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 15:05:51,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:51,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:05:51,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:05:51,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:05:55,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:06:02,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 15:06:03,581 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:06:09,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:06:09,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:06:10,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:06:11,355 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.11 vs. limit=15.0 2023-10-04 15:06:13,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:06:14,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:06:15,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:06:15,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 15:06:19,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:06:21,449 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:06:21,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1696933.3333333333, ans=0.0 2023-10-04 15:06:22,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:06:22,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:06:22,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 15:06:25,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:06:25,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:27,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1696933.3333333333, ans=0.125 2023-10-04 15:06:27,851 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.15 vs. limit=22.5 2023-10-04 15:06:30,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:30,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 15:06:30,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 15:06:30,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1696933.3333333333, ans=0.125 2023-10-04 15:06:32,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:06:37,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:06:37,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 15:06:39,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:06:39,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:06:41,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:06:41,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 15:06:41,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:06:41,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=1697000.0, ans=0.5 2023-10-04 15:06:42,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 15:06:43,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:06:45,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:06:45,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 15:06:54,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:07:00,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:07:00,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:07:03,391 INFO [train.py:1046] (3/4) Epoch 48, batch 4900, loss[loss=0.1387, simple_loss=0.2066, pruned_loss=0.03536, over 23678.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2335, pruned_loss=0.03674, over 4695224.76 frames. ], batch size: 256, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:07:06,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 15:07:06,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:07:11,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:07:12,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:07:12,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:07:14,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1697133.3333333333, ans=0.2 2023-10-04 15:07:15,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 15:07:15,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1697133.3333333333, ans=0.125 2023-10-04 15:07:17,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1697200.0, ans=0.0 2023-10-04 15:07:20,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 15:07:24,148 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.053e+02 2.279e+02 2.584e+02 3.777e+02, threshold=4.559e+02, percent-clipped=0.0 2023-10-04 15:07:24,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 15:07:24,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 15:07:25,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:07:25,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:07:25,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:07:25,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:07:25,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:07:27,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 15:07:28,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 15:07:30,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:07:30,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:07:30,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1697200.0, ans=0.125 2023-10-04 15:07:32,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:07:35,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:07:35,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:07:36,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:07:36,636 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 15:07:37,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:07:39,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:07:39,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 15:07:39,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 15:07:45,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 15:07:48,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:07:48,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:07:48,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1697333.3333333333, ans=0.125 2023-10-04 15:07:49,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:07:49,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:07:50,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 15:07:50,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:07:51,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 15:07:53,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:07:56,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:07:57,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:08:00,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 15:08:01,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:08:01,945 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 15:08:03,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 15:08:08,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:08:09,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:08:10,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 15:08:10,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 15:08:10,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:08:12,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:08:18,229 INFO [train.py:1046] (3/4) Epoch 48, batch 4950, loss[loss=0.1869, simple_loss=0.2625, pruned_loss=0.05562, over 23322.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2327, pruned_loss=0.03616, over 4706575.70 frames. ], batch size: 93, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:08:18,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:08:18,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:08:18,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:08:18,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 15:08:21,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:08:24,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:08:24,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 15:08:27,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 15:08:27,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 15:08:27,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:08:28,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 15:08:28,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:28,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:08:28,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:08:30,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:08:31,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:08:31,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:08:31,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1697533.3333333333, ans=0.125 2023-10-04 15:08:35,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:08:36,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:08:38,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:39,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:08:42,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:08:46,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:46,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:08:48,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:08:49,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:08:49,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:08:51,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 15:08:53,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 15:08:55,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:08:57,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:08:57,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:08:57,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:08:57,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:08:57,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1697600.0, ans=0.0 2023-10-04 15:08:58,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:09:01,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:09:05,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:09:06,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:09:08,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:09:08,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:09:09,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 15:09:10,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:09:11,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:09:16,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:09:17,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:09:17,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:09:17,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:09:17,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1697733.3333333333, ans=0.07 2023-10-04 15:09:18,361 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.34 vs. limit=12.0 2023-10-04 15:09:18,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:09:20,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:09:21,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:09:22,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:09:22,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:09:23,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 15:09:27,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:09:32,408 INFO [train.py:1046] (3/4) Epoch 48, batch 5000, loss[loss=0.1453, simple_loss=0.2194, pruned_loss=0.03555, over 23747.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2317, pruned_loss=0.03562, over 4711889.22 frames. ], batch size: 212, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:09:32,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 15:09:32,476 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 15:09:38,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:09:38,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:09:39,979 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 15:09:41,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 15:09:45,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:09:46,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 15:09:46,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:09:46,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:09:46,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1697866.6666666667, ans=0.125 2023-10-04 15:09:47,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 15:09:49,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:09:49,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:09:51,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 15:09:51,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:09:51,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:09:52,449 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.112e+02 2.381e+02 2.670e+02 4.654e+02, threshold=4.763e+02, percent-clipped=1.0 2023-10-04 15:09:52,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 15:09:52,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 15:09:52,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:09:54,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 15:09:54,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:09:54,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:09:55,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:09:55,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 15:09:55,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 15:09:58,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 15:09:58,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:09:59,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:09:59,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 15:09:59,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:10:01,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:10:02,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:10:04,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 15:10:04,818 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=3.41 vs. limit=15.0 2023-10-04 15:10:05,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 15:10:05,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:10:07,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:10:10,218 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 15:10:11,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1697933.3333333333, ans=0.0 2023-10-04 15:10:12,984 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:10:15,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:10:15,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:17,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 15:10:17,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:10:18,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:10:19,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:10:21,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 15:10:23,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:10:24,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:10:24,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff3.min_abs, batch_count=1698000.0, ans=0.2 2023-10-04 15:10:26,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:10:30,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 15:10:34,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:41,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:10:43,141 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:43,154 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:10:44,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:10:44,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:10:44,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:10:45,700 INFO [train.py:1046] (3/4) Epoch 48, batch 5050, loss[loss=0.1643, simple_loss=0.2382, pruned_loss=0.04519, over 23721.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2323, pruned_loss=0.03575, over 4714313.00 frames. ], batch size: 212, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:10:45,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:49,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:10:49,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 15:10:50,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:10:50,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1698133.3333333333, ans=0.025 2023-10-04 15:10:52,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:10:53,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:10:53,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 15:10:55,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:10:55,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:10:56,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=1698133.3333333333, ans=0.07 2023-10-04 15:10:57,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:10:59,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:10:59,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:11:09,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 15:11:10,792 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:11:10,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:11:12,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 15:11:12,258 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:11:13,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:11:13,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:11:15,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:11:15,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 15:11:16,583 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 15:11:16,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:11:19,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:11:22,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:11:22,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 15:11:24,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:11:27,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 15:11:28,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:11:28,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:11:30,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:11:30,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:11:33,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:11:36,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:11:36,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:36,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:11:36,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:11:36,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 15:11:38,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:11:38,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1698333.3333333333, ans=0.0 2023-10-04 15:11:39,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:11:43,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:11:43,843 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 15:11:43,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:11:45,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:11:46,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:46,668 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 15:11:49,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:11:49,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 15:11:49,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:54,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:11:54,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:11:54,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 15:11:55,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 15:11:57,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:11:57,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:11:58,167 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=23.25 vs. limit=15.0 2023-10-04 15:11:59,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:12:00,422 INFO [train.py:1046] (3/4) Epoch 48, batch 5100, loss[loss=0.1441, simple_loss=0.2194, pruned_loss=0.03434, over 21668.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2328, pruned_loss=0.03578, over 4702069.44 frames. ], batch size: 47, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:12:02,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1698466.6666666667, ans=0.025 2023-10-04 15:12:03,114 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 15:12:04,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:12:07,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 15:12:09,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 15:12:09,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:12:11,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:12:12,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:12:14,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 15:12:14,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 15:12:18,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:12:19,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:12:20,881 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.048e+02 2.253e+02 2.620e+02 3.978e+02, threshold=4.506e+02, percent-clipped=0.0 2023-10-04 15:12:22,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:12:25,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1698533.3333333333, ans=0.0 2023-10-04 15:12:26,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 15:12:28,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:12:31,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:12:31,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 15:12:34,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:35,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:35,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 15:12:37,176 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 15:12:37,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:38,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 15:12:38,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 15:12:41,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:12:42,283 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.67 vs. limit=10.0 2023-10-04 15:12:50,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:12:52,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 15:12:52,071 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 15:12:53,300 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 15:12:54,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 15:12:54,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:12:55,619 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.98 vs. limit=15.0 2023-10-04 15:12:57,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 15:12:59,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=1698733.3333333333, ans=0.125 2023-10-04 15:13:02,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 15:13:03,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 15:13:05,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:13:05,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1698733.3333333333, ans=0.1 2023-10-04 15:13:07,989 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 15:13:09,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:13:10,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 15:13:14,217 INFO [train.py:1046] (3/4) Epoch 48, batch 5150, loss[loss=0.1408, simple_loss=0.2213, pruned_loss=0.03011, over 24455.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2334, pruned_loss=0.036, over 4711134.46 frames. ], batch size: 58, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:13:16,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:13:16,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:13:16,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:13:17,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:13:17,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:13:18,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:13:18,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 15:13:18,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 15:13:20,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 15:13:20,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:13:20,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 15:13:21,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:13:21,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 15:13:24,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:13:25,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:13:28,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 15:13:28,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 15:13:31,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:13:32,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:13:34,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:13:34,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:13:34,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:13:36,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:13:36,201 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:13:36,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 15:13:37,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:13:37,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:13:39,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:13:41,823 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 15:13:41,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:13:42,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1698933.3333333333, ans=0.125 2023-10-04 15:13:48,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:13:49,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 15:13:53,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:13:59,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:14:01,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:14:04,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:14:05,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:14:07,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 15:14:09,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:14:10,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:14:11,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:14:14,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:14:14,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:14:15,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 15:14:16,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1699066.6666666667, ans=0.125 2023-10-04 15:14:19,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.17 vs. limit=22.5 2023-10-04 15:14:21,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:14:23,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:14:24,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:14:25,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:14:25,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:14:27,619 INFO [train.py:1046] (3/4) Epoch 48, batch 5200, loss[loss=0.2198, simple_loss=0.2853, pruned_loss=0.0771, over 19825.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2344, pruned_loss=0.03671, over 4708293.77 frames. ], batch size: 388, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:14:27,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:14:27,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:14:27,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:14:30,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:14:31,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:14:34,140 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.92 vs. limit=22.5 2023-10-04 15:14:36,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:14:39,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 15:14:41,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:14:41,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:14:43,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:14:44,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:14:44,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:14:46,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 15:14:46,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1699200.0, ans=0.125 2023-10-04 15:14:48,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:14:49,846 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.200e+02 2.450e+02 2.998e+02 4.674e+02, threshold=4.900e+02, percent-clipped=1.0 2023-10-04 15:14:49,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:14:51,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 15:14:55,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:14:55,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:14:56,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 15:14:57,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 15:14:57,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=1699266.6666666667, ans=0.0 2023-10-04 15:14:59,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 15:14:59,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:15:00,004 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 15:15:00,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:15:01,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:02,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:15:02,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 15:15:02,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:15:07,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:15:09,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 15:15:10,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 15:15:10,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 15:15:13,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 15:15:14,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:15:17,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1699333.3333333333, ans=0.0 2023-10-04 15:15:17,389 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.01 vs. limit=15.0 2023-10-04 15:15:21,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:15:21,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:15:22,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 15:15:22,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:15:24,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:15:24,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:24,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:15:26,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:15:28,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:15:31,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:15:33,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:15:33,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:37,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:15:38,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 15:15:39,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:15:40,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:15:40,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:15:41,714 INFO [train.py:1046] (3/4) Epoch 48, batch 5250, loss[loss=0.1408, simple_loss=0.2134, pruned_loss=0.03415, over 23651.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2341, pruned_loss=0.03639, over 4710619.30 frames. ], batch size: 135, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:15:41,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:15:41,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:15:44,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:15:46,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:15:48,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:15:49,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:15:51,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1699466.6666666667, ans=0.2 2023-10-04 15:15:54,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:15:55,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:15:59,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:16:00,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:16:02,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1699533.3333333333, ans=0.0 2023-10-04 15:16:04,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 15:16:05,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:16:07,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:16:12,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=1699600.0, ans=0.2 2023-10-04 15:16:20,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1699600.0, ans=0.125 2023-10-04 15:16:28,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1699666.6666666667, ans=0.125 2023-10-04 15:16:40,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1699733.3333333333, ans=0.0 2023-10-04 15:16:50,664 INFO [train.py:1046] (3/4) Epoch 48, batch 5300, loss[loss=0.1505, simple_loss=0.2393, pruned_loss=0.03084, over 24656.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.233, pruned_loss=0.03571, over 4705125.57 frames. ], batch size: 68, lr: 2.10e-03, grad_scale: 16.0 2023-10-04 15:17:04,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:17:04,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 15:17:04,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 15:17:04,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:17:05,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:05,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:05,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:05,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:17:05,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:05,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:17:05,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:17:05,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:17:05,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 15:17:05,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 15:17:05,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 15:17:05,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:17:05,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 15:17:05,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 15:17:06,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:06,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:17:06,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:17:06,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:17:06,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:17:07,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:17:07,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:17:07,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:07,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:17:07,375 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:17:07,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:17:07,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:07,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:17:07,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 15:17:07,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:17:08,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:17:08,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 15:17:08,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 15:17:08,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:17:08,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:17:08,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 15:17:08,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 15:17:08,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:17:09,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:17:09,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:17:09,767 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 15:17:09,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 15:17:09,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:17:09,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:17:10,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 15:17:10,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 15:17:10,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 15:17:10,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:17:16,736 INFO [train.py:1046] (3/4) Epoch 49, batch 0, loss[loss=0.1458, simple_loss=0.2407, pruned_loss=0.02543, over 24310.00 frames. ], tot_loss[loss=0.1458, simple_loss=0.2407, pruned_loss=0.02543, over 24310.00 frames. ], batch size: 74, lr: 2.08e-03, grad_scale: 32.0 2023-10-04 15:17:16,736 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 15:17:29,910 INFO [train.py:1078] (3/4) Epoch 49, validation: loss=0.3215, simple_loss=0.2741, pruned_loss=0.1844, over 1125622.00 frames. 2023-10-04 15:17:29,910 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 15:17:32,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 15:17:32,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:17:33,995 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.045e+02 2.321e+02 2.638e+02 8.969e+02, threshold=4.643e+02, percent-clipped=2.0 2023-10-04 15:17:34,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:17:38,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:17:38,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:17:39,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:39,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 15:17:41,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 15:17:42,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1699946.6666666667, ans=0.125 2023-10-04 15:17:43,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:43,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:48,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:17:48,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:17:49,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:17:49,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:17:51,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 15:17:54,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:18:02,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:18:02,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:18:05,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 15:18:05,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1700013.3333333333, ans=0.1 2023-10-04 15:18:11,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:18:11,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:18:12,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:18:15,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:18:15,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1700080.0, ans=0.1 2023-10-04 15:18:19,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:18:23,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 15:18:27,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 15:18:28,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:18:28,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:18:28,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:18:30,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:18:32,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 15:18:33,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.21 vs. limit=12.0 2023-10-04 15:18:34,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:18:35,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:18:38,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1700146.6666666667, ans=0.125 2023-10-04 15:18:39,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:18:42,446 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 15:18:42,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:18:43,770 INFO [train.py:1046] (3/4) Epoch 49, batch 50, loss[loss=0.1745, simple_loss=0.2447, pruned_loss=0.05219, over 23698.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2352, pruned_loss=0.03595, over 1061778.77 frames. ], batch size: 164, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:18:45,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:18:46,268 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.16 vs. limit=10.0 2023-10-04 15:18:47,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:18:47,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 15:18:49,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:18:50,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:18:51,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:18:53,467 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:18:54,051 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.52 vs. limit=15.0 2023-10-04 15:18:54,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:18:58,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 15:18:58,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:18:58,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1700280.0, ans=0.125 2023-10-04 15:19:05,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:19:06,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 15:19:07,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 15:19:09,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:19:09,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:19:09,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:19:09,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1700280.0, ans=0.0 2023-10-04 15:19:10,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:19:10,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:19:10,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1700280.0, ans=0.1 2023-10-04 15:19:12,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:19:12,161 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:19:18,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:19:20,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:19:21,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:19:22,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 15:19:25,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:19:25,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:19:27,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 15:19:27,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:19:29,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 15:19:32,532 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.16 vs. limit=22.5 2023-10-04 15:19:38,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:19:38,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:19:39,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:19:41,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:19:41,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:19:42,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 15:19:44,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 15:19:44,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:19:45,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:19:47,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:19:48,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:19:48,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 15:19:48,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 15:19:49,931 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 15:19:51,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:19:51,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:19:53,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 15:19:53,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 15:19:54,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:19:55,053 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.22 vs. limit=10.0 2023-10-04 15:19:55,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:19:56,870 INFO [train.py:1046] (3/4) Epoch 49, batch 100, loss[loss=0.2039, simple_loss=0.2772, pruned_loss=0.06533, over 19504.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2372, pruned_loss=0.03643, over 1873711.38 frames. ], batch size: 388, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:19:58,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 15:19:58,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:20:00,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:20:03,029 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.765e+02 2.158e+02 2.509e+02 3.581e+02 6.857e+02, threshold=5.017e+02, percent-clipped=12.0 2023-10-04 15:20:03,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:20:05,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:20:06,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 15:20:06,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:20:06,631 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:20:07,103 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=3.73 vs. limit=12.0 2023-10-04 15:20:09,281 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:20:09,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:20:10,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:20:10,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:20:10,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:20:11,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 15:20:14,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:20:14,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:20:14,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:20:14,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:20:17,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 15:20:18,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:20:20,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:20:20,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:20:22,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:20:25,734 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 15:20:27,064 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 15:20:27,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:20:27,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:20:30,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:20:33,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:20:33,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:20:38,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:20:38,931 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 15:20:41,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 15:20:43,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:20:44,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:20:48,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:20:51,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:20:51,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1700746.6666666667, ans=0.125 2023-10-04 15:20:55,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:20:56,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:20:59,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:21:01,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:03,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:21:03,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:21:03,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:21:04,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 15:21:06,221 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 15:21:06,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:21:06,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:21:08,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:08,119 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:21:08,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 15:21:08,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 15:21:10,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:21:10,042 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:10,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:11,384 INFO [train.py:1046] (3/4) Epoch 49, batch 150, loss[loss=0.1352, simple_loss=0.2154, pruned_loss=0.02751, over 23259.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2351, pruned_loss=0.03631, over 2508205.70 frames. ], batch size: 119, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:21:11,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:21:12,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:21:12,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:21:15,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:21:18,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:21:18,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:21:18,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:19,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:21:21,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:22,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:21:23,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:27,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 15:21:27,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 15:21:27,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 15:21:31,908 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:21:31,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:21:33,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:21:33,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:21:33,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:34,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:37,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:21:38,607 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 15:21:41,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:21:45,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:21:45,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.92 vs. limit=22.5 2023-10-04 15:21:50,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:21:51,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 15:21:54,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:21:54,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:21:54,891 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:21:57,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:21:59,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:22:00,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:22:01,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:01,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 15:22:06,591 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.11 vs. limit=15.0 2023-10-04 15:22:07,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:07,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:09,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:22:09,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:22:09,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:13,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 15:22:15,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:22:19,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:22:19,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1701146.6666666667, ans=0.125 2023-10-04 15:22:20,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:22:21,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:22:21,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 15:22:21,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:22:22,012 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 15:22:24,643 INFO [train.py:1046] (3/4) Epoch 49, batch 200, loss[loss=0.1289, simple_loss=0.2068, pruned_loss=0.02549, over 22167.00 frames. ], tot_loss[loss=0.1547, simple_loss=0.2359, pruned_loss=0.03676, over 3002558.82 frames. ], batch size: 48, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:22:26,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:22:28,012 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.88 vs. limit=22.5 2023-10-04 15:22:28,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:22:28,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:22:31,467 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.785e+02 2.091e+02 2.483e+02 2.820e+02 4.218e+02, threshold=4.965e+02, percent-clipped=0.0 2023-10-04 15:22:32,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 15:22:34,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:22:34,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:37,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 15:22:38,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:22:41,198 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.64 vs. limit=15.0 2023-10-04 15:22:41,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:42,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:22:44,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:22:44,938 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:22:46,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:22:50,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.05 vs. limit=15.0 2023-10-04 15:22:58,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1701346.6666666667, ans=0.1 2023-10-04 15:23:00,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1701346.6666666667, ans=0.125 2023-10-04 15:23:01,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:23:01,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:23:02,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:23:04,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:23:04,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 15:23:04,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:23:07,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:07,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:23:08,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:23:08,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:23:10,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 15:23:10,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 15:23:10,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:13,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:23:19,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:23:24,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1701480.0, ans=0.125 2023-10-04 15:23:27,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:28,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:23:35,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:37,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 15:23:38,499 INFO [train.py:1046] (3/4) Epoch 49, batch 250, loss[loss=0.1555, simple_loss=0.2385, pruned_loss=0.03621, over 24459.00 frames. ], tot_loss[loss=0.1551, simple_loss=0.2365, pruned_loss=0.03688, over 3386267.07 frames. ], batch size: 63, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:23:38,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:38,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:23:38,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:23:40,083 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:23:41,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 15:23:42,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:23:42,796 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 15:23:44,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:46,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:23:47,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:47,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:23:49,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:23:50,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:23:50,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1701546.6666666667, ans=0.0 2023-10-04 15:23:52,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:23:55,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:24:01,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1701613.3333333333, ans=0.5 2023-10-04 15:24:03,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:24:08,451 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:24:08,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:24:14,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:24:16,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:24:16,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:24:16,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:24:17,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:24:17,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:24:17,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:24:20,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:24:23,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 15:24:23,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:24:26,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:24:26,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:24:26,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:24:27,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:24:27,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:24:27,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:24:29,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:24:32,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:24:32,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:24:36,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:24:39,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1701813.3333333333, ans=0.125 2023-10-04 15:24:40,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:24:43,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:24:47,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:24:49,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:24:53,842 INFO [train.py:1046] (3/4) Epoch 49, batch 300, loss[loss=0.1428, simple_loss=0.2121, pruned_loss=0.03675, over 23386.00 frames. ], tot_loss[loss=0.1539, simple_loss=0.2346, pruned_loss=0.03657, over 3672764.37 frames. ], batch size: 285, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:24:53,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 15:24:53,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:24:54,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:24:58,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 15:24:58,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 15:24:59,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:24:59,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 15:25:00,861 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.248e+02 2.762e+02 3.176e+02 5.077e+02, threshold=5.525e+02, percent-clipped=1.0 2023-10-04 15:25:03,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:25:03,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:25:09,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:25:10,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 15:25:10,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:25:11,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.31 vs. limit=15.0 2023-10-04 15:25:11,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:25:11,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 15:25:11,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:25:11,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1701946.6666666667, ans=0.125 2023-10-04 15:25:16,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:25:20,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:25:20,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 15:25:24,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 15:25:25,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:26,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:25:29,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:29,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 15:25:29,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:25:30,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:25:32,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:25:33,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:25:37,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 15:25:37,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 15:25:39,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:25:41,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:43,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 15:25:43,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:25:44,665 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.70 vs. limit=6.0 2023-10-04 15:25:47,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:25:50,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:25:50,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 15:25:54,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1702146.6666666667, ans=0.125 2023-10-04 15:25:55,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:55,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:25:56,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:25:59,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:25:59,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 15:25:59,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:26:00,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:26:01,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 15:26:02,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:26:02,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:02,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1702146.6666666667, ans=0.09899494936611666 2023-10-04 15:26:03,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:26:03,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:03,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:08,419 INFO [train.py:1046] (3/4) Epoch 49, batch 350, loss[loss=0.1599, simple_loss=0.2328, pruned_loss=0.04344, over 23831.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2333, pruned_loss=0.03635, over 3900289.15 frames. ], batch size: 164, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:26:09,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:26:09,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 15:26:10,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1702213.3333333333, ans=0.125 2023-10-04 15:26:12,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:17,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:26:19,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=1702213.3333333333, ans=0.05 2023-10-04 15:26:21,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:21,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:22,486 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 15:26:23,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:26:25,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 15:26:26,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:28,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 15:26:28,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:26:32,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 15:26:33,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:26:35,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:26:37,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:26:37,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:26:37,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:26:38,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:26:38,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:38,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:26:41,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:26:41,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:48,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:26:48,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:26:50,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:26:50,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:55,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 15:26:55,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:26:59,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:26:59,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:27:00,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:27:01,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 15:27:04,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:06,172 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 15:27:06,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1702480.0, ans=0.0 2023-10-04 15:27:08,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 15:27:08,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:10,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:27:10,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 15:27:12,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:15,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:27:15,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:17,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:17,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:27:18,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:27:23,453 INFO [train.py:1046] (3/4) Epoch 49, batch 400, loss[loss=0.1626, simple_loss=0.2361, pruned_loss=0.04456, over 23389.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2328, pruned_loss=0.03566, over 4091654.64 frames. ], batch size: 285, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:27:23,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:27:26,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:27:27,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 15:27:27,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:27,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:27:29,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:27:30,603 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.007e+02 2.289e+02 2.741e+02 4.265e+02, threshold=4.578e+02, percent-clipped=0.0 2023-10-04 15:27:30,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:32,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:33,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:34,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 15:27:36,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 15:27:36,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:27:36,920 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.77 vs. limit=22.5 2023-10-04 15:27:37,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 15:27:37,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:40,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:27:40,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:27:42,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 15:27:42,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:27:42,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:27:42,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:27:42,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:27:43,574 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.48 vs. limit=15.0 2023-10-04 15:27:46,949 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 15:27:47,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 15:27:52,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:27:54,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:27:55,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 15:27:56,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 15:27:59,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:28:02,021 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:28:07,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 15:28:07,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1702746.6666666667, ans=0.125 2023-10-04 15:28:10,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:28:12,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 15:28:15,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:28:17,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:28:17,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 15:28:20,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:28:23,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:28:23,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:28:26,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:28:26,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 15:28:28,139 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:28:29,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 15:28:30,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:28:31,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:28:32,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 15:28:34,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:28:35,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:28:36,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:28:38,232 INFO [train.py:1046] (3/4) Epoch 49, batch 450, loss[loss=0.1575, simple_loss=0.2383, pruned_loss=0.03832, over 23512.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2332, pruned_loss=0.03584, over 4228512.06 frames. ], batch size: 105, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:28:38,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 15:28:38,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:28:39,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:28:39,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:28:39,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 15:28:41,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:28:41,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:28:43,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:28:54,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:28:54,837 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:28:56,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 15:28:58,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 15:28:59,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:29:01,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1702946.6666666667, ans=0.1 2023-10-04 15:29:02,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:29:03,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:29:06,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:29:08,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:29:10,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 15:29:10,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 15:29:12,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 15:29:12,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:29:14,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:29:14,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:29:17,508 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 15:29:17,518 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 15:29:17,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:29:20,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:29:20,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 15:29:23,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:29:25,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:29:25,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 15:29:26,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 15:29:29,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:29:31,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:29:32,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:29:32,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1703080.0, ans=0.0 2023-10-04 15:29:33,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 15:29:35,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1703080.0, ans=0.125 2023-10-04 15:29:37,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:29:37,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 15:29:39,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 15:29:41,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:29:45,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:29:46,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:29:48,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:29:49,410 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 15:29:52,632 INFO [train.py:1046] (3/4) Epoch 49, batch 500, loss[loss=0.1471, simple_loss=0.2383, pruned_loss=0.02793, over 24307.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2338, pruned_loss=0.03587, over 4339188.91 frames. ], batch size: 74, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:29:54,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:29:54,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:29:56,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:29:56,121 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 15:29:58,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 15:29:58,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:30:00,589 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.999e+02 2.217e+02 2.687e+02 4.030e+02, threshold=4.434e+02, percent-clipped=0.0 2023-10-04 15:30:02,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:30:02,892 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.81 vs. limit=15.0 2023-10-04 15:30:05,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 15:30:06,403 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:30:09,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:30:09,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:30:09,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:19,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:30:19,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:30:19,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:30:19,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:30:20,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 15:30:20,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:30:20,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1703346.6666666667, ans=0.125 2023-10-04 15:30:23,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:30:23,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:30:23,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:30:25,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:30:27,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 15:30:30,611 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 15:30:32,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1703346.6666666667, ans=0.0 2023-10-04 15:30:33,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:30:33,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:34,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:34,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:36,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:30:37,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 15:30:42,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:30:42,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:30:45,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:30:47,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:30:53,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:30:56,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 15:30:56,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:30:56,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:31:00,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 15:31:01,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:31:01,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:31:03,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1703480.0, ans=0.125 2023-10-04 15:31:07,460 INFO [train.py:1046] (3/4) Epoch 49, batch 550, loss[loss=0.1668, simple_loss=0.2505, pruned_loss=0.04157, over 23378.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2349, pruned_loss=0.03666, over 4416138.96 frames. ], batch size: 93, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:31:07,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 15:31:07,854 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:31:08,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 15:31:09,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:31:09,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 15:31:10,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:31:10,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:31:12,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:12,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:12,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:31:13,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:31:14,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:31:15,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=1703546.6666666667, ans=10.0 2023-10-04 15:31:16,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 15:31:16,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:31:20,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:31:20,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:23,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:31:24,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:27,864 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 15:31:29,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 15:31:31,646 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.59 vs. limit=6.0 2023-10-04 15:31:32,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:31:32,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1703613.3333333333, ans=0.0 2023-10-04 15:31:36,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:31:36,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:31:38,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:31:42,019 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.31 vs. limit=22.5 2023-10-04 15:31:42,787 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:31:42,792 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 15:31:42,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:31:45,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 15:31:46,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:31:48,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:31:48,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:31:49,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:31:51,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 15:31:52,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 15:31:53,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:31:53,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:31:53,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:31:53,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:31:57,327 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.05 vs. limit=15.0 2023-10-04 15:31:57,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:31:57,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:32:01,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:32:02,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:02,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 15:32:04,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:32:06,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:32:07,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:32:07,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:09,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:32:09,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 15:32:16,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 15:32:20,230 INFO [train.py:1046] (3/4) Epoch 49, batch 600, loss[loss=0.1539, simple_loss=0.2413, pruned_loss=0.03328, over 24572.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.234, pruned_loss=0.03622, over 4489780.95 frames. ], batch size: 71, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:32:20,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 15:32:21,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:32:21,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:32:21,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:32:27,154 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 1.961e+02 2.215e+02 2.451e+02 3.698e+02, threshold=4.430e+02, percent-clipped=0.0 2023-10-04 15:32:27,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:32:29,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:32:31,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 15:32:33,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:32:37,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:32:38,533 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:41,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 15:32:41,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:32:47,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 15:32:49,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:32:49,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:32:49,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:32:54,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:32:54,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:32:56,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:32:56,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1704013.3333333333, ans=0.125 2023-10-04 15:33:04,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1704080.0, ans=0.1 2023-10-04 15:33:05,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:33:06,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=1704080.0, ans=0.0 2023-10-04 15:33:07,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1704080.0, ans=0.125 2023-10-04 15:33:09,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:33:09,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:33:09,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:33:16,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 15:33:22,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:33:22,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:33:24,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 15:33:25,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:33:27,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1704146.6666666667, ans=0.125 2023-10-04 15:33:28,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 15:33:28,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1704146.6666666667, ans=0.125 2023-10-04 15:33:29,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:33:29,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:33:35,022 INFO [train.py:1046] (3/4) Epoch 49, batch 650, loss[loss=0.1436, simple_loss=0.2429, pruned_loss=0.02211, over 24358.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2334, pruned_loss=0.03588, over 4534393.41 frames. ], batch size: 74, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:33:36,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 15:33:36,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 15:33:40,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:33:41,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:33:44,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:33:45,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 15:33:47,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:33:51,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:33:51,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:33:54,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:33:58,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 15:33:59,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:33:59,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:34:02,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1704280.0, ans=0.2 2023-10-04 15:34:03,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:34:04,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 15:34:06,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:34:06,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:07,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:34:09,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:10,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:34:11,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1704346.6666666667, ans=0.04949747468305833 2023-10-04 15:34:12,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:34:12,248 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 15:34:14,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:34:14,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:34:16,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:17,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:34:18,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:34:18,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:34:18,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 15:34:19,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:34:19,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:34:21,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:34:21,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:34:22,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:34:25,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 15:34:26,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 15:34:26,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:26,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:34:26,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:34:28,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:34:28,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1704413.3333333333, ans=0.125 2023-10-04 15:34:29,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:34:36,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:37,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:34:39,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:34:41,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:34:41,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 15:34:41,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:34:47,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:34:47,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:34:47,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:34:47,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:34:49,852 INFO [train.py:1046] (3/4) Epoch 49, batch 700, loss[loss=0.1356, simple_loss=0.2072, pruned_loss=0.03202, over 23802.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2327, pruned_loss=0.03588, over 4567908.91 frames. ], batch size: 212, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:34:52,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 15:34:52,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 15:34:55,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 15:34:55,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:34:56,641 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.769e+02 2.097e+02 2.386e+02 2.709e+02 4.404e+02, threshold=4.772e+02, percent-clipped=0.0 2023-10-04 15:34:56,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:34:58,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.10 vs. limit=15.0 2023-10-04 15:34:59,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 15:35:04,285 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:35:07,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:35:08,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:35:10,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:35:10,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:35:14,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:35:15,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1704613.3333333333, ans=0.0 2023-10-04 15:35:16,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 15:35:16,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:35:17,108 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.30 vs. limit=6.0 2023-10-04 15:35:18,696 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.20 vs. limit=15.0 2023-10-04 15:35:19,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 15:35:20,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 15:35:24,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:35:24,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:35:25,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:35:29,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:35:32,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 15:35:36,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:35:36,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:35:36,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 15:35:40,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:35:42,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:35:43,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.95 vs. limit=15.0 2023-10-04 15:35:45,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:35:51,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:35:51,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 15:35:53,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 15:35:55,324 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 15:35:55,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1704813.3333333333, ans=0.125 2023-10-04 15:35:56,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:35:59,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:35:59,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:36:02,143 INFO [train.py:1046] (3/4) Epoch 49, batch 750, loss[loss=0.1445, simple_loss=0.2334, pruned_loss=0.02784, over 24540.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2323, pruned_loss=0.03557, over 4602046.71 frames. ], batch size: 71, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:36:02,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:36:02,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 15:36:04,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 15:36:05,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 15:36:05,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 15:36:07,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 15:36:07,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 15:36:07,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1704880.0, ans=0.125 2023-10-04 15:36:08,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:36:09,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 15:36:09,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:36:11,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:36:11,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=1704880.0, ans=0.125 2023-10-04 15:36:13,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:36:15,234 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:36:16,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:36:16,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:36:19,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:36:19,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:36:20,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:36:22,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:36:22,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:36:23,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 15:36:24,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:36:24,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:36:27,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:36:27,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:36:29,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 15:36:29,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:36:32,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 15:36:32,900 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 15:36:34,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 15:36:34,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:36:34,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 15:36:34,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1705013.3333333333, ans=0.125 2023-10-04 15:36:35,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:36:40,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:36:40,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:36:41,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:36:45,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:36:47,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:36:47,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 15:36:48,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:36:48,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 15:36:49,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.49 vs. limit=22.5 2023-10-04 15:36:49,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:36:52,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:36:52,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 15:36:53,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:36:56,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1705080.0, ans=0.0 2023-10-04 15:36:58,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:36:59,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:36:59,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:02,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:37:07,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 15:37:07,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:37:08,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:37:10,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:37:11,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:37:13,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:37:13,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:37:17,006 INFO [train.py:1046] (3/4) Epoch 49, batch 800, loss[loss=0.161, simple_loss=0.2412, pruned_loss=0.04045, over 23316.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2329, pruned_loss=0.03545, over 4637819.69 frames. ], batch size: 93, lr: 2.08e-03, grad_scale: 32.0 2023-10-04 15:37:21,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:37:21,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:22,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:37:22,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:37:23,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1705213.3333333333, ans=0.125 2023-10-04 15:37:24,005 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.062e+02 2.313e+02 2.836e+02 4.157e+02, threshold=4.626e+02, percent-clipped=0.0 2023-10-04 15:37:24,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:24,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:25,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:28,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:37:29,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:37:32,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 15:37:34,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:36,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:37:36,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:37:37,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:37:37,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 15:37:37,582 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:37:37,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 15:37:38,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.48 vs. limit=15.0 2023-10-04 15:37:42,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:37:45,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:37:48,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:37:48,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:37:51,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:51,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:37:55,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:37:56,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:37:56,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 15:37:59,292 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 15:37:59,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 15:37:59,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:37:59,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:38:00,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:38:00,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1705413.3333333333, ans=0.0 2023-10-04 15:38:01,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:38:06,626 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 15:38:06,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 15:38:08,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:38:11,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:38:16,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:38:16,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=1705480.0, ans=0.125 2023-10-04 15:38:16,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1705480.0, ans=0.2 2023-10-04 15:38:16,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1705480.0, ans=0.125 2023-10-04 15:38:19,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:38:19,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 15:38:21,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:38:23,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 15:38:28,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:38:30,716 INFO [train.py:1046] (3/4) Epoch 49, batch 850, loss[loss=0.1626, simple_loss=0.252, pruned_loss=0.03662, over 24434.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2339, pruned_loss=0.03554, over 4674052.39 frames. ], batch size: 69, lr: 2.08e-03, grad_scale: 32.0 2023-10-04 15:38:30,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:38:30,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 15:38:30,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:38:31,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1705546.6666666667, ans=0.05 2023-10-04 15:38:32,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:38:32,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 15:38:32,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:38:34,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:38:36,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:38:38,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:38:39,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:38:41,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 15:38:41,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 15:38:41,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 15:38:41,799 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:38:44,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:38:45,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:38:47,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:38:47,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:38:48,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:38:53,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:38:54,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:38:54,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 15:38:56,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 15:39:00,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:39:01,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 15:39:06,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 15:39:07,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 15:39:09,085 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 15:39:09,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:39:09,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:39:09,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 15:39:12,484 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:39:14,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:39:14,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 15:39:17,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:39:17,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:39:18,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:39:18,930 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:39:20,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:39:20,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 15:39:21,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 15:39:25,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:39:25,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:39:26,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:39:26,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:39:26,933 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.55 vs. limit=15.0 2023-10-04 15:39:27,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:39:29,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:39:32,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:39:33,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:39:33,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1705813.3333333333, ans=0.125 2023-10-04 15:39:34,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:39:34,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:39:42,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 15:39:44,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:39:45,386 INFO [train.py:1046] (3/4) Epoch 49, batch 900, loss[loss=0.1516, simple_loss=0.2411, pruned_loss=0.03105, over 24659.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2346, pruned_loss=0.03604, over 4685367.26 frames. ], batch size: 68, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:39:45,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 15:39:45,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:39:45,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1705880.0, ans=0.0 2023-10-04 15:39:46,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:39:48,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 15:39:53,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:39:55,910 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.096e+02 2.361e+02 2.721e+02 4.850e+02, threshold=4.722e+02, percent-clipped=1.0 2023-10-04 15:39:56,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:39:57,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 15:40:00,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:40:00,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 15:40:02,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 15:40:02,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.08 vs. limit=15.0 2023-10-04 15:40:03,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:40:03,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:40:03,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 15:40:03,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:40:09,601 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.25 vs. limit=15.0 2023-10-04 15:40:13,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:13,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:40:13,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:40:16,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:40:20,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 15:40:21,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:40:26,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:40:26,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:40:26,526 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 15:40:28,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 15:40:30,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1706080.0, ans=0.0 2023-10-04 15:40:31,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:40:31,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:40:31,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1706080.0, ans=0.125 2023-10-04 15:40:32,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:40:37,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.39 vs. limit=15.0 2023-10-04 15:40:38,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:38,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:40:43,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 15:40:43,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:40:44,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 15:40:46,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:40:46,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:46,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:40:46,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:40:47,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1706146.6666666667, ans=0.0 2023-10-04 15:40:51,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 15:40:51,054 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 15:40:52,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 15:40:52,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 15:40:56,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:40:59,276 INFO [train.py:1046] (3/4) Epoch 49, batch 950, loss[loss=0.1584, simple_loss=0.2529, pruned_loss=0.03196, over 24330.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2351, pruned_loss=0.0363, over 4684100.15 frames. ], batch size: 74, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:41:00,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 15:41:05,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:41:06,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:07,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:07,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:41:11,209 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 15:41:14,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:14,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:41:15,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:41:15,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:41:15,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 15:41:18,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:41:19,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:22,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 15:41:22,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:41:22,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1706280.0, ans=0.05 2023-10-04 15:41:25,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:25,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:41:25,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:41:27,396 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.94 vs. limit=10.0 2023-10-04 15:41:28,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 15:41:31,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 15:41:33,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:41:34,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:41:39,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:41:39,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:41:41,548 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.84 vs. limit=5.0 2023-10-04 15:41:43,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 15:41:45,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 15:41:45,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:41:45,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:41:47,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:47,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:41:47,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1706413.3333333333, ans=0.1 2023-10-04 15:41:51,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 15:41:51,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:41:55,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:41:55,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:41:55,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 15:41:55,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:41:56,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:41:56,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 15:42:00,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:42:02,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1706480.0, ans=0.125 2023-10-04 15:42:04,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:42:07,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1706480.0, ans=0.125 2023-10-04 15:42:08,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:42:08,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 15:42:08,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 15:42:12,430 INFO [train.py:1046] (3/4) Epoch 49, batch 1000, loss[loss=0.1495, simple_loss=0.2159, pruned_loss=0.04155, over 23703.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2341, pruned_loss=0.0359, over 4697334.69 frames. ], batch size: 232, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:42:12,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:42:16,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 15:42:16,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1706546.6666666667, ans=0.0 2023-10-04 15:42:17,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:42:18,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1706546.6666666667, ans=0.125 2023-10-04 15:42:21,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:42:23,145 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.037e+02 2.263e+02 2.669e+02 4.122e+02, threshold=4.525e+02, percent-clipped=0.0 2023-10-04 15:42:23,232 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 15:42:23,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 15:42:26,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:42:27,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:42:28,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:42:31,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 15:42:33,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1706613.3333333333, ans=0.1 2023-10-04 15:42:34,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 15:42:36,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 15:42:37,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:42:41,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 15:42:42,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 15:42:43,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 15:42:44,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:42:46,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:42:53,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:42:55,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:42:56,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:42:56,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:42:56,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 15:42:56,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:42:58,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:42:58,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:42:59,526 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 15:42:59,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1706746.6666666667, ans=0.125 2023-10-04 15:43:03,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 15:43:04,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 15:43:06,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 15:43:07,118 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.27 vs. limit=22.5 2023-10-04 15:43:08,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:43:08,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1706746.6666666667, ans=0.125 2023-10-04 15:43:15,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:15,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:43:16,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:17,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:43:18,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 15:43:20,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:43:20,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 15:43:20,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1706813.3333333333, ans=0.0 2023-10-04 15:43:21,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 15:43:21,895 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:43:21,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:43:24,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:43:27,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:43:28,544 INFO [train.py:1046] (3/4) Epoch 49, batch 1050, loss[loss=0.1381, simple_loss=0.2159, pruned_loss=0.03013, over 23513.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2325, pruned_loss=0.03584, over 4694928.12 frames. ], batch size: 134, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:43:28,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:43:31,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:43:32,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:43:35,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:43:35,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:38,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:43:41,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 15:43:43,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 15:43:45,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:43:45,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:43:46,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:43:46,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:43:46,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 15:43:48,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:43:48,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 15:43:51,327 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:43:51,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 15:43:51,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 15:43:58,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:43:58,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1707013.3333333333, ans=0.0 2023-10-04 15:43:59,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:43:59,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:43:59,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1707013.3333333333, ans=0.125 2023-10-04 15:44:02,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 15:44:02,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 15:44:02,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:44:06,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 15:44:10,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 15:44:10,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:44:14,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 15:44:16,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 15:44:16,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:44:16,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:44:21,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:44:25,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 15:44:25,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 15:44:26,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 15:44:27,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:44:27,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:44:28,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=1707146.6666666667, ans=0.95 2023-10-04 15:44:29,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 15:44:32,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:44:32,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=1707146.6666666667, ans=0.0 2023-10-04 15:44:33,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:44:34,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:44:35,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:44:35,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:44:39,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:44:39,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 15:44:41,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 15:44:41,328 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 15:44:41,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 15:44:42,565 INFO [train.py:1046] (3/4) Epoch 49, batch 1100, loss[loss=0.1541, simple_loss=0.2421, pruned_loss=0.03308, over 24615.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2318, pruned_loss=0.03572, over 4706890.07 frames. ], batch size: 73, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:44:42,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:44:44,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:44:48,984 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:44:53,339 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.768e+02 2.066e+02 2.358e+02 2.772e+02 6.166e+02, threshold=4.716e+02, percent-clipped=1.0 2023-10-04 15:44:53,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:44:54,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:44:54,848 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:44:56,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 15:44:57,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:44:59,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 15:45:03,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:45:05,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:45:07,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 15:45:07,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 15:45:09,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:45:09,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:45:11,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:45:13,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 15:45:14,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1707346.6666666667, ans=0.125 2023-10-04 15:45:19,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:45:23,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 15:45:24,464 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 15:45:24,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:26,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:26,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1707413.3333333333, ans=0.125 2023-10-04 15:45:27,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:45:28,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:45:30,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 15:45:30,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:45:30,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:45:30,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:45:31,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:31,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 15:45:35,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:45:35,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 15:45:38,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1707413.3333333333, ans=0.09899494936611666 2023-10-04 15:45:39,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:45:44,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:45:44,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1707480.0, ans=0.125 2023-10-04 15:45:47,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 15:45:47,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 15:45:49,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:45:50,579 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:45:50,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:45:52,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 15:45:53,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:45:53,729 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:45:53,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 15:45:55,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:45:55,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 15:45:56,458 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.34 vs. limit=15.0 2023-10-04 15:45:56,832 INFO [train.py:1046] (3/4) Epoch 49, batch 1150, loss[loss=0.1622, simple_loss=0.23, pruned_loss=0.04719, over 23834.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2322, pruned_loss=0.03563, over 4705468.56 frames. ], batch size: 150, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:45:56,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:45:56,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:45:58,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:46:03,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:46:05,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:46:06,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:46:08,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:46:08,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 15:46:08,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:46:10,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 15:46:12,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:46:12,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:46:16,655 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.27 vs. limit=15.0 2023-10-04 15:46:18,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 15:46:20,122 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:46:23,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:46:24,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:46:24,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 15:46:26,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:46:26,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:46:29,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 15:46:30,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:46:32,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:46:40,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:46:48,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:46:48,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 15:46:49,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:46:49,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:46:52,368 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:46:54,987 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 15:46:57,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:47:02,614 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 15:47:05,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:06,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:47:06,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:47:08,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:47:09,611 INFO [train.py:1046] (3/4) Epoch 49, batch 1200, loss[loss=0.158, simple_loss=0.2369, pruned_loss=0.03956, over 23348.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2331, pruned_loss=0.03627, over 4693788.85 frames. ], batch size: 105, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:47:11,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:47:17,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 15:47:17,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:47:19,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:47:19,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:19,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:47:20,502 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.013e+02 2.245e+02 2.650e+02 4.852e+02, threshold=4.489e+02, percent-clipped=1.0 2023-10-04 15:47:20,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:47:22,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:47:23,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:47:23,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:47:24,987 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 15:47:28,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 15:47:28,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=1707946.6666666667, ans=0.125 2023-10-04 15:47:31,976 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.24 vs. limit=6.0 2023-10-04 15:47:32,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:47:35,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:47:35,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1707946.6666666667, ans=0.0 2023-10-04 15:47:38,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:47:39,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:47:39,662 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 15:47:39,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:42,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1708013.3333333333, ans=0.125 2023-10-04 15:47:48,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 15:47:48,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:47:48,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 15:47:49,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:47:53,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 15:47:56,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 15:47:56,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:47:57,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.16 vs. limit=10.0 2023-10-04 15:47:58,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:47:59,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:48:01,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:48:01,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:48:02,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:48:02,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:48:02,573 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 15:48:03,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:48:05,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:48:05,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 15:48:08,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:48:08,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:48:09,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1708146.6666666667, ans=0.0 2023-10-04 15:48:13,565 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:48:14,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:48:16,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 15:48:22,511 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 15:48:23,777 INFO [train.py:1046] (3/4) Epoch 49, batch 1250, loss[loss=0.1648, simple_loss=0.251, pruned_loss=0.03929, over 23388.00 frames. ], tot_loss[loss=0.1538, simple_loss=0.2342, pruned_loss=0.03671, over 4692930.96 frames. ], batch size: 93, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:48:23,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:48:25,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:48:28,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:48:31,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:48:33,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 15:48:36,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=1708213.3333333333, ans=0.2 2023-10-04 15:48:37,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:48:38,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:48:39,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 15:48:40,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:48:41,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:48:42,900 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.69 vs. limit=6.0 2023-10-04 15:48:44,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 15:48:44,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:48:46,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:48:46,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:48:49,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 15:48:52,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 15:48:52,018 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:48:53,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:48:53,321 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:48:54,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:48:54,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1708346.6666666667, ans=0.125 2023-10-04 15:48:57,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:48:59,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 15:49:03,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1708346.6666666667, ans=0.1 2023-10-04 15:49:04,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 15:49:04,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:49:06,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1708346.6666666667, ans=0.09899494936611666 2023-10-04 15:49:07,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:49:07,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 15:49:09,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:49:09,169 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 15:49:09,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:10,463 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:13,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:49:15,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:49:17,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:49:18,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 15:49:18,856 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 15:49:18,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 15:49:19,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1708413.3333333333, ans=0.04949747468305833 2023-10-04 15:49:19,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1708413.3333333333, ans=0.0 2023-10-04 15:49:22,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:49:23,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 15:49:23,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:25,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 15:49:25,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:49:29,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 15:49:29,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 15:49:29,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:49:29,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 15:49:30,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:49:32,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 15:49:35,532 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:49:36,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:49:37,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:49:38,256 INFO [train.py:1046] (3/4) Epoch 49, batch 1300, loss[loss=0.1365, simple_loss=0.2074, pruned_loss=0.03284, over 22831.00 frames. ], tot_loss[loss=0.1542, simple_loss=0.2347, pruned_loss=0.03689, over 4690269.01 frames. ], batch size: 50, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:49:41,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 15:49:42,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:49:43,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 15:49:47,690 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.055e+02 2.239e+02 2.568e+02 3.936e+02, threshold=4.477e+02, percent-clipped=0.0 2023-10-04 15:49:49,062 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:49:49,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 15:49:51,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:49:51,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:49:52,641 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 15:49:53,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 15:49:58,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:49:59,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:49:59,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=1708613.3333333333, ans=0.125 2023-10-04 15:50:01,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 15:50:01,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1708613.3333333333, ans=0.125 2023-10-04 15:50:04,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:50:07,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:50:07,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1708680.0, ans=0.125 2023-10-04 15:50:08,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:50:10,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:50:10,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:50:11,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:50:11,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 15:50:13,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 15:50:13,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=1708680.0, ans=0.05 2023-10-04 15:50:18,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:50:18,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 15:50:19,371 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 15:50:19,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 15:50:22,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:50:23,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:50:24,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 15:50:26,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:50:26,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 15:50:27,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:50:31,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:50:31,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:50:34,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 15:50:35,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 15:50:35,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1708813.3333333333, ans=0.125 2023-10-04 15:50:36,791 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 15:50:37,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1708813.3333333333, ans=0.1 2023-10-04 15:50:40,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:50:43,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 15:50:45,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:50:50,407 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-10-04 15:50:51,099 INFO [train.py:1046] (3/4) Epoch 49, batch 1350, loss[loss=0.1591, simple_loss=0.2479, pruned_loss=0.03521, over 24465.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2341, pruned_loss=0.03662, over 4697435.10 frames. ], batch size: 77, lr: 2.08e-03, grad_scale: 16.0 2023-10-04 15:50:52,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 15:50:54,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1708880.0, ans=0.09899494936611666 2023-10-04 15:50:55,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:50:57,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:50:57,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1708880.0, ans=0.1 2023-10-04 15:51:00,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:51:00,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:51:03,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:51:03,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:51:08,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 15:51:09,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 15:51:10,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:51:12,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:51:15,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 15:51:15,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:51:16,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:51:16,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 15:51:19,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 15:51:20,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=1709013.3333333333, ans=0.0 2023-10-04 15:51:21,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 15:51:21,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:21,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 15:51:31,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:41,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:51:43,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:51:43,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 15:51:44,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:51:44,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 15:51:46,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 15:51:46,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:51:47,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:51:50,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 15:51:52,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:51:52,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1709146.6666666667, ans=0.125 2023-10-04 15:51:53,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=1709146.6666666667, ans=0.5 2023-10-04 15:51:58,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 15:51:59,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 15:52:02,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1709146.6666666667, ans=0.1 2023-10-04 15:52:05,608 INFO [train.py:1046] (3/4) Epoch 49, batch 1400, loss[loss=0.1468, simple_loss=0.2259, pruned_loss=0.03381, over 23514.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.232, pruned_loss=0.03621, over 4700982.96 frames. ], batch size: 120, lr: 2.08e-03, grad_scale: 8.0 2023-10-04 15:52:07,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 15:52:08,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:52:11,286 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:52:12,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:52:15,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=1709213.3333333333, ans=0.07 2023-10-04 15:52:16,623 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.734e+02 2.032e+02 2.354e+02 2.703e+02 4.112e+02, threshold=4.708e+02, percent-clipped=0.0 2023-10-04 15:52:18,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 15:52:19,541 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 15:52:30,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:52:30,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:52:33,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:52:33,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 15:52:36,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:52:38,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 15:52:41,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1709346.6666666667, ans=0.1 2023-10-04 15:52:46,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:52:46,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:52:48,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1709413.3333333333, ans=0.0 2023-10-04 15:52:50,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 15:52:51,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:52:52,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:52:54,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:52:54,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:52:55,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 15:52:55,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:52:55,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:52:55,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1709413.3333333333, ans=0.0 2023-10-04 15:52:58,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 15:52:58,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:53:02,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:04,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1709480.0, ans=0.0 2023-10-04 15:53:06,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:53:07,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=8.16 vs. limit=12.0 2023-10-04 15:53:11,990 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 15:53:13,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 15:53:13,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:53:13,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1709480.0, ans=0.0 2023-10-04 15:53:16,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 15:53:16,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:53:18,649 INFO [train.py:1046] (3/4) Epoch 49, batch 1450, loss[loss=0.1438, simple_loss=0.216, pruned_loss=0.03582, over 23572.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2313, pruned_loss=0.03559, over 4696896.45 frames. ], batch size: 256, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:53:18,701 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:53:23,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 15:53:23,479 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:53:23,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:23,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 15:53:31,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:53:31,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 15:53:32,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:53:32,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 15:53:34,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 15:53:35,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 15:53:35,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:35,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:35,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 15:53:37,058 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:53:38,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 15:53:39,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 15:53:39,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:40,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:53:42,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:44,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:47,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:53:47,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:53:48,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:53:48,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:50,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:53:52,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:53:52,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:53:52,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:53:56,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 15:53:56,865 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.42 vs. limit=15.0 2023-10-04 15:53:58,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:54:02,386 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 15:54:03,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:54:03,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1709746.6666666667, ans=0.0 2023-10-04 15:54:04,287 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=15.0 2023-10-04 15:54:05,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 15:54:06,490 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:54:06,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 15:54:11,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:54:12,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 15:54:13,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 15:54:13,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:54:17,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:54:19,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:54:21,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 15:54:22,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 15:54:23,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 15:54:25,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:54:25,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 15:54:32,983 INFO [train.py:1046] (3/4) Epoch 49, batch 1500, loss[loss=0.1398, simple_loss=0.2325, pruned_loss=0.02349, over 24671.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2314, pruned_loss=0.0355, over 4702623.42 frames. ], batch size: 68, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:54:37,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 15:54:37,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 15:54:37,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:54:38,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:54:38,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1709880.0, ans=0.0 2023-10-04 15:54:39,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:54:41,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 15:54:41,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 15:54:43,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 15:54:43,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1709880.0, ans=0.0 2023-10-04 15:54:44,516 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.778e+02 2.098e+02 2.317e+02 2.688e+02 4.133e+02, threshold=4.633e+02, percent-clipped=0.0 2023-10-04 15:54:44,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 15:54:44,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:54:44,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:54:46,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:54:47,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:54:48,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.43 vs. limit=15.0 2023-10-04 15:54:53,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:54:53,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 15:54:54,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:54:54,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:54:56,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:54:56,853 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.70 vs. limit=15.0 2023-10-04 15:54:57,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 15:54:59,754 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.73 vs. limit=15.0 2023-10-04 15:55:00,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 15:55:03,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:55:04,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 15:55:06,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 15:55:10,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:55:10,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:55:10,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:55:12,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 15:55:13,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:55:13,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:55:14,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 15:55:15,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:55:21,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 15:55:21,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 15:55:25,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 15:55:27,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 15:55:27,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1710080.0, ans=0.2 2023-10-04 15:55:30,327 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 15:55:31,836 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 15:55:31,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:31,880 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 15:55:33,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:55:35,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:55:35,241 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 15:55:36,674 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 15:55:40,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 15:55:41,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:44,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:55:44,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:45,891 INFO [train.py:1046] (3/4) Epoch 49, batch 1550, loss[loss=0.1451, simple_loss=0.2363, pruned_loss=0.02694, over 24662.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2325, pruned_loss=0.03594, over 4707973.41 frames. ], batch size: 73, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:55:45,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:55:45,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:55:46,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 15:55:47,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 15:55:49,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 15:55:49,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:55:49,513 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 15:55:50,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 15:55:53,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:55:55,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:55:55,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:55:56,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:55:57,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:55:58,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:56:01,153 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 15:56:01,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:01,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 15:56:01,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 15:56:04,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 15:56:04,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 15:56:05,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:56:05,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 15:56:07,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 15:56:07,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 15:56:07,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:09,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:56:09,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1710280.0, ans=0.0 2023-10-04 15:56:12,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:56:13,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 15:56:13,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 15:56:22,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:56:25,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:56:25,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 15:56:25,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:56:26,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 15:56:30,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1710413.3333333333, ans=0.2 2023-10-04 15:56:31,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 15:56:33,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:34,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:56:37,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:56:37,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:56:37,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 15:56:37,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:56:40,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:56:40,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:40,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 15:56:40,420 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 15:56:44,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:56:47,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 15:56:48,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1710480.0, ans=0.2 2023-10-04 15:56:52,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:56:54,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:56:54,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 15:56:54,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1710480.0, ans=0.125 2023-10-04 15:56:55,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:56:55,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1710480.0, ans=0.5 2023-10-04 15:56:57,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:56:57,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:56:57,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:56:58,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 15:57:01,264 INFO [train.py:1046] (3/4) Epoch 49, batch 1600, loss[loss=0.1592, simple_loss=0.2344, pruned_loss=0.042, over 23704.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.234, pruned_loss=0.03713, over 4690898.34 frames. ], batch size: 232, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 15:57:03,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:03,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 15:57:03,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 15:57:06,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 15:57:08,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1710546.6666666667, ans=0.0 2023-10-04 15:57:10,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:57:11,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 15:57:11,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:57:13,214 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.000e+02 2.241e+02 2.538e+02 3.324e+02, threshold=4.482e+02, percent-clipped=0.0 2023-10-04 15:57:14,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:57:19,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:57:22,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 15:57:25,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:57:25,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 15:57:25,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:26,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 15:57:33,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 15:57:44,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:57:44,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 15:57:45,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:57:45,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:57:45,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 15:57:48,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 15:57:51,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 15:57:53,384 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:57:53,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:54,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:57:56,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 15:57:58,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 15:57:59,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 15:58:00,926 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 15:58:06,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:58:08,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:58:09,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 15:58:09,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 15:58:11,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 15:58:11,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1710813.3333333333, ans=0.2 2023-10-04 15:58:15,896 INFO [train.py:1046] (3/4) Epoch 49, batch 1650, loss[loss=0.1557, simple_loss=0.2442, pruned_loss=0.03356, over 24635.00 frames. ], tot_loss[loss=0.1546, simple_loss=0.2348, pruned_loss=0.03716, over 4696148.43 frames. ], batch size: 68, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:58:17,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:58:17,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:58:18,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 15:58:18,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 15:58:18,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 15:58:18,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 15:58:20,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 15:58:24,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 15:58:24,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:58:26,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:58:26,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 15:58:27,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:58:29,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 15:58:32,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:58:32,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:58:32,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:58:32,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 15:58:35,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 15:58:35,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 15:58:39,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 15:58:41,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 15:58:50,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 15:58:51,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:58:52,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 15:58:55,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:59:00,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 15:59:00,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 15:59:02,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:59:03,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 15:59:03,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:59:05,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:59:06,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:59:06,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:59:06,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:59:06,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=1711080.0, ans=0.2 2023-10-04 15:59:07,160 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.75 vs. limit=15.0 2023-10-04 15:59:07,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:59:09,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 15:59:12,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 15:59:13,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 15:59:15,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 15:59:15,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 15:59:17,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 15:59:17,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 15:59:18,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 15:59:19,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 15:59:19,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:59:20,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 15:59:20,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 15:59:24,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 15:59:25,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 15:59:26,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:59:28,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 15:59:29,736 INFO [train.py:1046] (3/4) Epoch 49, batch 1700, loss[loss=0.1597, simple_loss=0.2425, pruned_loss=0.03851, over 23269.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2342, pruned_loss=0.03649, over 4711862.31 frames. ], batch size: 93, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 15:59:33,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 15:59:33,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 15:59:33,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 15:59:35,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:59:35,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 15:59:35,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:59:37,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 15:59:37,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 15:59:37,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 15:59:39,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=1711213.3333333333, ans=0.125 2023-10-04 15:59:40,612 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 15:59:44,087 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.787e+02 2.122e+02 2.407e+02 2.844e+02 4.213e+02, threshold=4.814e+02, percent-clipped=0.0 2023-10-04 15:59:47,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 15:59:50,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 15:59:50,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1711280.0, ans=0.125 2023-10-04 15:59:55,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 15:59:55,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 15:59:55,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 15:59:55,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 15:59:58,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 16:00:00,566 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:00:00,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:01,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:00:02,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:00:05,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 16:00:05,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 16:00:08,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:08,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1711346.6666666667, ans=0.125 2023-10-04 16:00:09,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 16:00:11,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:00:15,138 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.47 vs. limit=15.0 2023-10-04 16:00:18,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:00:19,323 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.32 vs. limit=15.0 2023-10-04 16:00:20,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:00:20,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:00:22,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:00:22,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 16:00:22,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:00:25,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:25,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 16:00:25,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1711413.3333333333, ans=0.125 2023-10-04 16:00:26,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:00:26,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:00:28,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:28,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:00:28,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1711480.0, ans=0.0 2023-10-04 16:00:28,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1711480.0, ans=0.125 2023-10-04 16:00:30,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:00:30,254 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:00:31,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:00:31,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:00:32,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:00:35,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:00:35,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 16:00:39,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:00:39,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1711480.0, ans=0.125 2023-10-04 16:00:40,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:00:41,457 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=11.82 vs. limit=15.0 2023-10-04 16:00:42,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 16:00:45,051 INFO [train.py:1046] (3/4) Epoch 49, batch 1750, loss[loss=0.1474, simple_loss=0.2276, pruned_loss=0.03358, over 24335.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2334, pruned_loss=0.03668, over 4709028.16 frames. ], batch size: 61, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:00:48,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:00:49,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:00:50,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:00:50,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 16:00:51,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.67 vs. limit=15.0 2023-10-04 16:00:52,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:00:53,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1711546.6666666667, ans=0.1 2023-10-04 16:00:55,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:00:55,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:00:59,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1711613.3333333333, ans=0.0 2023-10-04 16:01:00,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 16:01:02,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:01:03,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 16:01:03,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:01:07,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:01:08,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1711613.3333333333, ans=0.125 2023-10-04 16:01:09,214 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.40 vs. limit=15.0 2023-10-04 16:01:09,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 16:01:10,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 16:01:11,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:01:13,305 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 16:01:20,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:01:21,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:01:21,927 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:01:24,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:24,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:01:27,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:01:27,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:30,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:01:30,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:01:32,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 16:01:34,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:01:36,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 16:01:37,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:01:40,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:01:40,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:01:41,694 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:01:46,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:01:46,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 16:01:47,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:01:48,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:01:53,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:01:55,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:01:57,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:01:57,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 16:01:57,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:01:58,779 INFO [train.py:1046] (3/4) Epoch 49, batch 1800, loss[loss=0.1345, simple_loss=0.2135, pruned_loss=0.02774, over 24435.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2331, pruned_loss=0.03657, over 4713344.81 frames. ], batch size: 58, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:01:58,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:01:58,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:01:58,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:01:58,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:01:58,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:02:03,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:02:04,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:02:06,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:02:09,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:02:11,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:02:12,224 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.071e+02 2.300e+02 2.752e+02 3.980e+02, threshold=4.601e+02, percent-clipped=0.0 2023-10-04 16:02:12,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:02:14,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:02:15,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:02:15,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:02:17,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:02:17,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1711946.6666666667, ans=0.0 2023-10-04 16:02:20,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:02:20,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 16:02:21,372 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:02:21,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1711946.6666666667, ans=0.125 2023-10-04 16:02:25,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:02:29,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 16:02:32,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 16:02:32,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 16:02:32,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:02:34,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:02:34,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:02:35,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:02:39,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1712013.3333333333, ans=0.125 2023-10-04 16:02:43,286 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 16:02:44,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:02:46,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:02:48,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 16:02:48,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 16:02:49,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:02:49,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:02:51,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:02:55,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 16:03:00,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:03:00,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 16:03:02,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:03:02,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:03:02,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:03:03,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 16:03:06,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:03:06,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:03:09,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 16:03:09,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:03:12,564 INFO [train.py:1046] (3/4) Epoch 49, batch 1850, loss[loss=0.1571, simple_loss=0.2333, pruned_loss=0.04042, over 23874.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2336, pruned_loss=0.03649, over 4721501.90 frames. ], batch size: 195, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:03:12,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:03:12,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:03:12,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:03:15,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:03:15,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:03:18,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:03:18,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:03:19,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1712213.3333333333, ans=0.0 2023-10-04 16:03:21,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:03:22,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:03:28,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:03:28,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 16:03:31,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 16:03:31,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1712280.0, ans=0.1 2023-10-04 16:03:34,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1712280.0, ans=0.0 2023-10-04 16:03:34,760 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.66 vs. limit=22.5 2023-10-04 16:03:35,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 16:03:36,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1712280.0, ans=0.125 2023-10-04 16:03:38,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:03:38,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 16:03:38,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 16:03:45,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:03:47,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 16:03:50,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:03:50,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:03:54,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 16:03:54,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:03:54,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:03:56,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:03:57,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1712413.3333333333, ans=0.125 2023-10-04 16:03:59,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:04:00,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:04:04,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:04:04,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:05,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 16:04:05,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:04:07,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:04:08,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:04:11,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 16:04:11,699 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:04:14,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:04:14,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:04:14,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 16:04:14,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 16:04:18,367 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 16:04:18,448 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 16:04:19,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:04:20,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:04:21,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:04:21,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:21,358 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 16:04:22,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:04:22,592 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:23,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:04:25,282 INFO [train.py:1046] (3/4) Epoch 49, batch 1900, loss[loss=0.1456, simple_loss=0.2269, pruned_loss=0.0321, over 23315.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.2343, pruned_loss=0.0372, over 4691358.56 frames. ], batch size: 105, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:04:25,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:04:26,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:04:26,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 16:04:28,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:04:29,923 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 16:04:29,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:04:30,197 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:04:31,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:04:35,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:04:36,131 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.35 vs. limit=15.0 2023-10-04 16:04:38,342 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.723e+02 2.070e+02 2.219e+02 2.494e+02 3.485e+02, threshold=4.439e+02, percent-clipped=0.0 2023-10-04 16:04:38,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:04:39,839 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 16:04:41,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 16:04:42,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:04:42,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:04:43,927 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 16:04:43,961 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 16:04:46,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1712613.3333333333, ans=0.125 2023-10-04 16:04:48,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 16:04:50,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:04:52,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 16:04:55,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 16:05:03,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 16:05:06,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 16:05:06,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:07,810 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 16:05:07,821 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 16:05:09,265 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 16:05:09,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 16:05:09,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:05:09,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1712746.6666666667, ans=0.125 2023-10-04 16:05:13,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 16:05:16,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:05:21,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:05:21,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 16:05:21,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:05:24,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 16:05:26,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:05:31,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:05:32,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:05:33,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:05:33,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:05:34,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:05:34,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:05:36,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:05:39,001 INFO [train.py:1046] (3/4) Epoch 49, batch 1950, loss[loss=0.1997, simple_loss=0.2688, pruned_loss=0.06528, over 19754.00 frames. ], tot_loss[loss=0.1548, simple_loss=0.2349, pruned_loss=0.03735, over 4700263.11 frames. ], batch size: 388, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:05:39,049 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:05:39,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:05:40,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:05:40,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:05:40,543 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:05:40,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1712880.0, ans=0.1 2023-10-04 16:05:41,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:05:42,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1712880.0, ans=0.125 2023-10-04 16:05:44,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:05:46,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:05:48,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:48,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:05:50,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 16:05:52,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 16:05:52,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:54,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:05:55,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:05:57,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:05:57,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:05:57,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1712946.6666666667, ans=0.0 2023-10-04 16:05:57,845 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.03 vs. limit=22.5 2023-10-04 16:06:00,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:06:02,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:06:02,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:06:02,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:06:02,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:06,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:07,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:06:07,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:07,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:06:07,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 16:06:09,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:06:09,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:06:10,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:06:10,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1713013.3333333333, ans=0.1 2023-10-04 16:06:12,526 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.19 vs. limit=15.0 2023-10-04 16:06:14,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:15,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:06:23,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:06:23,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=1713080.0, ans=0.05 2023-10-04 16:06:25,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:06:25,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:06:26,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 16:06:26,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:06:31,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:06:31,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1713080.0, ans=0.05 2023-10-04 16:06:32,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:06:32,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:06:38,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:40,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:41,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:06:43,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:06:44,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:06:45,876 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:06:47,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 16:06:47,226 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:06:48,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:06:48,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 16:06:51,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:06:53,178 INFO [train.py:1046] (3/4) Epoch 49, batch 2000, loss[loss=0.1739, simple_loss=0.2581, pruned_loss=0.04485, over 24000.00 frames. ], tot_loss[loss=0.155, simple_loss=0.2353, pruned_loss=0.03737, over 4703146.76 frames. ], batch size: 80, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:06:55,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:06:57,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:06:57,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:06:58,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:07:00,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:01,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1713213.3333333333, ans=0.1 2023-10-04 16:07:04,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 16:07:05,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:07:07,144 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.055e+02 2.276e+02 2.688e+02 4.506e+02, threshold=4.553e+02, percent-clipped=2.0 2023-10-04 16:07:08,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:07:08,858 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 16:07:09,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1713280.0, ans=0.125 2023-10-04 16:07:10,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:07:10,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:07:12,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:07:14,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 16:07:15,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:17,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:17,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:19,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 16:07:19,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:07:20,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 16:07:20,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:07:25,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:07:27,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 16:07:27,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:27,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:07:27,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1713346.6666666667, ans=0.0 2023-10-04 16:07:27,663 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.07 vs. limit=22.5 2023-10-04 16:07:28,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:07:29,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 16:07:29,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1713346.6666666667, ans=0.125 2023-10-04 16:07:32,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 16:07:32,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:07:32,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:07:38,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:40,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:07:40,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:07:40,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:07:41,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:07:41,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:43,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:07:43,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:07:44,424 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:07:48,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:07:48,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 16:07:53,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:07:53,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1713480.0, ans=0.0 2023-10-04 16:07:56,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:07:59,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:07:59,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:08:02,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:04,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:08:04,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:06,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:08:06,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:08:08,078 INFO [train.py:1046] (3/4) Epoch 49, batch 2050, loss[loss=0.147, simple_loss=0.2222, pruned_loss=0.03596, over 23558.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2341, pruned_loss=0.03662, over 4707031.57 frames. ], batch size: 134, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:08:08,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:08:09,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:11,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1713546.6666666667, ans=0.125 2023-10-04 16:08:12,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:08:13,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:17,219 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.30 vs. limit=22.5 2023-10-04 16:08:18,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:08:20,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:08:20,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:08:22,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:08:22,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 16:08:22,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:08:22,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=1713613.3333333333, ans=0.0 2023-10-04 16:08:25,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:08:25,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:08:34,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:08:34,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:08:37,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 16:08:39,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:08:40,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 16:08:40,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:08:42,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:08:44,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:08:44,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:08:44,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1713680.0, ans=0.0 2023-10-04 16:08:46,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:08:47,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:08:48,752 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:08:48,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:08:52,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:08:53,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:08:55,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:08:55,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1713746.6666666667, ans=0.125 2023-10-04 16:08:56,004 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.86 vs. limit=22.5 2023-10-04 16:08:56,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:08:57,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1713746.6666666667, ans=0.0 2023-10-04 16:08:59,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1713746.6666666667, ans=0.0 2023-10-04 16:09:00,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:09:04,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1713746.6666666667, ans=0.0 2023-10-04 16:09:06,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:09:07,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 16:09:10,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=1713813.3333333333, ans=0.0 2023-10-04 16:09:13,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:09:13,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:09:16,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:09:18,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 16:09:21,487 INFO [train.py:1046] (3/4) Epoch 49, batch 2100, loss[loss=0.1508, simple_loss=0.2286, pruned_loss=0.03651, over 23351.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2327, pruned_loss=0.03633, over 4698822.92 frames. ], batch size: 119, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:09:21,626 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 16:09:21,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:09:23,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:09:23,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:09:26,806 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:09:26,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 16:09:26,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 16:09:29,472 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:09:30,409 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.54 vs. limit=15.0 2023-10-04 16:09:32,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:09:32,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:09:35,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:09:35,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:09:35,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 16:09:37,105 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.123e+02 2.389e+02 2.758e+02 4.259e+02, threshold=4.778e+02, percent-clipped=0.0 2023-10-04 16:09:37,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:09:37,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 16:09:37,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 16:09:40,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:09:40,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:09:40,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 16:09:40,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 16:09:43,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1713946.6666666667, ans=0.0 2023-10-04 16:09:46,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 16:09:46,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:09:48,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:09:48,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:09:49,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1713946.6666666667, ans=0.0 2023-10-04 16:09:51,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:09:51,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 16:09:53,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:09:53,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 16:09:56,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 16:09:57,760 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:09:57,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 16:09:57,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 16:09:57,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 16:09:59,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:10:02,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:10:05,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:10:06,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:10:06,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:08,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:10:08,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 16:10:08,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:10:10,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:10:10,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:10,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 16:10:11,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 16:10:11,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1714080.0, ans=0.125 2023-10-04 16:10:13,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 16:10:15,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:10:18,536 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:10:18,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 16:10:22,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1714146.6666666667, ans=0.125 2023-10-04 16:10:23,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:10:25,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:10:25,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:10:25,975 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:10:27,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 16:10:27,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:10:31,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:10:31,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:10:32,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:10:32,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:33,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 16:10:34,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 16:10:34,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:10:36,847 INFO [train.py:1046] (3/4) Epoch 49, batch 2150, loss[loss=0.1384, simple_loss=0.2164, pruned_loss=0.03022, over 24319.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2323, pruned_loss=0.03608, over 4704215.01 frames. ], batch size: 56, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:10:36,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:10:36,914 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:10:36,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:10:36,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:10:42,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 16:10:45,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:10:45,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:48,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:10:48,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:10:48,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:10:51,687 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:10:53,102 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:10:53,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:10:53,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:10:57,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:10:58,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 16:11:01,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1714280.0, ans=0.1 2023-10-04 16:11:03,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:11:03,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:11:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:05,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:11:05,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:06,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:11:06,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=1714346.6666666667, ans=0.2 2023-10-04 16:11:07,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:11:07,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:11:07,902 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:11:09,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 16:11:12,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:11:13,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:11:13,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:11:15,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:11:16,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:11:19,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:11:19,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:11:20,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:11:20,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 16:11:20,929 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:11:24,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:11:24,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:25,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:11:26,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:11:28,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:29,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:29,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 16:11:31,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 16:11:32,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:11:32,377 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 16:11:34,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:34,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:11:37,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 16:11:37,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:11:37,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 16:11:37,116 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 16:11:37,116 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 16:11:37,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 16:11:38,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:39,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1714480.0, ans=0.125 2023-10-04 16:11:40,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:11:40,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:11:40,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:41,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:11:43,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:11:43,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:11:44,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1714480.0, ans=0.0 2023-10-04 16:11:50,539 INFO [train.py:1046] (3/4) Epoch 49, batch 2200, loss[loss=0.1593, simple_loss=0.2468, pruned_loss=0.03586, over 24012.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2326, pruned_loss=0.03611, over 4708366.21 frames. ], batch size: 80, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:11:52,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:11:52,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 16:11:56,849 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:11:57,413 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.95 vs. limit=15.0 2023-10-04 16:11:59,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:12:00,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:12:01,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:02,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:12:02,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1714546.6666666667, ans=0.05 2023-10-04 16:12:03,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:12:03,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:12:03,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 16:12:05,036 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.091e+02 2.261e+02 2.558e+02 4.417e+02, threshold=4.522e+02, percent-clipped=0.0 2023-10-04 16:12:10,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 16:12:11,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:12:16,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 16:12:18,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:12:19,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:12:19,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:12:24,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:12:24,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 16:12:28,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:12:31,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:12:31,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 16:12:31,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1714680.0, ans=0.1 2023-10-04 16:12:35,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:12:37,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:12:40,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:12:40,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1714746.6666666667, ans=0.125 2023-10-04 16:12:42,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:44,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 16:12:44,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:12:46,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 16:12:49,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:49,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:12:49,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:12:52,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:12:52,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:12:52,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:12:52,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:12:53,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:12:53,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:12:55,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:12:58,173 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 16:12:58,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:13:00,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:13:01,642 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 16:13:02,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:13:03,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1714813.3333333333, ans=0.0 2023-10-04 16:13:04,310 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 16:13:05,469 INFO [train.py:1046] (3/4) Epoch 49, batch 2250, loss[loss=0.1456, simple_loss=0.2304, pruned_loss=0.03039, over 23199.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2338, pruned_loss=0.03627, over 4710981.62 frames. ], batch size: 105, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:13:05,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:13:05,633 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 16:13:07,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:13:08,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:13:10,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:13:11,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.94 vs. limit=15.0 2023-10-04 16:13:11,905 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 16:13:14,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:13:17,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:13:21,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:13:23,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:13:26,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:13:26,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:13:28,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:13:30,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 16:13:30,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:13:31,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:13:32,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 16:13:32,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:13:33,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=1714946.6666666667, ans=0.0 2023-10-04 16:13:34,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:13:35,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:13:37,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1715013.3333333333, ans=0.0 2023-10-04 16:13:38,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:13:38,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1715013.3333333333, ans=0.125 2023-10-04 16:13:40,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:13:40,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:13:43,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 16:13:45,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:13:48,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:13:54,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:13:54,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:13:55,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:13:55,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:13:57,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1715080.0, ans=0.125 2023-10-04 16:13:58,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:13:58,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:13:58,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1715080.0, ans=0.0 2023-10-04 16:13:59,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1715080.0, ans=0.04949747468305833 2023-10-04 16:14:04,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:14:06,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:14:09,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:14:09,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:14:10,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:14:11,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.87 vs. limit=22.5 2023-10-04 16:14:15,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:14:15,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=1715146.6666666667, ans=0.125 2023-10-04 16:14:18,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:14:18,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 16:14:18,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:14:18,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:14:19,635 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.10 vs. limit=15.0 2023-10-04 16:14:21,494 INFO [train.py:1046] (3/4) Epoch 49, batch 2300, loss[loss=0.1285, simple_loss=0.2077, pruned_loss=0.0247, over 24416.00 frames. ], tot_loss[loss=0.154, simple_loss=0.2349, pruned_loss=0.03657, over 4705449.80 frames. ], batch size: 58, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:14:21,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 16:14:24,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:14:24,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:14:24,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1715213.3333333333, ans=0.125 2023-10-04 16:14:26,740 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.78 vs. limit=15.0 2023-10-04 16:14:28,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1715213.3333333333, ans=0.1 2023-10-04 16:14:29,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:14:30,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:14:30,282 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:14:33,159 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 16:14:34,113 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.97 vs. limit=15.0 2023-10-04 16:14:34,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:14:35,846 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.800e+02 2.150e+02 2.484e+02 2.874e+02 4.816e+02, threshold=4.968e+02, percent-clipped=1.0 2023-10-04 16:14:38,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=1715280.0, ans=0.0 2023-10-04 16:14:41,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:14:41,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 16:14:43,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:14:43,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:14:43,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 16:14:43,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:14:46,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:14:47,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:14:51,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:14:53,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:14:56,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:02,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:15:03,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:15:03,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:15:06,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:15:10,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:15:10,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:15:12,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:15:12,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 16:15:15,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:15:15,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:15:15,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:17,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:15:17,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:15:19,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 16:15:19,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:15:20,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 16:15:20,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:15:20,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:15:21,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 16:15:26,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:15:29,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:15:30,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1715480.0, ans=0.125 2023-10-04 16:15:33,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:15:33,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:15:35,188 INFO [train.py:1046] (3/4) Epoch 49, batch 2350, loss[loss=0.1483, simple_loss=0.2316, pruned_loss=0.03246, over 23525.00 frames. ], tot_loss[loss=0.1541, simple_loss=0.2353, pruned_loss=0.03647, over 4703104.94 frames. ], batch size: 134, lr: 2.07e-03, grad_scale: 8.0 2023-10-04 16:15:35,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:15:36,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:15:36,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:15:36,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:15:38,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 16:15:45,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:15:45,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 16:15:49,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 16:15:53,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:15:54,623 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:54,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:15:54,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:15:54,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:15:56,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 16:15:58,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:16:00,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1715613.3333333333, ans=0.2 2023-10-04 16:16:03,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 16:16:04,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:16:09,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:16:09,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:16:10,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:16:13,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 16:16:13,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:16:15,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:16:15,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:16:15,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:16:20,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:16:23,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 16:16:23,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:16:24,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:16:24,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:16:27,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 16:16:28,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:16:30,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 16:16:31,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:16:36,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 16:16:40,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 16:16:41,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:16:41,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:16:41,874 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 16:16:41,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 16:16:42,568 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.76 vs. limit=22.5 2023-10-04 16:16:45,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 16:16:46,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:16:46,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1715813.3333333333, ans=0.125 2023-10-04 16:16:49,295 INFO [train.py:1046] (3/4) Epoch 49, batch 2400, loss[loss=0.1575, simple_loss=0.238, pruned_loss=0.03849, over 23652.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2348, pruned_loss=0.03632, over 4707833.28 frames. ], batch size: 135, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:16:52,118 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:16:55,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:16:56,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:16:57,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 16:16:58,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 16:17:03,995 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.122e+02 2.325e+02 2.661e+02 3.983e+02, threshold=4.649e+02, percent-clipped=0.0 2023-10-04 16:17:04,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1715946.6666666667, ans=0.125 2023-10-04 16:17:05,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:17:05,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:17:08,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 16:17:08,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:17:08,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1715946.6666666667, ans=0.2 2023-10-04 16:17:08,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1715946.6666666667, ans=0.0 2023-10-04 16:17:09,540 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:17:09,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 16:17:13,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:17:15,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 16:17:20,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1716013.3333333333, ans=0.0 2023-10-04 16:17:20,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=1716013.3333333333, ans=0.025 2023-10-04 16:17:21,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:17:24,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 16:17:27,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:17:27,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:17:31,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:17:33,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 16:17:33,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:17:40,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:17:41,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:17:45,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:17:45,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:17:45,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 16:17:46,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:17:46,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:17:46,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:17:46,570 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:17:52,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:17:52,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:17:53,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 16:17:55,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 16:17:55,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:17:57,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:17:57,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 16:17:58,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 16:17:58,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 16:17:58,517 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 16:17:58,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 16:17:59,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:18:00,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1716146.6666666667, ans=0.125 2023-10-04 16:18:01,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:18:01,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:18:02,549 INFO [train.py:1046] (3/4) Epoch 49, batch 2450, loss[loss=0.1427, simple_loss=0.2219, pruned_loss=0.03174, over 24335.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2328, pruned_loss=0.03619, over 4704524.00 frames. ], batch size: 56, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:18:02,662 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 16:18:02,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:18:04,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:18:07,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:18:07,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:18:11,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:11,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:18:12,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 16:18:19,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:18:19,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:22,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:18:22,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:18:22,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:18:23,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 16:18:26,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:28,383 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:18:29,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:18:31,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1716346.6666666667, ans=0.125 2023-10-04 16:18:33,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:18:33,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:18:34,389 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.19 vs. limit=12.0 2023-10-04 16:18:35,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:18:35,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:18:37,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 16:18:38,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:18:47,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:18:47,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:18:47,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1716413.3333333333, ans=0.035 2023-10-04 16:18:49,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:18:50,513 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.06 vs. limit=15.0 2023-10-04 16:18:50,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:18:50,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:18:50,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:18:51,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 16:18:53,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:18:55,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:18:58,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:18:58,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:19:02,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:19:02,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 16:19:02,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:19:03,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1716480.0, ans=0.125 2023-10-04 16:19:04,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:19:04,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 16:19:04,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:19:06,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:19:08,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:19:11,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:19:11,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:19:16,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 16:19:17,559 INFO [train.py:1046] (3/4) Epoch 49, batch 2500, loss[loss=0.1557, simple_loss=0.2436, pruned_loss=0.03388, over 23980.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2313, pruned_loss=0.03588, over 4695052.26 frames. ], batch size: 80, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:19:17,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:19:23,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:19:32,425 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.204e+02 2.495e+02 3.004e+02 4.519e+02, threshold=4.991e+02, percent-clipped=0.0 2023-10-04 16:19:32,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:19:32,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:19:33,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:19:33,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 16:19:40,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1716613.3333333333, ans=0.2 2023-10-04 16:19:41,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:19:41,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:19:43,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 16:19:43,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:19:44,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 16:19:46,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:19:46,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:19:47,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 16:19:47,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:19:47,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 16:19:47,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:19:52,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:19:53,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:19:56,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:19:56,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 16:19:57,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:19:58,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1716680.0, ans=0.05 2023-10-04 16:19:59,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:20:02,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:06,736 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:10,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:20:14,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:20:17,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 16:20:17,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:20:17,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:20:20,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:20:20,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:20:21,820 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 16:20:21,820 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 16:20:21,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 16:20:23,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1716813.3333333333, ans=0.125 2023-10-04 16:20:26,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:20:27,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 16:20:27,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 16:20:27,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:20:29,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 16:20:31,904 INFO [train.py:1046] (3/4) Epoch 49, batch 2550, loss[loss=0.1354, simple_loss=0.2184, pruned_loss=0.02625, over 24571.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2326, pruned_loss=0.03579, over 4700590.72 frames. ], batch size: 60, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:20:32,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 16:20:36,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:20:36,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:20:38,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:20:38,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:20:39,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 16:20:41,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:20:43,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 16:20:45,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:20:48,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:50,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:20:51,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 16:20:51,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:20:51,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:20:52,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:20:54,576 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:20:54,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 16:20:55,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:20:55,922 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:20:55,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 16:21:08,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:21:13,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:21:13,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:21:13,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:21:14,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:21:17,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1717080.0, ans=0.1 2023-10-04 16:21:21,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:21:22,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1717080.0, ans=0.125 2023-10-04 16:21:23,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=1717080.0, ans=22.5 2023-10-04 16:21:24,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:21:24,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:21:24,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:21:24,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:21:24,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:21:28,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:21:28,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:21:32,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:21:32,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 16:21:32,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:21:32,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:21:33,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1717146.6666666667, ans=0.125 2023-10-04 16:21:34,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:21:34,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1717146.6666666667, ans=0.07 2023-10-04 16:21:37,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:21:37,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:21:43,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:21:45,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:21:46,535 INFO [train.py:1046] (3/4) Epoch 49, batch 2600, loss[loss=0.1804, simple_loss=0.2514, pruned_loss=0.05471, over 19208.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2334, pruned_loss=0.03622, over 4693400.18 frames. ], batch size: 388, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:21:47,851 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 16:21:49,355 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 16:21:49,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:21:51,023 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 16:21:51,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 16:21:51,113 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 16:21:53,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:21:53,889 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 16:21:56,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 16:21:56,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1717213.3333333333, ans=0.1 2023-10-04 16:21:57,960 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 16:21:59,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:22:00,754 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.147e+02 2.547e+02 3.024e+02 6.453e+02, threshold=5.093e+02, percent-clipped=2.0 2023-10-04 16:22:00,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 16:22:02,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 16:22:03,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:22:03,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 16:22:06,367 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 16:22:06,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 16:22:10,405 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.77 vs. limit=15.0 2023-10-04 16:22:12,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:22:12,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:22:12,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:22:12,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 16:22:16,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:22:21,697 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 16:22:28,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1717346.6666666667, ans=0.1 2023-10-04 16:22:29,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:22:29,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:22:29,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 16:22:30,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:22:30,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:22:32,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 16:22:34,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:22:35,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:22:36,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:22:38,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff2.min_abs, batch_count=1717413.3333333333, ans=0.1 2023-10-04 16:22:41,050 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 16:22:41,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:22:41,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:22:48,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:22:48,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:22:48,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 16:22:48,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:22:49,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1717480.0, ans=0.0 2023-10-04 16:22:51,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:22:51,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:22:56,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 16:22:56,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:22:59,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:23:00,773 INFO [train.py:1046] (3/4) Epoch 49, batch 2650, loss[loss=0.1441, simple_loss=0.2347, pruned_loss=0.0267, over 24642.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2341, pruned_loss=0.03626, over 4702208.89 frames. ], batch size: 68, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:23:02,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 16:23:02,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:23:03,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:23:04,894 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 16:23:04,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:23:06,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:23:09,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:23:10,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:23:12,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:23:14,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 16:23:14,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:23:14,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:23:18,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 16:23:20,353 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 16:23:23,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:23:23,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.08 vs. limit=22.5 2023-10-04 16:23:25,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 16:23:25,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:23:27,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 16:23:30,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:23:30,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:23:31,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:23:32,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:23:36,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1717680.0, ans=0.125 2023-10-04 16:23:37,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 16:23:37,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 16:23:37,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1717680.0, ans=0.0 2023-10-04 16:23:40,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:23:42,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 16:23:42,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:23:42,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1717746.6666666667, ans=0.0 2023-10-04 16:23:42,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1717746.6666666667, ans=0.125 2023-10-04 16:23:44,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:23:44,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:23:44,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:23:44,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:23:46,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=1717746.6666666667, ans=0.0 2023-10-04 16:23:47,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:23:48,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:23:48,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:23:50,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:23:50,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:23:52,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1717746.6666666667, ans=0.125 2023-10-04 16:23:53,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:23:53,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:23:55,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:23:56,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:23:56,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:23:59,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:01,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:24:01,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:24:01,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 16:24:05,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:24:06,807 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:08,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:09,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:09,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:24:10,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:13,976 INFO [train.py:1046] (3/4) Epoch 49, batch 2700, loss[loss=0.1392, simple_loss=0.2209, pruned_loss=0.02874, over 24354.00 frames. ], tot_loss[loss=0.1536, simple_loss=0.2346, pruned_loss=0.03632, over 4718275.03 frames. ], batch size: 61, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:24:14,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:24:14,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 16:24:15,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:24:16,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 16:24:20,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:24:20,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:20,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:20,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1717880.0, ans=0.125 2023-10-04 16:24:20,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=1717880.0, ans=0.2 2023-10-04 16:24:22,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:24:22,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:24:22,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:24:23,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 16:24:23,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 16:24:24,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:24:25,240 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.85 vs. limit=22.5 2023-10-04 16:24:27,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:24:28,700 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.707e+02 2.037e+02 2.270e+02 2.571e+02 4.005e+02, threshold=4.540e+02, percent-clipped=0.0 2023-10-04 16:24:28,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:24:30,638 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:24:33,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:24:34,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 16:24:36,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:24:37,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1717946.6666666667, ans=0.125 2023-10-04 16:24:40,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:24:40,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:24:43,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1718013.3333333333, ans=0.0 2023-10-04 16:24:45,565 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.44 vs. limit=22.5 2023-10-04 16:24:46,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:24:46,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:24:46,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:24:46,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:24:49,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:24:49,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1718013.3333333333, ans=0.1 2023-10-04 16:24:51,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:24:51,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:24:51,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:24:54,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1718013.3333333333, ans=0.1 2023-10-04 16:24:57,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:24:57,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:24:57,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1718080.0, ans=0.125 2023-10-04 16:25:03,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=1718080.0, ans=0.125 2023-10-04 16:25:04,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:25:05,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:25:09,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:25:09,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:12,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:25:13,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:25:14,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1718146.6666666667, ans=0.125 2023-10-04 16:25:15,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:25:15,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:16,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:25:16,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:25:20,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:25:20,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1718146.6666666667, ans=0.09899494936611666 2023-10-04 16:25:21,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:25:21,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:25:24,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 16:25:24,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:26,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:25:26,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 16:25:27,706 INFO [train.py:1046] (3/4) Epoch 49, batch 2750, loss[loss=0.155, simple_loss=0.2478, pruned_loss=0.03115, over 24599.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2341, pruned_loss=0.03595, over 4720522.28 frames. ], batch size: 71, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:25:27,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 16:25:27,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:29,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1718213.3333333333, ans=0.1 2023-10-04 16:25:31,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:25:32,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:25:35,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:36,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:25:36,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:39,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:25:39,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:25:40,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:25:40,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:40,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 16:25:40,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:25:40,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:25:46,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 16:25:47,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:25:47,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:49,374 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:25:49,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:25:49,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:25:51,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:25:53,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:25:53,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:25:55,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:25:56,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:25:56,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:25:57,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:25:58,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:26:05,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:26:08,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:26:08,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:11,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:26:11,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:26:11,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:26:18,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:26:18,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:26:18,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 16:26:23,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:24,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 16:26:29,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 16:26:32,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:26:32,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 16:26:33,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:26:35,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:26:35,171 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 16:26:36,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:26:39,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 16:26:39,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:26:40,619 INFO [train.py:1046] (3/4) Epoch 49, batch 2800, loss[loss=0.1556, simple_loss=0.2437, pruned_loss=0.03377, over 24678.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2334, pruned_loss=0.03557, over 4720855.06 frames. ], batch size: 65, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:26:40,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:26:40,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 16:26:41,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:26:41,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:44,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:26:45,881 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 16:26:45,882 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 16:26:47,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:26:51,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:26:52,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:26:55,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:26:57,062 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.030e+02 2.375e+02 2.876e+02 4.786e+02, threshold=4.750e+02, percent-clipped=2.0 2023-10-04 16:26:58,619 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 16:27:00,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 16:27:00,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 16:27:01,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:27:03,568 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:27:03,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:27:07,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:27:07,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:27:08,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:27:09,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:27:15,506 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.65 vs. limit=22.5 2023-10-04 16:27:15,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:27:15,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:27:19,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:27:19,995 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.73 vs. limit=15.0 2023-10-04 16:27:20,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:27:20,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:27:26,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:27:26,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 16:27:27,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:27:28,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:27:28,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:27:31,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:27:32,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:27:37,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:27:38,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:27:38,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:27:38,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:27:40,015 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:27:40,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:27:41,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:27:42,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 16:27:42,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:27:44,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:27:44,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:27:45,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 16:27:46,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:27:46,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:27:48,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:27:49,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 16:27:51,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1718813.3333333333, ans=0.04949747468305833 2023-10-04 16:27:52,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:27:52,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:27:54,848 INFO [train.py:1046] (3/4) Epoch 49, batch 2850, loss[loss=0.1455, simple_loss=0.2383, pruned_loss=0.02636, over 24298.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2322, pruned_loss=0.03529, over 4723309.14 frames. ], batch size: 74, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:27:54,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:27:55,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1718880.0, ans=0.125 2023-10-04 16:27:58,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:00,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:28:01,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:28:02,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:28:04,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:28:04,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:28:06,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:28:07,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 16:28:13,942 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 16:28:13,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:15,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 16:28:16,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:19,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 16:28:19,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 16:28:20,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:32,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:28:33,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:28:34,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:28:35,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:28:35,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:28:35,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:28:37,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=1719013.3333333333, ans=0.04949747468305833 2023-10-04 16:28:37,610 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.51 vs. limit=15.0 2023-10-04 16:28:38,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:28:39,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 16:28:40,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.86 vs. limit=15.0 2023-10-04 16:28:40,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:28:40,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:28:40,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:28:42,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:45,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:28:45,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:28:46,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:48,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:28:49,444 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:28:49,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:28:51,546 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.59 vs. limit=22.5 2023-10-04 16:28:52,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:28:54,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:28:58,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:29:01,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 16:29:01,981 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 16:29:02,244 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:29:03,482 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:29:04,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:29:05,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 16:29:05,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:29:06,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:29:06,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:29:06,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:29:06,687 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 16:29:06,731 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 16:29:06,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:29:08,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:29:09,291 INFO [train.py:1046] (3/4) Epoch 49, batch 2900, loss[loss=0.1408, simple_loss=0.2245, pruned_loss=0.0286, over 23699.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2322, pruned_loss=0.03545, over 4715366.45 frames. ], batch size: 135, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:29:12,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:29:12,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:29:13,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:29:14,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 16:29:15,661 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.17 vs. limit=15.0 2023-10-04 16:29:17,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:29:17,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 16:29:19,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 16:29:21,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:29:21,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:29:23,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:29:24,588 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.064e+02 2.211e+02 2.560e+02 4.990e+02, threshold=4.422e+02, percent-clipped=1.0 2023-10-04 16:29:24,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:29:28,527 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:29:28,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:29:31,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:29:31,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 16:29:33,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:29:33,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:29:36,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 16:29:38,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 16:29:40,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:29:40,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 16:29:40,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:29:43,405 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:29:43,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 16:29:46,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:29:47,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:29:51,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:29:54,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:29:55,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 16:29:57,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 16:29:57,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:29:58,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=1719413.3333333333, ans=0.09899494936611666 2023-10-04 16:30:00,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:30:02,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 16:30:03,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:30:10,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:30:15,174 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.71 vs. limit=22.5 2023-10-04 16:30:16,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:30:16,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:30:18,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 16:30:21,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:21,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 16:30:22,474 INFO [train.py:1046] (3/4) Epoch 49, batch 2950, loss[loss=0.1631, simple_loss=0.2372, pruned_loss=0.04451, over 23744.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2325, pruned_loss=0.03542, over 4719346.82 frames. ], batch size: 164, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:30:22,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:30:23,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:30:28,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:30:32,192 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 16:30:33,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:30:33,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:33,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:30:34,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:30:35,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1719546.6666666667, ans=0.0 2023-10-04 16:30:36,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 16:30:38,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 16:30:38,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:30:38,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:30:44,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:30:46,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:30:48,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:30:48,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:30:51,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:30:51,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:30:52,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:52,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1719680.0, ans=0.125 2023-10-04 16:30:54,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:30:54,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:30:55,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 16:31:00,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1719680.0, ans=0.125 2023-10-04 16:31:00,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1719680.0, ans=0.125 2023-10-04 16:31:01,579 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 16:31:02,859 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 16:31:02,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:31:04,721 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 16:31:06,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 16:31:06,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:31:07,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:31:07,989 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 16:31:07,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:31:09,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 16:31:10,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:31:10,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:31:12,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:31:14,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:31:14,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:31:15,583 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 16:31:15,631 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:31:15,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 16:31:20,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1719746.6666666667, ans=0.0 2023-10-04 16:31:21,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:31:22,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:31:22,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 16:31:22,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:31:24,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 16:31:25,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1719813.3333333333, ans=0.2 2023-10-04 16:31:26,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:31:28,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:31:30,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:31:30,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1719813.3333333333, ans=0.125 2023-10-04 16:31:31,443 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:31:31,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 16:31:32,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:31:32,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:31:32,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:31:34,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:31:36,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:31:36,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:31:37,774 INFO [train.py:1046] (3/4) Epoch 49, batch 3000, loss[loss=0.197, simple_loss=0.2677, pruned_loss=0.0632, over 19120.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2336, pruned_loss=0.03579, over 4699340.43 frames. ], batch size: 388, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:31:37,774 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 16:31:49,875 INFO [train.py:1078] (3/4) Epoch 49, validation: loss=0.3542, simple_loss=0.2825, pruned_loss=0.2129, over 1125622.00 frames. 2023-10-04 16:31:49,875 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 16:31:50,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:31:50,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 16:31:51,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:31:54,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:31:54,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:31:56,884 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.09 vs. limit=15.0 2023-10-04 16:31:57,315 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 16:31:58,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 16:32:00,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:32:00,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:32:02,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 16:32:02,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:32:06,781 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.107e+02 2.317e+02 2.666e+02 4.231e+02, threshold=4.633e+02, percent-clipped=0.0 2023-10-04 16:32:08,244 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:32:14,830 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.40 vs. limit=10.0 2023-10-04 16:32:17,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:32:22,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 16:32:26,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:32:27,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:32:28,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:32:28,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:32:31,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:32:31,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 16:32:34,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 16:32:35,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:32:35,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:32:36,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1720080.0, ans=0.0 2023-10-04 16:32:39,299 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:32:39,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:32:39,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1720080.0, ans=0.1 2023-10-04 16:32:41,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:32:41,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:32:43,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:32:45,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:32:45,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:32:45,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:32:46,872 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 16:32:48,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:32:49,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:32:49,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:32:53,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:32:53,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:32:56,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 16:32:56,264 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 16:32:56,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:32:56,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 16:32:58,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:33:00,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 16:33:02,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:33:03,782 INFO [train.py:1046] (3/4) Epoch 49, batch 3050, loss[loss=0.1543, simple_loss=0.2404, pruned_loss=0.03407, over 24516.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2342, pruned_loss=0.03585, over 4694207.02 frames. ], batch size: 63, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:33:03,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:33:03,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 16:33:03,953 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 16:33:03,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:33:05,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:33:05,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:33:05,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:33:07,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:07,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:33:09,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1720213.3333333333, ans=0.0 2023-10-04 16:33:10,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 16:33:11,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1720213.3333333333, ans=0.125 2023-10-04 16:33:13,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:33:15,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.84 vs. limit=5.0 2023-10-04 16:33:15,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:33:16,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:33:16,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1720213.3333333333, ans=0.1 2023-10-04 16:33:20,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:22,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 16:33:29,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 16:33:29,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 16:33:29,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:33:32,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:33:32,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1720346.6666666667, ans=0.125 2023-10-04 16:33:34,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:34,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:33:35,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:33:38,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:33:38,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:33:38,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:33:38,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:33:38,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:33:38,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1720346.6666666667, ans=0.125 2023-10-04 16:33:41,078 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:41,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:33:43,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:33:43,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 16:33:45,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:33:45,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:33:48,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:33:49,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:33:50,344 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:33:50,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:33:52,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1720413.3333333333, ans=0.04949747468305833 2023-10-04 16:33:54,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:33:55,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:33:56,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1720413.3333333333, ans=0.125 2023-10-04 16:34:00,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:00,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1720413.3333333333, ans=0.07 2023-10-04 16:34:01,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:34:01,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:34:05,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:34:05,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:34:05,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:34:06,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 16:34:08,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:34:09,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:11,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 16:34:13,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:34:17,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:34:18,654 INFO [train.py:1046] (3/4) Epoch 49, batch 3100, loss[loss=0.1371, simple_loss=0.2199, pruned_loss=0.02714, over 24610.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2343, pruned_loss=0.03606, over 4679344.34 frames. ], batch size: 60, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:34:18,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:34:21,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 16:34:22,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 16:34:25,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 16:34:25,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 16:34:26,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:34:30,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:34:30,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:33,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 16:34:34,930 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.668e+02 2.156e+02 2.470e+02 2.977e+02 4.757e+02, threshold=4.941e+02, percent-clipped=2.0 2023-10-04 16:34:36,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:41,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 16:34:45,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 16:34:46,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:34:46,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:34:46,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:34:46,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 16:34:48,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:34:48,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 16:34:48,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:34:50,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:34:52,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 16:34:54,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:34:56,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=1720680.0, ans=0.125 2023-10-04 16:34:58,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:34:58,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 16:34:59,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 16:34:59,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=1720680.0, ans=0.125 2023-10-04 16:35:01,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:01,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:35:02,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:02,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:02,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:35:03,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:35:03,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:35:09,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:35:09,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:35:09,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:09,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 16:35:13,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:35:14,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 16:35:17,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:35:18,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 16:35:18,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:19,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:20,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 16:35:30,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 16:35:31,992 INFO [train.py:1046] (3/4) Epoch 49, batch 3150, loss[loss=0.1388, simple_loss=0.2229, pruned_loss=0.02739, over 21639.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2326, pruned_loss=0.03576, over 4676459.00 frames. ], batch size: 47, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:35:32,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:33,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:35:33,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:35:33,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:35:34,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 16:35:36,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:37,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 16:35:39,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 16:35:41,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:42,879 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 16:35:45,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 16:35:45,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:35:45,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1720946.6666666667, ans=0.07 2023-10-04 16:35:47,036 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 16:35:48,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 16:35:49,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 16:35:49,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 16:35:49,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 16:35:50,629 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.33 vs. limit=15.0 2023-10-04 16:35:51,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:51,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:35:51,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:35:51,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1720946.6666666667, ans=0.125 2023-10-04 16:35:53,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 16:35:55,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:55,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:35:55,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:35:57,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 16:36:01,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 16:36:01,435 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:36:02,772 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:36:02,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:36:04,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 16:36:04,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1721013.3333333333, ans=0.1 2023-10-04 16:36:07,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 16:36:09,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:36:09,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 16:36:09,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 16:36:09,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1721013.3333333333, ans=0.125 2023-10-04 16:36:10,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:36:10,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:36:12,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:36:12,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:36:13,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 16:36:15,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:36:15,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:16,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:36:16,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:36:17,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 16:36:19,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:36:20,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 16:36:20,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:21,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 16:36:23,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 16:36:26,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:36:26,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:36:27,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 16:36:28,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 16:36:30,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:36:32,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:36:33,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:33,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:36:39,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:36:40,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:42,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 16:36:45,688 INFO [train.py:1046] (3/4) Epoch 49, batch 3200, loss[loss=0.1585, simple_loss=0.2326, pruned_loss=0.04223, over 23807.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2315, pruned_loss=0.03554, over 4683051.83 frames. ], batch size: 195, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:36:48,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:36:48,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 16:36:51,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:36:51,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:36:51,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 16:36:52,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:36:56,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:37:00,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:37:01,265 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.38 vs. limit=10.0 2023-10-04 16:37:01,622 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.995e+02 2.234e+02 2.632e+02 4.209e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-04 16:37:07,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:37:16,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1721346.6666666667, ans=0.125 2023-10-04 16:37:18,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 16:37:18,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:37:24,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 16:37:24,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:37:28,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:37:28,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:37:29,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:37:33,132 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 16:37:34,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 16:37:35,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 16:37:38,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 16:37:41,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:37:42,186 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:37:43,938 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=3.62 vs. limit=10.0 2023-10-04 16:37:46,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:37:48,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 16:37:48,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:37:48,100 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 16:37:48,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:37:51,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:37:52,448 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 16:37:52,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 16:37:53,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 16:37:54,868 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.02 vs. limit=8.0 2023-10-04 16:37:56,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 16:37:57,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:37:59,225 INFO [train.py:1046] (3/4) Epoch 49, batch 3250, loss[loss=0.1462, simple_loss=0.2273, pruned_loss=0.03252, over 24448.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2313, pruned_loss=0.03538, over 4676939.46 frames. ], batch size: 58, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:37:59,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:37:59,366 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 16:37:59,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:37:59,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:01,392 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 16:38:04,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:38:07,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:38:09,614 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.92 vs. limit=22.5 2023-10-04 16:38:12,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:38:12,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 16:38:14,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:38:14,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:38:14,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:38:15,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:38:16,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:38:18,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:19,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:38:19,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:38:19,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:19,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:19,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:38:21,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:22,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:38:25,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:38:25,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:38:25,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1721613.3333333333, ans=0.125 2023-10-04 16:38:26,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:38:26,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1721613.3333333333, ans=0.0 2023-10-04 16:38:28,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:38:28,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:38:33,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 16:38:35,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:38:35,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:38:35,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:38:36,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1721680.0, ans=0.0 2023-10-04 16:38:37,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:38:41,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:38:41,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1721680.0, ans=0.125 2023-10-04 16:38:50,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:38:51,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:51,241 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 16:38:51,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:38:51,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 16:38:52,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:38:53,458 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.85 vs. limit=22.5 2023-10-04 16:38:54,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 16:38:55,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 16:38:55,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:38:56,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:38:58,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:38:59,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 16:38:59,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:39:04,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:39:04,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:39:05,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 16:39:05,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:39:08,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:39:08,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 16:39:12,836 INFO [train.py:1046] (3/4) Epoch 49, batch 3300, loss[loss=0.1406, simple_loss=0.2145, pruned_loss=0.03338, over 21963.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2323, pruned_loss=0.03567, over 4668919.51 frames. ], batch size: 48, lr: 2.07e-03, grad_scale: 32.0 2023-10-04 16:39:12,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:39:12,918 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 16:39:15,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 16:39:15,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 16:39:17,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:39:19,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:39:21,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:39:21,065 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:22,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 16:39:23,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:39:26,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:39:27,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:39:29,111 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.089e+02 2.371e+02 2.866e+02 4.389e+02, threshold=4.743e+02, percent-clipped=0.0 2023-10-04 16:39:30,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 16:39:30,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:39:30,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:39:30,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=1721946.6666666667, ans=0.125 2023-10-04 16:39:32,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:34,032 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 16:39:35,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:39:35,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:39:36,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:39:36,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:39:36,698 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 16:39:41,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:39:41,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:39:42,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:42,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 16:39:44,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 16:39:44,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:39:45,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:39:47,612 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 16:39:48,202 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.12 vs. limit=15.0 2023-10-04 16:39:50,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 16:39:50,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:39:53,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 16:39:56,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:39:57,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 16:39:59,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:39:59,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:00,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:40:00,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:40:00,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:40:00,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1722080.0, ans=0.125 2023-10-04 16:40:02,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:40:02,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:40:04,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:40:05,377 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 16:40:05,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 16:40:08,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:40:08,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:40:08,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:40:11,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:40:11,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:40:12,280 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.96 vs. limit=15.0 2023-10-04 16:40:13,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:40:14,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:14,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:40:15,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:40:17,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:40:22,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 16:40:22,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:22,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:25,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:40:25,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:40:26,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:27,997 INFO [train.py:1046] (3/4) Epoch 49, batch 3350, loss[loss=0.1735, simple_loss=0.2473, pruned_loss=0.04981, over 22772.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2332, pruned_loss=0.03594, over 4681445.90 frames. ], batch size: 322, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:40:28,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:40:28,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:32,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:40:33,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:33,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:40:36,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:38,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:40:39,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:41,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:40:42,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 16:40:44,366 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 16:40:44,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:40:47,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 16:40:47,206 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 16:40:47,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:40:47,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:40:49,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:40:50,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 16:40:50,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:50,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:40:53,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:56,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:40:56,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:40:56,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:40:59,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1722346.6666666667, ans=0.0 2023-10-04 16:40:59,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1722346.6666666667, ans=0.125 2023-10-04 16:41:00,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:03,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:41:04,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:07,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:41:09,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:41:10,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:41:10,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:13,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:16,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 16:41:16,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:41:16,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 16:41:16,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:41:16,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=1722413.3333333333, ans=0.05 2023-10-04 16:41:17,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 16:41:19,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:20,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:41:26,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:26,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 16:41:26,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1722480.0, ans=0.125 2023-10-04 16:41:28,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:41:29,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:41:30,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:41:35,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:41:35,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 16:41:36,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:41:36,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:41:38,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:41:40,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 16:41:40,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:41:40,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 16:41:41,487 INFO [train.py:1046] (3/4) Epoch 49, batch 3400, loss[loss=0.1596, simple_loss=0.2344, pruned_loss=0.04236, over 23773.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2341, pruned_loss=0.03612, over 4693452.38 frames. ], batch size: 212, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:41:41,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:41:41,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:41:42,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 16:41:44,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:41:44,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 16:41:46,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=1722546.6666666667, ans=0.0 2023-10-04 16:41:49,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 16:41:49,682 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 16:41:49,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:41:53,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:41:53,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:41:55,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:41:57,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:41:59,747 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.147e+02 2.480e+02 2.893e+02 4.349e+02, threshold=4.960e+02, percent-clipped=0.0 2023-10-04 16:42:00,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1722613.3333333333, ans=0.125 2023-10-04 16:42:02,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:42:02,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 16:42:08,542 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:42:11,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:42:11,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:42:12,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 16:42:17,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:42:22,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 16:42:26,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:42:28,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:42:28,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 16:42:28,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:42:28,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:42:29,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:42:31,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:42:34,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:42:38,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:42:38,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:42:40,192 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 16:42:42,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:42:44,773 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 16:42:48,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=1722813.3333333333, ans=0.2 2023-10-04 16:42:49,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:42:55,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 16:42:55,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1722880.0, ans=0.125 2023-10-04 16:42:56,866 INFO [train.py:1046] (3/4) Epoch 49, batch 3450, loss[loss=0.1633, simple_loss=0.2542, pruned_loss=0.03622, over 24553.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2332, pruned_loss=0.03617, over 4686456.86 frames. ], batch size: 71, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:42:58,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 16:42:58,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:42:59,765 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:43:01,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 16:43:01,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:43:02,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.35 vs. limit=15.0 2023-10-04 16:43:04,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:43:08,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:43:09,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:43:11,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:43:11,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:43:13,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:43:13,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1722946.6666666667, ans=0.1 2023-10-04 16:43:19,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 16:43:26,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 16:43:26,589 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 16:43:26,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:43:28,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:43:28,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1723013.3333333333, ans=0.125 2023-10-04 16:43:35,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 16:43:35,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:43:35,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=1723013.3333333333, ans=0.025 2023-10-04 16:43:39,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:43:39,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:43:42,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 16:43:43,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:43:45,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 16:43:45,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:43:47,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:43:51,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:43:51,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1723080.0, ans=0.0 2023-10-04 16:43:52,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 16:43:57,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:44:00,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:44:01,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:02,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:44:07,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:07,605 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:44:08,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:44:10,146 INFO [train.py:1046] (3/4) Epoch 49, batch 3500, loss[loss=0.1533, simple_loss=0.2392, pruned_loss=0.0337, over 24492.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2323, pruned_loss=0.03559, over 4698824.45 frames. ], batch size: 66, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:44:10,191 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:44:10,962 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.71 vs. limit=12.0 2023-10-04 16:44:12,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:44:13,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=1723213.3333333333, ans=10.0 2023-10-04 16:44:14,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1723213.3333333333, ans=0.05 2023-10-04 16:44:15,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1723213.3333333333, ans=0.0 2023-10-04 16:44:17,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:44:17,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 16:44:18,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 16:44:22,743 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 16:44:25,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:44:25,518 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 16:44:28,204 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.140e+02 2.425e+02 2.856e+02 5.490e+02, threshold=4.850e+02, percent-clipped=2.0 2023-10-04 16:44:28,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:44:29,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:44:31,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:44:31,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:44:31,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:44:31,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1723280.0, ans=0.125 2023-10-04 16:44:32,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:33,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:44:33,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 16:44:37,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:37,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:44:38,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:44:42,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:42,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 16:44:44,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:44:47,356 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:44:47,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:44:48,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:50,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:44:50,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:44:53,870 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 16:44:55,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 16:44:55,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 16:44:55,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:44:56,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:44:56,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:44:56,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 16:45:00,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:45:01,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:45:01,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1723413.3333333333, ans=0.1 2023-10-04 16:45:06,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:45:08,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 16:45:08,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 16:45:08,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:45:11,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:45:11,250 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:45:12,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:45:13,226 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.63 vs. limit=22.5 2023-10-04 16:45:14,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 16:45:15,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:45:17,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:45:17,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 16:45:19,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 16:45:22,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:45:23,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:45:23,726 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:23,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:25,087 INFO [train.py:1046] (3/4) Epoch 49, batch 3550, loss[loss=0.1505, simple_loss=0.2358, pruned_loss=0.03254, over 24520.00 frames. ], tot_loss[loss=0.1509, simple_loss=0.231, pruned_loss=0.03544, over 4693289.29 frames. ], batch size: 66, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:45:25,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1723546.6666666667, ans=0.0 2023-10-04 16:45:25,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1723546.6666666667, ans=0.125 2023-10-04 16:45:26,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:45:35,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:37,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 16:45:40,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:45:41,334 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:45:42,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:44,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:45:44,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:45:47,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:45:47,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:45:48,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:48,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:45:48,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:45:53,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:45:53,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:45:54,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1723680.0, ans=0.09899494936611666 2023-10-04 16:45:55,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:45:55,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:45:56,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:45:58,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 16:45:58,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:58,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:45:59,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 16:45:59,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=1723680.0, ans=0.125 2023-10-04 16:46:03,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:05,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:46:05,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:07,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 16:46:07,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:46:09,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 16:46:09,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:46:12,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:46:12,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:46:15,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 16:46:15,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:46:17,831 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.21 vs. limit=6.0 2023-10-04 16:46:23,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:46:23,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 16:46:24,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:46:29,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:46:30,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 16:46:33,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1723813.3333333333, ans=0.125 2023-10-04 16:46:37,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 16:46:37,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:46:37,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:46:38,827 INFO [train.py:1046] (3/4) Epoch 49, batch 3600, loss[loss=0.1455, simple_loss=0.2206, pruned_loss=0.03523, over 23716.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2316, pruned_loss=0.03523, over 4708905.58 frames. ], batch size: 232, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:46:38,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:46:40,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:46:40,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:46:44,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:46:45,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:47,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:46:48,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:46:48,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:48,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 16:46:53,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:46:54,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:46:57,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:46:59,091 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.052e+02 2.311e+02 2.640e+02 4.130e+02, threshold=4.623e+02, percent-clipped=0.0 2023-10-04 16:47:00,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:47:01,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:47:03,253 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:47:04,596 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 16:47:06,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:47:07,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:47:09,008 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:47:10,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:11,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:47:13,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:47:14,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 16:47:19,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1724013.3333333333, ans=0.2 2023-10-04 16:47:20,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:47:22,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:47:23,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 16:47:28,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:47:32,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:34,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:40,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:47:40,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:47:40,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 16:47:42,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 16:47:44,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 16:47:45,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:47:45,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:47:45,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1724146.6666666667, ans=0.1 2023-10-04 16:47:47,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 16:47:47,546 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:47:48,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:47:48,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:47:50,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 16:47:50,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 16:47:50,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1724146.6666666667, ans=0.125 2023-10-04 16:47:53,503 INFO [train.py:1046] (3/4) Epoch 49, batch 3650, loss[loss=0.1511, simple_loss=0.2277, pruned_loss=0.03723, over 23703.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2321, pruned_loss=0.03553, over 4698303.97 frames. ], batch size: 164, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:47:53,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:47:53,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 16:47:59,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 16:47:59,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1724213.3333333333, ans=0.125 2023-10-04 16:48:01,063 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:48:02,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1724213.3333333333, ans=0.0 2023-10-04 16:48:02,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1724213.3333333333, ans=0.125 2023-10-04 16:48:03,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 16:48:05,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 16:48:08,238 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:48:08,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 16:48:08,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:48:11,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 16:48:11,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:48:13,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 16:48:13,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 16:48:13,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:48:15,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 16:48:15,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:48:15,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:48:15,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:48:18,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:48:20,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 16:48:22,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 16:48:22,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:48:23,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 16:48:26,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:48:26,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:48:32,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:48:35,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:48:35,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:48:36,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:48:36,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:48:38,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:48:40,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:48:42,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:48:42,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:48:43,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 16:48:45,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:48:45,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:48:51,624 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 16:48:54,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:48:56,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:48:57,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 16:48:58,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:48:59,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 16:48:59,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1724480.0, ans=0.0 2023-10-04 16:49:00,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:49:02,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 16:49:02,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:49:03,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1724480.0, ans=0.125 2023-10-04 16:49:04,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:49:06,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:49:07,603 INFO [train.py:1046] (3/4) Epoch 49, batch 3700, loss[loss=0.152, simple_loss=0.2332, pruned_loss=0.03539, over 20531.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2334, pruned_loss=0.03575, over 4707558.47 frames. ], batch size: 44, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:49:07,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:49:10,368 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:49:10,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 16:49:10,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:49:11,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 16:49:11,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 16:49:14,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 16:49:19,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:49:19,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:49:19,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 16:49:19,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:49:21,143 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:49:21,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1724613.3333333333, ans=0.0 2023-10-04 16:49:22,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:49:23,991 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 16:49:26,956 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.043e+02 2.235e+02 2.541e+02 4.177e+02, threshold=4.470e+02, percent-clipped=0.0 2023-10-04 16:49:31,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:49:32,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 16:49:34,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:49:34,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 16:49:34,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:49:38,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:49:39,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 16:49:41,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:49:42,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:49:42,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1724680.0, ans=0.0 2023-10-04 16:49:43,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:49:44,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:49:46,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 16:49:51,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:49:51,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 16:49:51,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:49:51,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 16:49:57,902 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:49:57,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:49:59,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:50:01,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 16:50:04,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:50:04,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 16:50:04,098 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:50:04,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:50:06,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=10.35 vs. limit=22.5 2023-10-04 16:50:09,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:50:11,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 16:50:12,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 16:50:12,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:50:13,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:50:15,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:50:15,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:50:18,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:50:18,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1724813.3333333333, ans=0.0 2023-10-04 16:50:19,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:50:20,893 INFO [train.py:1046] (3/4) Epoch 49, batch 3750, loss[loss=0.1475, simple_loss=0.2352, pruned_loss=0.02991, over 24656.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2344, pruned_loss=0.03598, over 4710735.77 frames. ], batch size: 68, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:50:20,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:50:22,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 16:50:22,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 16:50:26,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 16:50:27,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 16:50:27,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:50:29,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:50:29,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:50:31,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:50:33,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:50:36,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 16:50:37,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:50:40,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:50:40,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1724946.6666666667, ans=0.0 2023-10-04 16:50:43,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:50:44,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 16:50:44,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:50:46,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:50:46,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:50:50,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 16:50:55,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 16:50:56,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:50:56,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:50:56,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1725013.3333333333, ans=0.125 2023-10-04 16:50:58,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:51:03,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:04,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 16:51:04,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1725080.0, ans=0.2 2023-10-04 16:51:08,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 16:51:11,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:14,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:51:14,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:51:17,378 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 16:51:21,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 16:51:23,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 16:51:25,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:51:28,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:51:29,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 16:51:33,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1725146.6666666667, ans=0.125 2023-10-04 16:51:36,224 INFO [train.py:1046] (3/4) Epoch 49, batch 3800, loss[loss=0.1565, simple_loss=0.2207, pruned_loss=0.04614, over 22608.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2343, pruned_loss=0.03581, over 4710369.08 frames. ], batch size: 322, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:51:36,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1725213.3333333333, ans=0.125 2023-10-04 16:51:37,766 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:51:42,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:51:43,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 16:51:43,409 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 16:51:44,368 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.69 vs. limit=15.0 2023-10-04 16:51:44,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:47,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:51:47,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:51:50,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 16:51:50,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:51:51,515 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:51:53,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:51:53,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:51:53,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1725280.0, ans=0.1 2023-10-04 16:51:54,677 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.120e+02 2.292e+02 2.756e+02 3.883e+02, threshold=4.583e+02, percent-clipped=0.0 2023-10-04 16:51:54,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:51:54,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 16:51:57,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1725280.0, ans=0.2 2023-10-04 16:51:59,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 16:51:59,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:52:02,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:52:05,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 16:52:05,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:52:07,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 16:52:07,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:52:10,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:11,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:52:16,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:52:16,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 16:52:17,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:52:23,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:52:25,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1725413.3333333333, ans=0.0 2023-10-04 16:52:27,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:52:31,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 16:52:33,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 16:52:34,501 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:52:35,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:52:37,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:37,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 16:52:42,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 16:52:42,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 16:52:42,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=1725480.0, ans=0.02 2023-10-04 16:52:43,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:52:44,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.98 vs. limit=10.0 2023-10-04 16:52:44,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:52:49,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:52:50,411 INFO [train.py:1046] (3/4) Epoch 49, batch 3850, loss[loss=0.1499, simple_loss=0.2307, pruned_loss=0.03452, over 23741.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2327, pruned_loss=0.0356, over 4713321.91 frames. ], batch size: 149, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:52:50,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:52:54,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 16:52:55,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 16:52:57,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:52:57,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:53:01,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 16:53:04,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1725613.3333333333, ans=0.125 2023-10-04 16:53:05,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:06,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 16:53:07,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 16:53:13,300 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:14,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:53:16,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:53:17,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:53:20,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:21,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:53:21,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:21,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:53:23,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:24,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:24,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:24,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:53:25,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 16:53:25,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 16:53:25,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:53:27,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:30,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:30,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:30,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 16:53:35,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 16:53:35,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:38,647 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 16:53:40,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 16:53:40,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1725746.6666666667, ans=0.125 2023-10-04 16:53:44,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:45,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:53:47,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1725746.6666666667, ans=0.0 2023-10-04 16:53:49,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:53:51,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 16:53:53,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 16:53:55,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:55,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:53:57,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 16:53:57,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 16:53:59,061 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:59,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:53:59,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:53:59,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 16:54:00,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:54:02,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 16:54:02,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:02,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:54:03,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:54:05,387 INFO [train.py:1046] (3/4) Epoch 49, batch 3900, loss[loss=0.1439, simple_loss=0.2145, pruned_loss=0.03667, over 23451.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2319, pruned_loss=0.03527, over 4719969.64 frames. ], batch size: 285, lr: 2.07e-03, grad_scale: 16.0 2023-10-04 16:54:05,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:06,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:54:06,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:54:06,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:54:08,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:54:08,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 16:54:08,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:11,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:54:12,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:54:13,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:54:13,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:54:15,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 16:54:15,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:16,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:54:16,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 16:54:16,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:54:16,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1725880.0, ans=0.1 2023-10-04 16:54:19,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 16:54:19,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:54:20,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 16:54:23,190 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.321e+02 2.754e+02 3.506e+02 6.937e+02, threshold=5.508e+02, percent-clipped=5.0 2023-10-04 16:54:23,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 16:54:27,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:54:28,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:54:28,854 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:54:30,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:54:33,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:54:33,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1726013.3333333333, ans=0.0 2023-10-04 16:54:36,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:54:37,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:54:37,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:54:37,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:54:43,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:54:44,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:54:51,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 16:54:52,996 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:55:02,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:55:04,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1726146.6666666667, ans=0.125 2023-10-04 16:55:06,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:55:06,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 16:55:08,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 16:55:08,235 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 16:55:09,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 16:55:10,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:55:12,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 16:55:16,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:55:17,736 INFO [train.py:1046] (3/4) Epoch 49, batch 3950, loss[loss=0.1314, simple_loss=0.2121, pruned_loss=0.02531, over 23652.00 frames. ], tot_loss[loss=0.1507, simple_loss=0.2312, pruned_loss=0.0351, over 4720866.37 frames. ], batch size: 149, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 16:55:17,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 16:55:17,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:55:20,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:55:23,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:55:29,283 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 16:55:30,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:55:30,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 16:55:30,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1726280.0, ans=0.0 2023-10-04 16:55:32,436 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 16:55:32,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:55:34,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1726280.0, ans=0.125 2023-10-04 16:55:36,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:55:36,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:55:36,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:55:40,263 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 16:55:41,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:55:43,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 16:55:43,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 16:55:44,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 16:55:44,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 16:55:55,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:55:55,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 16:55:57,329 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.430e-03 2023-10-04 16:56:01,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 16:56:05,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 16:56:05,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 16:56:07,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:56:09,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:56:10,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=1726413.3333333333, ans=0.125 2023-10-04 16:56:16,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 16:56:16,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 16:56:17,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:56:17,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:56:18,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 16:56:22,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:56:24,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:56:27,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 16:56:31,262 INFO [train.py:1046] (3/4) Epoch 49, batch 4000, loss[loss=0.1554, simple_loss=0.2322, pruned_loss=0.03935, over 23849.00 frames. ], tot_loss[loss=0.1506, simple_loss=0.2316, pruned_loss=0.03482, over 4738001.78 frames. ], batch size: 212, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 16:56:36,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:56:43,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:56:46,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:56:48,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:56:48,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:56:48,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 16:56:49,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 16:56:49,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 16:56:50,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:56:50,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 16:56:52,174 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 1.984e+02 2.160e+02 2.381e+02 3.286e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-04 16:56:52,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:56:55,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 16:56:55,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:56:55,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 16:56:56,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:56:56,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 16:56:57,214 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.95 vs. limit=15.0 2023-10-04 16:56:58,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:57:00,728 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 16:57:00,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:57:00,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:57:03,980 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 16:57:05,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 16:57:05,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:57:13,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 16:57:13,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:57:16,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:57:17,848 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 16:57:19,335 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 16:57:19,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 16:57:19,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:57:20,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:57:21,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 16:57:23,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 16:57:24,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 16:57:24,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:57:26,702 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.53 vs. limit=15.0 2023-10-04 16:57:27,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 16:57:27,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:57:28,962 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 16:57:31,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 16:57:35,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 16:57:36,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 16:57:36,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=1726813.3333333333, ans=0.125 2023-10-04 16:57:37,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:57:37,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:57:39,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:57:44,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:57:45,699 INFO [train.py:1046] (3/4) Epoch 49, batch 4050, loss[loss=0.1446, simple_loss=0.2245, pruned_loss=0.03233, over 23414.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2322, pruned_loss=0.03528, over 4719203.28 frames. ], batch size: 119, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 16:57:47,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 16:57:47,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 16:57:48,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:57:48,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:57:49,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 16:57:50,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=1726880.0, ans=0.0 2023-10-04 16:57:50,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1726880.0, ans=0.2 2023-10-04 16:57:52,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:57:54,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:57:56,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 16:58:00,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:58:00,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 16:58:02,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 16:58:03,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:58:07,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:58:09,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 16:58:11,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1726946.6666666667, ans=0.125 2023-10-04 16:58:14,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 16:58:14,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 16:58:14,390 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 16:58:17,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 16:58:18,495 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.30 vs. limit=15.0 2023-10-04 16:58:21,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1727013.3333333333, ans=0.125 2023-10-04 16:58:24,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 16:58:24,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:58:27,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:58:30,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 16:58:30,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:58:30,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:58:34,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 16:58:37,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 16:58:37,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 16:58:38,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:58:40,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 16:58:40,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1727080.0, ans=0.125 2023-10-04 16:58:44,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1727146.6666666667, ans=0.0 2023-10-04 16:58:46,490 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 16:58:52,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 16:58:54,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:58:54,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 16:58:55,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1727146.6666666667, ans=0.125 2023-10-04 16:58:56,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 16:58:56,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 16:58:56,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:58:58,169 INFO [train.py:1046] (3/4) Epoch 49, batch 4100, loss[loss=0.1415, simple_loss=0.2307, pruned_loss=0.0262, over 24454.00 frames. ], tot_loss[loss=0.1509, simple_loss=0.232, pruned_loss=0.03489, over 4714728.36 frames. ], batch size: 66, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 16:58:59,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:59:01,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:01,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 16:59:07,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 16:59:08,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 16:59:09,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 16:59:10,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 16:59:10,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:59:11,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:11,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:11,436 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 16:59:12,827 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 16:59:17,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:59:17,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 16:59:17,380 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:59:17,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 16:59:17,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1727280.0, ans=0.125 2023-10-04 16:59:20,690 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.809e+02 2.160e+02 2.515e+02 2.947e+02 5.173e+02, threshold=5.031e+02, percent-clipped=2.0 2023-10-04 16:59:22,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1727280.0, ans=0.1 2023-10-04 16:59:23,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 16:59:24,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 16:59:25,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 16:59:26,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 16:59:26,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:26,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 16:59:26,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:59:27,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 16:59:27,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 16:59:30,599 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:59:30,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1727346.6666666667, ans=0.1 2023-10-04 16:59:31,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 16:59:33,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 16:59:34,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 16:59:34,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 16:59:34,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 16:59:36,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 16:59:38,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 16:59:38,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1727346.6666666667, ans=0.1 2023-10-04 16:59:39,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 16:59:40,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 16:59:40,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 16:59:41,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1727413.3333333333, ans=0.125 2023-10-04 16:59:42,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 16:59:42,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 16:59:43,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 16:59:45,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 16:59:49,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1727413.3333333333, ans=0.125 2023-10-04 16:59:49,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.51 vs. limit=15.0 2023-10-04 16:59:50,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 16:59:53,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 16:59:54,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 16:59:59,284 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.10 vs. limit=15.0 2023-10-04 17:00:01,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:00:01,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:00:05,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:00:08,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:00:11,548 INFO [train.py:1046] (3/4) Epoch 49, batch 4150, loss[loss=0.1581, simple_loss=0.2501, pruned_loss=0.03308, over 24377.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2331, pruned_loss=0.03537, over 4701267.20 frames. ], batch size: 74, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:00:11,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:00:12,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.65 vs. limit=22.5 2023-10-04 17:00:13,509 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:00:14,064 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.02 vs. limit=15.0 2023-10-04 17:00:14,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:00:14,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:00:18,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 17:00:20,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:00:20,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 17:00:20,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 17:00:20,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 17:00:21,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:00:25,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:00:25,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:00:29,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:00:31,166 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:00:32,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:00:33,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:00:33,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:00:33,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:00:38,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:00:43,417 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:00:43,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 17:00:46,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 17:00:46,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:00:47,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 17:00:47,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:00:47,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:00:48,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=1727680.0, ans=0.125 2023-10-04 17:00:49,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:00:51,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:00:55,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 17:00:58,248 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:00:59,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:01:01,133 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 17:01:01,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:01:02,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 17:01:03,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:01:05,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:01:07,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:01:07,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 17:01:07,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:07,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 17:01:09,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:01:12,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 17:01:12,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:01:12,688 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:01:12,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 17:01:14,634 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 17:01:14,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:01:14,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 17:01:14,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:01:18,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:01:18,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 17:01:18,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:01:24,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:01:25,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 17:01:26,914 INFO [train.py:1046] (3/4) Epoch 49, batch 4200, loss[loss=0.1539, simple_loss=0.2473, pruned_loss=0.03024, over 24430.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2315, pruned_loss=0.03523, over 4694834.08 frames. ], batch size: 69, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:01:27,025 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:01:29,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:01:31,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:01:31,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:01:31,252 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:01:32,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 17:01:32,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1727880.0, ans=0.125 2023-10-04 17:01:32,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1727880.0, ans=0.125 2023-10-04 17:01:34,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1727880.0, ans=0.1 2023-10-04 17:01:37,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 17:01:38,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:39,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:01:42,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:01:44,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 17:01:45,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:01:45,867 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:47,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 17:01:47,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:01:49,212 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.800e+02 2.163e+02 2.357e+02 2.679e+02 4.452e+02, threshold=4.714e+02, percent-clipped=0.0 2023-10-04 17:01:49,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:50,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:01:50,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:01:51,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:01:53,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 17:01:54,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:01:59,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:01:59,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:02:01,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:02:03,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1728013.3333333333, ans=0.125 2023-10-04 17:02:04,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:02:06,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:02:06,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 17:02:06,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:02:06,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1728013.3333333333, ans=0.2 2023-10-04 17:02:07,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:02:12,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:02:13,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:02:15,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1728080.0, ans=0.125 2023-10-04 17:02:17,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1728080.0, ans=0.2 2023-10-04 17:02:18,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:02:22,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 17:02:24,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:02:28,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:02:29,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:02:32,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 17:02:36,767 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:02:40,199 INFO [train.py:1046] (3/4) Epoch 49, batch 4250, loss[loss=0.1524, simple_loss=0.2373, pruned_loss=0.03378, over 24419.00 frames. ], tot_loss[loss=0.1504, simple_loss=0.2303, pruned_loss=0.03524, over 4686736.30 frames. ], batch size: 77, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:02:40,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:02:40,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:02:40,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1728213.3333333333, ans=0.1 2023-10-04 17:02:42,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:02:45,499 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.91 vs. limit=15.0 2023-10-04 17:02:48,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:02:48,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 17:02:48,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:02:50,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn1.whiten.whitening_limit, batch_count=1728213.3333333333, ans=22.5 2023-10-04 17:02:51,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:02:52,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1728213.3333333333, ans=0.0 2023-10-04 17:02:53,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1728213.3333333333, ans=0.125 2023-10-04 17:02:54,558 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:02:58,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:02:58,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:00,569 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.86 vs. limit=15.0 2023-10-04 17:03:01,413 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:03:01,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:03:02,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:03:03,669 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.79 vs. limit=10.0 2023-10-04 17:03:04,290 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:05,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:03:10,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:03:11,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:03:13,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 17:03:16,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 17:03:16,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:16,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:03:16,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1728346.6666666667, ans=0.1 2023-10-04 17:03:18,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:03:19,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:03:19,394 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:03:19,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:03:22,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1728346.6666666667, ans=0.125 2023-10-04 17:03:24,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:03:24,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:03:28,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:03:30,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:03:30,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 17:03:31,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:03:31,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 17:03:34,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:03:35,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:03:37,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:03:37,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:03:39,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 17:03:42,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:03:42,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:03:46,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:03:48,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:03:48,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:03:52,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:03:54,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:03:54,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:03:54,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:03:54,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 17:03:55,781 INFO [train.py:1046] (3/4) Epoch 49, batch 4300, loss[loss=0.1363, simple_loss=0.2159, pruned_loss=0.02836, over 24392.00 frames. ], tot_loss[loss=0.1503, simple_loss=0.2303, pruned_loss=0.0351, over 4703732.30 frames. ], batch size: 58, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:03:57,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:04:02,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:04:02,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:04:04,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:04:05,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1728546.6666666667, ans=0.125 2023-10-04 17:04:12,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:04:12,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 17:04:13,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:04:15,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:04:16,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:04:16,480 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 17:04:17,833 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.072e+02 2.260e+02 2.526e+02 3.398e+02, threshold=4.520e+02, percent-clipped=0.0 2023-10-04 17:04:17,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:04:19,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:04:24,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 17:04:24,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:04:24,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 17:04:26,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 17:04:28,873 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:04:30,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:04:30,352 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:04:31,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:04:33,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:04:33,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:04:34,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 17:04:35,911 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 17:04:37,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:04:40,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:04:40,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 17:04:40,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:04:40,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:04:40,210 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 17:04:40,211 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 17:04:41,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 17:04:41,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:04:43,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 17:04:43,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 17:04:46,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:04:49,351 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 17:04:50,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:04:52,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:04:52,702 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:04:54,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 17:04:54,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:04:54,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:04:55,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:04:56,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:04:56,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:04:59,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:05:03,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:05:04,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:05:04,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:05:07,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1728813.3333333333, ans=0.2 2023-10-04 17:05:08,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 17:05:08,711 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:05:09,993 INFO [train.py:1046] (3/4) Epoch 49, batch 4350, loss[loss=0.1581, simple_loss=0.238, pruned_loss=0.03909, over 23809.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2312, pruned_loss=0.03562, over 4709797.68 frames. ], batch size: 212, lr: 2.06e-03, grad_scale: 8.0 2023-10-04 17:05:14,615 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:05:15,515 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.08 vs. limit=22.5 2023-10-04 17:05:17,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:05:21,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:05:21,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:05:26,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:05:29,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:05:32,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:05:33,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1728946.6666666667, ans=0.125 2023-10-04 17:05:34,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:05:37,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:05:38,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:05:38,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:05:43,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 17:05:45,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:05:45,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:05:50,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:05:51,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 17:05:51,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1729013.3333333333, ans=0.125 2023-10-04 17:05:55,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:05:56,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:06:00,746 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 17:06:02,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:03,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:06:03,935 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 17:06:03,993 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 17:06:04,005 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:06:05,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:05,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:06:05,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:06,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:06:07,611 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.87 vs. limit=15.0 2023-10-04 17:06:07,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:06:10,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 17:06:10,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:10,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:06:12,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:13,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 17:06:16,119 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 17:06:16,123 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 17:06:16,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 17:06:19,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:06:19,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:06:20,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:20,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:06:23,588 INFO [train.py:1046] (3/4) Epoch 49, batch 4400, loss[loss=0.1595, simple_loss=0.2364, pruned_loss=0.04133, over 23736.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2327, pruned_loss=0.03586, over 4724690.44 frames. ], batch size: 164, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:06:25,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 17:06:27,031 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 17:06:27,039 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:30,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:06:30,495 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:31,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:06:33,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 17:06:33,404 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 17:06:34,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 17:06:34,714 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 17:06:35,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:06:35,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:06:36,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1729213.3333333333, ans=0.0 2023-10-04 17:06:37,827 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 17:06:40,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:40,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:40,734 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 17:06:41,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1729280.0, ans=0.125 2023-10-04 17:06:42,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1729280.0, ans=0.07 2023-10-04 17:06:43,373 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:43,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 17:06:44,842 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 17:06:46,254 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.847e+02 2.222e+02 2.528e+02 3.076e+02 4.813e+02, threshold=5.055e+02, percent-clipped=1.0 2023-10-04 17:06:46,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 17:06:46,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 17:06:47,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 17:06:47,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:49,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:51,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:06:51,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:06:52,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 17:06:52,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 17:06:54,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:55,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:06:55,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:06:57,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:06:57,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:06:57,477 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 17:06:59,439 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 17:07:03,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:07:08,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:07:10,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1729413.3333333333, ans=0.0 2023-10-04 17:07:11,847 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 17:07:13,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1729413.3333333333, ans=0.1 2023-10-04 17:07:14,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:07:16,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1729413.3333333333, ans=0.2 2023-10-04 17:07:17,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:07:18,797 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:07:18,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 17:07:18,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:07:18,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:07:18,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:07:19,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=1729413.3333333333, ans=0.04949747468305833 2023-10-04 17:07:20,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:07:26,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 17:07:28,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=1729480.0, ans=0.0 2023-10-04 17:07:29,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 17:07:30,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 17:07:32,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:07:32,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 17:07:32,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:07:35,630 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:07:38,763 INFO [train.py:1046] (3/4) Epoch 49, batch 4450, loss[loss=0.1565, simple_loss=0.246, pruned_loss=0.03354, over 24045.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2331, pruned_loss=0.03596, over 4722560.54 frames. ], batch size: 80, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:07:38,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 17:07:40,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1729546.6666666667, ans=0.0 2023-10-04 17:07:41,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:07:44,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:07:44,590 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:07:51,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:07:51,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:07:53,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1729613.3333333333, ans=0.2 2023-10-04 17:07:54,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:07:57,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:08:00,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:08:00,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:08:02,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 17:08:02,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:08:04,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:08:04,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:08:04,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:08:06,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 17:08:11,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:11,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:11,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1729680.0, ans=0.1 2023-10-04 17:08:12,829 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:08:12,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:08:15,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:08:18,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 17:08:19,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 17:08:19,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 17:08:19,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:08:22,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:08:23,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 17:08:28,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:08:31,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:32,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 17:08:32,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:08:32,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:08:32,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:08:32,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:08:33,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:08:36,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:08:36,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 17:08:38,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:08:41,246 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:08:41,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:08:42,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:08:42,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1729813.3333333333, ans=0.1 2023-10-04 17:08:44,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 17:08:45,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:08:48,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 17:08:49,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:08:52,473 INFO [train.py:1046] (3/4) Epoch 49, batch 4500, loss[loss=0.1443, simple_loss=0.2237, pruned_loss=0.03244, over 24308.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2338, pruned_loss=0.03646, over 4704496.96 frames. ], batch size: 61, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:08:55,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:08:57,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 17:08:57,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 17:08:59,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:09:00,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1729880.0, ans=0.2 2023-10-04 17:09:00,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1729880.0, ans=0.125 2023-10-04 17:09:05,007 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:09:06,271 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:09:07,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:09:07,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:09:07,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:09:08,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:09:15,228 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.757e+02 2.089e+02 2.391e+02 2.805e+02 4.651e+02, threshold=4.782e+02, percent-clipped=0.0 2023-10-04 17:09:20,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:09:20,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:09:22,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:09:23,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:09:25,159 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:09:28,184 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:09:33,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:09:37,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:09:41,101 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.54 vs. limit=15.0 2023-10-04 17:09:41,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:09:41,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 17:09:41,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:09:41,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1730080.0, ans=0.1 2023-10-04 17:09:43,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:09:45,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:09:45,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:09:47,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:09:49,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 17:09:49,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:09:49,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:09:53,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:09:53,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:09:56,223 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:09:57,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:09:57,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:10:00,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 17:10:02,865 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.84 vs. limit=15.0 2023-10-04 17:10:04,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 17:10:04,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 17:10:07,352 INFO [train.py:1046] (3/4) Epoch 49, batch 4550, loss[loss=0.1594, simple_loss=0.2519, pruned_loss=0.03343, over 24655.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.2331, pruned_loss=0.03621, over 4697598.56 frames. ], batch size: 73, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:10:07,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 17:10:10,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 17:10:10,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:10:13,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:10:13,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:10:16,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:10:20,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:10:21,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:10:23,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:10:23,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:10:23,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:23,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1730280.0, ans=0.125 2023-10-04 17:10:24,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:10:26,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:10:29,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:10:33,315 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 17:10:33,362 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 17:10:34,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:10:35,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 17:10:36,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1730346.6666666667, ans=0.0 2023-10-04 17:10:38,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1730346.6666666667, ans=0.05 2023-10-04 17:10:40,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 17:10:41,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:10:42,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 17:10:44,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:10:45,280 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.74 vs. limit=22.5 2023-10-04 17:10:48,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:50,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:50,120 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:10:51,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 17:10:54,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:10:57,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:10:57,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:10:58,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:10:59,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 17:10:59,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 17:10:59,862 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:11:01,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 17:11:03,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 17:11:03,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:11:03,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1730413.3333333333, ans=0.125 2023-10-04 17:11:05,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:05,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:11:06,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:11:06,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:11:07,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:11:09,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 17:11:10,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:11:10,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 17:11:10,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 17:11:10,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:11:10,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 17:11:14,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:11:14,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:11:18,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:11:18,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:11:19,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:11:20,946 INFO [train.py:1046] (3/4) Epoch 49, batch 4600, loss[loss=0.1424, simple_loss=0.2329, pruned_loss=0.02595, over 24319.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2318, pruned_loss=0.03588, over 4694960.52 frames. ], batch size: 74, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:11:20,986 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:11:22,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:11:22,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1730546.6666666667, ans=0.1 2023-10-04 17:11:25,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:25,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:11:27,936 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:11:29,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:11:29,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:11:31,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 17:11:33,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:11:37,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:11:37,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:11:40,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:43,294 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.059e+02 2.232e+02 2.627e+02 4.263e+02, threshold=4.464e+02, percent-clipped=0.0 2023-10-04 17:11:46,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 17:11:47,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:51,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:11:52,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:11:52,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:11:59,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 17:11:59,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:12:00,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:12:05,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:07,290 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:12:08,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:12:13,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 17:12:13,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:12:17,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:18,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:12:20,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:20,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 17:12:20,468 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:21,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 17:12:21,960 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:22,505 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.19 vs. limit=10.0 2023-10-04 17:12:23,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:12:24,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:26,035 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:12:26,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:12:27,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 17:12:27,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 17:12:28,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 17:12:28,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:12:29,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:12:30,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:12:31,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:12:34,807 INFO [train.py:1046] (3/4) Epoch 49, batch 4650, loss[loss=0.1305, simple_loss=0.2127, pruned_loss=0.02422, over 24569.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2319, pruned_loss=0.03544, over 4706604.09 frames. ], batch size: 60, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:12:40,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:12:44,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:12:44,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:44,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:12:44,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:12:44,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:12:45,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:12:47,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 17:12:47,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1730880.0, ans=0.125 2023-10-04 17:12:49,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=1730946.6666666667, ans=0.0 2023-10-04 17:12:50,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1730946.6666666667, ans=0.125 2023-10-04 17:12:52,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:12:52,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1730946.6666666667, ans=0.035 2023-10-04 17:12:54,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 17:12:54,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:12:54,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 17:12:56,308 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:12:56,351 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 17:12:56,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 17:12:56,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:12:57,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:13:00,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:13:01,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:01,938 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 17:13:05,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:05,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 17:13:09,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:13:09,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:13:10,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 17:13:12,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:13:14,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:13:17,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:13:23,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:13:25,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:26,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:13:26,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:13:29,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 17:13:29,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 17:13:30,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 17:13:30,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 17:13:31,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:13:35,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1731146.6666666667, ans=0.0 2023-10-04 17:13:39,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:13:39,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:13:41,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 17:13:41,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:13:42,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:13:42,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:13:43,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:13:46,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:13:46,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:13:46,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1731146.6666666667, ans=0.0 2023-10-04 17:13:48,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:13:48,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1731213.3333333333, ans=0.125 2023-10-04 17:13:49,405 INFO [train.py:1046] (3/4) Epoch 49, batch 4700, loss[loss=0.1595, simple_loss=0.2436, pruned_loss=0.03773, over 23316.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2327, pruned_loss=0.03583, over 4711967.67 frames. ], batch size: 119, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:13:49,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=1731213.3333333333, ans=0.0 2023-10-04 17:13:50,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:13:50,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:13:50,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:13:52,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 17:13:53,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:13:53,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 17:14:02,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:03,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:14:04,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:14:04,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:14:06,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1731280.0, ans=0.1 2023-10-04 17:14:07,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:14:11,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 17:14:11,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 17:14:12,380 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.767e+02 2.015e+02 2.160e+02 2.418e+02 3.879e+02, threshold=4.320e+02, percent-clipped=0.0 2023-10-04 17:14:13,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:14,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1731280.0, ans=0.1 2023-10-04 17:14:15,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:14:16,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:14:19,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:22,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1731346.6666666667, ans=0.0 2023-10-04 17:14:23,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:14:25,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 17:14:27,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:14:33,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 17:14:33,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1731413.3333333333, ans=0.125 2023-10-04 17:14:35,587 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:14:37,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:41,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 17:14:43,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:14:46,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:14:47,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 17:14:47,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:47,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:14:50,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:14:50,705 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:14:52,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 17:14:52,083 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 17:14:53,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:14:56,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:56,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:56,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 17:14:56,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:14:57,361 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.30 vs. limit=6.0 2023-10-04 17:14:59,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1731480.0, ans=0.125 2023-10-04 17:15:00,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 17:15:02,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:15:03,609 INFO [train.py:1046] (3/4) Epoch 49, batch 4750, loss[loss=0.1439, simple_loss=0.2229, pruned_loss=0.03241, over 23365.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2331, pruned_loss=0.03587, over 4712084.90 frames. ], batch size: 119, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:15:03,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:08,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:09,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:15:10,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 17:15:12,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:15:15,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 17:15:16,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:15:16,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:15:18,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:15:24,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 17:15:29,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:15:31,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 17:15:31,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:15:33,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:15:33,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:15:35,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:35,214 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 17:15:35,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 17:15:41,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 17:15:44,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:15:46,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:15:47,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:15:47,887 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 17:15:47,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:15:50,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:15:53,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:15:54,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 17:15:54,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 17:15:56,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:15:56,279 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:15:57,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:15:57,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 17:15:57,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 17:16:00,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 17:16:03,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:05,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:16:05,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 17:16:06,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:16:08,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:09,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:16:11,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:16:11,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:16:11,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1731813.3333333333, ans=0.125 2023-10-04 17:16:12,458 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.42 vs. limit=15.0 2023-10-04 17:16:16,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:16:16,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 17:16:18,827 INFO [train.py:1046] (3/4) Epoch 49, batch 4800, loss[loss=0.1459, simple_loss=0.2247, pruned_loss=0.03359, over 23804.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2344, pruned_loss=0.03617, over 4716765.44 frames. ], batch size: 179, lr: 2.06e-03, grad_scale: 32.0 2023-10-04 17:16:18,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 17:16:18,965 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 17:16:20,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:16:21,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:16:23,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 17:16:27,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:16:27,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:31,564 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:16:34,845 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:16:34,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:16:34,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 17:16:36,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:16:38,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:16:38,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:16:40,915 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.792e+02 2.105e+02 2.314e+02 2.580e+02 3.922e+02, threshold=4.627e+02, percent-clipped=0.0 2023-10-04 17:16:43,677 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:16:45,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:45,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:16:47,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:49,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 17:16:49,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:50,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:16:50,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1732013.3333333333, ans=0.125 2023-10-04 17:16:51,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:16:54,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:56,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:16:56,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:16:57,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 17:16:58,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:00,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 17:17:00,307 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 17:17:01,645 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:01,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:17:03,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:17:03,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:17:03,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:17:03,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:17:03,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1732080.0, ans=0.125 2023-10-04 17:17:04,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:17:08,469 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:17:08,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=1732080.0, ans=0.0 2023-10-04 17:17:10,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:12,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1732080.0, ans=0.0 2023-10-04 17:17:13,488 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:18,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 17:17:18,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:17:18,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:19,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:17:19,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:23,241 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:17:24,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:17:25,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:17:25,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:26,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:17:26,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:17:28,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:17:31,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:31,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:31,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:17:32,786 INFO [train.py:1046] (3/4) Epoch 49, batch 4850, loss[loss=0.1399, simple_loss=0.2227, pruned_loss=0.02857, over 24320.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2346, pruned_loss=0.03639, over 4726356.87 frames. ], batch size: 61, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:17:32,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 17:17:34,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 17:17:34,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:17:34,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:17:35,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:17:35,796 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:39,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:17:46,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 17:17:47,969 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:54,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:17:55,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:17:55,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:17:56,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:17:57,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1732280.0, ans=0.1 2023-10-04 17:17:57,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.07 vs. limit=12.0 2023-10-04 17:17:58,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:17:58,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:17:59,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 17:18:02,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:18:04,030 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:18:04,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:18:05,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:18:05,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 17:18:08,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:18:08,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:11,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1732346.6666666667, ans=0.05 2023-10-04 17:18:13,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:13,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 17:18:13,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 17:18:13,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:18:19,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1732413.3333333333, ans=0.2 2023-10-04 17:18:21,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:18:21,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 17:18:24,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:18:24,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:18:26,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:18:27,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 17:18:27,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:28,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 17:18:28,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:18:30,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:18:30,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 17:18:30,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1732413.3333333333, ans=0.125 2023-10-04 17:18:33,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=1732480.0, ans=0.015 2023-10-04 17:18:38,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:18:38,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=1732480.0, ans=0.2 2023-10-04 17:18:42,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1732480.0, ans=0.04949747468305833 2023-10-04 17:18:44,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:18:44,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:18:48,043 INFO [train.py:1046] (3/4) Epoch 49, batch 4900, loss[loss=0.162, simple_loss=0.2488, pruned_loss=0.03767, over 24356.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.2342, pruned_loss=0.03603, over 4730114.29 frames. ], batch size: 77, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:18:50,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 17:18:50,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:18:54,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:18:56,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:18:56,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:18:59,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 17:19:03,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 17:19:05,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1732613.3333333333, ans=0.0 2023-10-04 17:19:06,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 17:19:07,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 17:19:07,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:19:07,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:19:07,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:19:07,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:19:07,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:19:10,026 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 17:19:11,282 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.726e+02 2.088e+02 2.355e+02 2.729e+02 5.318e+02, threshold=4.711e+02, percent-clipped=2.0 2023-10-04 17:19:12,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 17:19:14,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:19:15,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:19:17,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:19:20,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:19:21,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:19:21,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:19:21,697 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 17:19:23,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:19:24,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:19:24,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 17:19:24,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 17:19:29,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 17:19:30,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:19:31,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:19:31,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:19:33,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:19:33,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 17:19:33,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:19:33,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 17:19:36,529 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:19:37,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:19:38,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1732746.6666666667, ans=0.125 2023-10-04 17:19:39,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:19:41,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 17:19:41,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:19:42,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 17:19:42,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 17:19:50,448 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:19:51,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:19:51,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 17:19:51,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 17:19:51,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:19:54,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:19:58,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:19:58,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:19:58,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:20:00,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 17:20:01,456 INFO [train.py:1046] (3/4) Epoch 49, batch 4950, loss[loss=0.1427, simple_loss=0.2213, pruned_loss=0.03203, over 19714.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2321, pruned_loss=0.03599, over 4704493.93 frames. ], batch size: 43, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:20:01,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:20:03,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:20:03,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 17:20:07,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 17:20:07,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 17:20:07,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:20:09,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 17:20:09,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:09,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:20:10,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:20:10,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:12,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:20:13,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:20:15,006 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:20:16,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:20:16,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1732946.6666666667, ans=0.125 2023-10-04 17:20:17,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:17,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:20:21,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:20:27,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:28,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:20:30,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:31,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:32,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:20:34,033 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 17:20:34,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1733013.3333333333, ans=0.1 2023-10-04 17:20:35,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 17:20:37,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:38,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:20:38,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:20:40,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:20:41,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:20:41,819 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:20:42,746 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.82 vs. limit=15.0 2023-10-04 17:20:43,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:20:44,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:20:46,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:20:47,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:20:49,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:20:51,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 17:20:51,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:20:52,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:20:57,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:20:59,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:20:59,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:20:59,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:21:00,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:21:01,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:21:04,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:21:04,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:21:04,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:21:05,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 17:21:08,886 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:21:14,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 17:21:14,865 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 17:21:16,201 INFO [train.py:1046] (3/4) Epoch 49, batch 5000, loss[loss=0.1711, simple_loss=0.2447, pruned_loss=0.04874, over 23794.00 frames. ], tot_loss[loss=0.152, simple_loss=0.232, pruned_loss=0.03601, over 4702968.33 frames. ], batch size: 179, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:21:18,417 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.08 vs. limit=10.0 2023-10-04 17:21:21,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:21:21,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:21:23,769 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 17:21:23,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 17:21:25,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:21:28,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 17:21:28,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:21:28,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:21:29,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 17:21:30,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:21:31,284 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:21:31,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 17:21:31,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:21:31,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:21:34,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 17:21:34,193 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 17:21:36,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:21:37,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 17:21:37,179 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:21:37,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:21:38,624 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:21:38,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 17:21:38,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 17:21:39,431 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.91 vs. limit=15.0 2023-10-04 17:21:39,929 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.001e+02 2.201e+02 2.789e+02 6.311e+02, threshold=4.402e+02, percent-clipped=4.0 2023-10-04 17:21:40,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 17:21:40,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:21:41,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:21:46,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 17:21:46,611 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:21:46,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:21:48,013 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:21:49,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 17:21:51,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1733346.6666666667, ans=0.2 2023-10-04 17:21:52,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 17:21:54,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:21:55,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:21:58,689 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 17:22:01,846 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:22:03,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:22:03,269 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:06,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 17:22:06,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:22:06,093 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:22:07,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:22:10,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 17:22:10,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:22:13,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:22:14,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:22:19,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1733480.0, ans=0.0 2023-10-04 17:22:20,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 17:22:23,914 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:24,872 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.25 vs. limit=22.5 2023-10-04 17:22:33,224 INFO [train.py:1046] (3/4) Epoch 49, batch 5050, loss[loss=0.1435, simple_loss=0.2258, pruned_loss=0.03056, over 21073.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2324, pruned_loss=0.03598, over 4713739.34 frames. ], batch size: 46, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:22:33,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:22:34,749 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:34,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:22:34,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:22:34,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:22:34,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:22:34,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:38,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:22:38,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 17:22:40,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:22:42,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:22:42,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1733546.6666666667, ans=0.1 2023-10-04 17:22:43,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:22:43,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 17:22:46,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:22:46,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:22:48,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:22:50,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:22:50,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:22:50,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=1733613.3333333333, ans=0.5 2023-10-04 17:22:58,647 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.27 vs. limit=15.0 2023-10-04 17:22:59,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 17:22:59,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:22:59,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:22:59,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 17:23:01,569 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:23:02,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:23:02,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:23:04,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:23:04,301 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 17:23:05,679 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 17:23:07,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:23:08,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:23:11,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:23:13,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 17:23:14,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:23:17,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 17:23:18,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:23:18,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:23:18,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:23:19,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:23:20,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:23:22,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:23:24,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:24,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:23:24,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:23:24,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 17:23:25,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:23:27,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:23:30,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:23:30,645 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 17:23:30,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:23:33,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:23:33,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:33,790 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 17:23:37,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:23:37,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 17:23:37,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:41,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:23:41,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:23:41,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 17:23:44,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 17:23:46,732 INFO [train.py:1046] (3/4) Epoch 49, batch 5100, loss[loss=0.1351, simple_loss=0.221, pruned_loss=0.0246, over 24583.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2333, pruned_loss=0.03596, over 4717798.42 frames. ], batch size: 60, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:23:46,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:23:46,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:23:46,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:23:46,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1733880.0, ans=0.125 2023-10-04 17:23:50,859 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 17:23:52,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:23:55,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 17:23:55,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 17:23:57,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:23:58,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:23:58,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1733880.0, ans=0.1 2023-10-04 17:23:59,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:24:01,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 17:24:01,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 17:24:01,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=1733946.6666666667, ans=0.2 2023-10-04 17:24:06,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:24:06,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:24:10,750 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.757e+02 2.155e+02 2.470e+02 3.044e+02 5.202e+02, threshold=4.940e+02, percent-clipped=2.0 2023-10-04 17:24:10,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:24:11,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1733946.6666666667, ans=0.125 2023-10-04 17:24:15,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 17:24:15,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:24:18,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:24:18,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 17:24:20,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:22,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:22,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 17:24:24,971 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 17:24:26,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:26,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 17:24:26,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 17:24:27,305 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.45 vs. limit=15.0 2023-10-04 17:24:29,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:24:39,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:24:40,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 17:24:41,871 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 17:24:41,886 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 17:24:43,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 17:24:43,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:24:43,825 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.39 vs. limit=15.0 2023-10-04 17:24:46,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 17:24:49,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 17:24:52,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 17:24:53,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:24:55,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 17:24:57,874 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:24:57,909 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 17:25:00,570 INFO [train.py:1046] (3/4) Epoch 49, batch 5150, loss[loss=0.1624, simple_loss=0.248, pruned_loss=0.03842, over 23126.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2339, pruned_loss=0.03619, over 4714696.96 frames. ], batch size: 105, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:25:03,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:25:03,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:25:03,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:25:04,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:25:05,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:25:06,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:25:08,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 17:25:08,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 17:25:08,756 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 17:25:08,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:25:08,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 17:25:10,556 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:25:11,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 17:25:12,461 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.72 vs. limit=15.0 2023-10-04 17:25:13,296 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:25:14,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:25:18,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1734280.0, ans=0.0 2023-10-04 17:25:19,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:25:19,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 17:25:22,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:25:22,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:25:23,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:25:23,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:25:23,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:25:23,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:25:23,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:25:23,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 17:25:25,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:25:25,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:25:26,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1734280.0, ans=0.0 2023-10-04 17:25:28,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:25:29,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 17:25:29,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:25:33,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:25:35,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 17:25:40,165 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:25:43,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:25:45,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:25:46,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.22 vs. limit=12.0 2023-10-04 17:25:49,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:25:49,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:25:51,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 17:25:55,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:25:56,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:25:56,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:26:00,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:26:00,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:26:01,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 17:26:01,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1734480.0, ans=0.0 2023-10-04 17:26:03,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=1734480.0, ans=0.0 2023-10-04 17:26:07,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:26:09,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:26:10,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:26:10,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:26:12,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:26:12,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:26:12,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:26:12,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:26:14,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1734546.6666666667, ans=0.2 2023-10-04 17:26:15,746 INFO [train.py:1046] (3/4) Epoch 49, batch 5200, loss[loss=0.1534, simple_loss=0.2363, pruned_loss=0.03522, over 24311.00 frames. ], tot_loss[loss=0.1543, simple_loss=0.235, pruned_loss=0.03681, over 4694023.40 frames. ], batch size: 56, lr: 2.06e-03, grad_scale: 32.0 2023-10-04 17:26:17,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:26:18,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:26:21,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:26:24,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 17:26:25,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:26:27,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:26:28,126 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.87 vs. limit=12.0 2023-10-04 17:26:28,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:26:29,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:26:29,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:26:30,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1734613.3333333333, ans=0.1 2023-10-04 17:26:32,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 17:26:33,143 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.83 vs. limit=22.5 2023-10-04 17:26:37,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:26:37,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:26:39,111 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 2.105e+02 2.384e+02 2.750e+02 4.154e+02, threshold=4.767e+02, percent-clipped=0.0 2023-10-04 17:26:39,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1734613.3333333333, ans=0.125 2023-10-04 17:26:41,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 17:26:42,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:26:44,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:26:44,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 17:26:45,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 17:26:48,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 17:26:48,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:26:48,965 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 17:26:48,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:26:51,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:26:51,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:26:51,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 17:26:51,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:26:54,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:26:55,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 17:26:55,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 17:26:57,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 17:27:00,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 17:27:01,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:27:07,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:27:07,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:27:09,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 17:27:10,994 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:27:11,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 17:27:11,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:12,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:27:15,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:27:16,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:27:20,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:27:21,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:27:21,414 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:25,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:27:25,604 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 17:27:26,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:27:26,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:27:26,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:27:28,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:27:29,653 INFO [train.py:1046] (3/4) Epoch 49, batch 5250, loss[loss=0.1357, simple_loss=0.2153, pruned_loss=0.02803, over 24445.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2343, pruned_loss=0.03633, over 4705505.82 frames. ], batch size: 58, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:27:29,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:27:31,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:27:32,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1734880.0, ans=0.125 2023-10-04 17:27:34,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:27:35,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:27:37,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:27:42,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:27:44,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:27:47,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:27:49,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:27:50,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 17:27:50,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:27:52,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:28:14,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1735080.0, ans=0.0 2023-10-04 17:28:32,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=1735146.6666666667, ans=0.125 2023-10-04 17:28:37,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=1735213.3333333333, ans=0.125 2023-10-04 17:28:38,917 INFO [train.py:1046] (3/4) Epoch 49, batch 5300, loss[loss=0.1331, simple_loss=0.2135, pruned_loss=0.02635, over 24323.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2327, pruned_loss=0.03625, over 4692807.56 frames. ], batch size: 61, lr: 2.06e-03, grad_scale: 16.0 2023-10-04 17:28:53,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:28:53,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-10-04 17:28:53,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-10-04 17:28:53,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:28:53,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:53,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:53,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:53,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:28:53,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:28:53,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:28:53,577 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:28:53,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:28:53,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-10-04 17:28:54,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-10-04 17:28:54,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-10-04 17:28:54,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:28:54,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-10-04 17:28:54,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-10-04 17:28:54,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:55,051 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:28:55,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:28:55,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:28:55,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:28:55,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:28:55,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:28:55,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:55,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:28:55,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:28:55,724 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:28:55,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:55,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:28:56,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-10-04 17:28:56,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:28:56,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:28:56,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-10-04 17:28:56,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-10-04 17:28:56,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:28:56,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:28:56,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-10-04 17:28:57,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-10-04 17:28:57,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:28:57,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:28:58,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:28:58,182 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-10-04 17:28:58,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-10-04 17:28:58,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:28:58,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:28:58,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-10-04 17:28:58,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-10-04 17:28:58,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-10-04 17:28:58,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:29:05,612 INFO [train.py:1046] (3/4) Epoch 50, batch 0, loss[loss=0.1353, simple_loss=0.2151, pruned_loss=0.02778, over 22012.00 frames. ], tot_loss[loss=0.1353, simple_loss=0.2151, pruned_loss=0.02778, over 22012.00 frames. ], batch size: 48, lr: 2.04e-03, grad_scale: 32.0 2023-10-04 17:29:05,612 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 17:29:16,722 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.2243, 2.0124, 2.8057, 2.9827], device='cuda:3') 2023-10-04 17:29:18,994 INFO [train.py:1078] (3/4) Epoch 50, validation: loss=0.3435, simple_loss=0.2762, pruned_loss=0.2054, over 1125622.00 frames. 2023-10-04 17:29:18,994 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 17:29:21,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-10-04 17:29:21,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:29:23,190 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:29:25,812 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.206e+02 2.557e+02 2.974e+02 5.162e+02, threshold=5.113e+02, percent-clipped=2.0 2023-10-04 17:29:27,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1735293.3333333333, ans=0.125 2023-10-04 17:29:28,751 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:29:28,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:29:30,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:31,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-10-04 17:29:32,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-10-04 17:29:34,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:35,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:40,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:29:40,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:29:40,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:29:40,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:29:41,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-10-04 17:29:43,268 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:29:51,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:29:51,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:29:54,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-10-04 17:29:54,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1735426.6666666667, ans=0.07 2023-10-04 17:29:56,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:29:56,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:29:59,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:30:03,741 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:30:05,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1735493.3333333333, ans=0.125 2023-10-04 17:30:08,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:30:09,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1735493.3333333333, ans=0.125 2023-10-04 17:30:14,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-10-04 17:30:17,296 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-10-04 17:30:17,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:30:17,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:30:17,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1735560.0, ans=0.0 2023-10-04 17:30:18,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:30:20,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:30:21,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-10-04 17:30:23,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:30:23,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:30:27,740 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:30:29,284 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-10-04 17:30:29,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:30:32,031 INFO [train.py:1046] (3/4) Epoch 50, batch 50, loss[loss=0.1389, simple_loss=0.2231, pruned_loss=0.02732, over 24495.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2343, pruned_loss=0.0354, over 1062526.29 frames. ], batch size: 63, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:30:33,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:30:34,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:30:36,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-10-04 17:30:36,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:30:37,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:30:39,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:30:40,825 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:30:42,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:30:45,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-10-04 17:30:46,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:30:52,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:30:54,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-10-04 17:30:56,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-10-04 17:30:57,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:30:59,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:30:59,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:31:00,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:31:00,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:31:00,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:31:00,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:31:07,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:31:09,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:31:09,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:31:10,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-10-04 17:31:12,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:31:13,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:31:13,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-10-04 17:31:13,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:31:15,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-10-04 17:31:19,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1735826.6666666667, ans=0.1 2023-10-04 17:31:21,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:31:21,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:31:24,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:31:24,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:31:26,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:31:28,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-10-04 17:31:28,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-10-04 17:31:31,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:31:31,673 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:31:33,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:31:34,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:31:34,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-10-04 17:31:34,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-10-04 17:31:35,916 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-10-04 17:31:37,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:31:38,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:31:38,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-10-04 17:31:38,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-10-04 17:31:39,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:31:39,947 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:31:41,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:31:41,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:31:42,197 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:31:45,911 INFO [train.py:1046] (3/4) Epoch 50, batch 100, loss[loss=0.1544, simple_loss=0.233, pruned_loss=0.03787, over 23840.00 frames. ], tot_loss[loss=0.1544, simple_loss=0.2363, pruned_loss=0.03621, over 1870753.78 frames. ], batch size: 195, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:31:45,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:31:48,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:31:50,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:31:52,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-10-04 17:31:52,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:31:55,785 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.084e+02 2.344e+02 2.870e+02 5.145e+02, threshold=4.687e+02, percent-clipped=1.0 2023-10-04 17:31:57,275 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:31:57,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:31:58,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:31:58,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:31:58,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:32:00,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-10-04 17:32:03,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:32:04,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:32:04,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:04,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:32:04,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1736026.6666666667, ans=0.0 2023-10-04 17:32:08,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-10-04 17:32:10,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:32:10,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:11,690 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:32:13,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:32:17,643 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-10-04 17:32:17,664 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-10-04 17:32:19,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1736093.3333333333, ans=0.125 2023-10-04 17:32:20,361 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:32:20,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:32:20,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1736093.3333333333, ans=0.125 2023-10-04 17:32:20,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1736093.3333333333, ans=0.125 2023-10-04 17:32:22,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1736093.3333333333, ans=0.0 2023-10-04 17:32:24,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1736093.3333333333, ans=0.2 2023-10-04 17:32:25,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:32:26,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:32:28,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:31,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:33,033 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-10-04 17:32:34,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-10-04 17:32:38,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:32:38,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:32:38,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1736160.0, ans=0.125 2023-10-04 17:32:42,908 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:44,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:32:45,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:32:46,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1736226.6666666667, ans=0.2 2023-10-04 17:32:49,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:32:50,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:50,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:32:53,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:32:53,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:32:53,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:32:54,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-10-04 17:32:55,344 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-10-04 17:32:55,356 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:32:55,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:32:55,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:32:55,503 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:32:56,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 17:32:56,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:32:56,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:32:56,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:32:58,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:33:00,113 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:00,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:33:00,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=1736293.3333333333, ans=0.125 2023-10-04 17:33:01,412 INFO [train.py:1046] (3/4) Epoch 50, batch 150, loss[loss=0.1991, simple_loss=0.266, pruned_loss=0.06607, over 19438.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2349, pruned_loss=0.03609, over 2503023.47 frames. ], batch size: 388, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:33:01,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:33:04,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:07,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:33:07,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:33:07,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:10,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:33:11,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:12,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:33:14,247 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:15,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1736360.0, ans=0.0 2023-10-04 17:33:17,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-10-04 17:33:17,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-10-04 17:33:17,079 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-10-04 17:33:19,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:33:19,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:33:20,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:33:21,805 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:33:21,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:33:23,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:24,971 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:33:26,355 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-10-04 17:33:27,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:33:30,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1736426.6666666667, ans=0.125 2023-10-04 17:33:35,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:33:39,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:33:41,245 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-10-04 17:33:45,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:33:45,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:33:45,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:33:46,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:33:48,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:33:49,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:33:51,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:52,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-10-04 17:33:55,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:33:57,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:33:57,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:33:57,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:34:00,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:34:01,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 17:34:03,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:34:05,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:34:06,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:34:07,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:34:09,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-10-04 17:34:09,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:34:09,189 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-10-04 17:34:10,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1736560.0, ans=0.0 2023-10-04 17:34:12,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:34:15,181 INFO [train.py:1046] (3/4) Epoch 50, batch 200, loss[loss=0.1347, simple_loss=0.2144, pruned_loss=0.02752, over 24584.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2342, pruned_loss=0.03553, over 3002540.72 frames. ], batch size: 60, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:34:16,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:34:16,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:34:19,499 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-10-04 17:34:19,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:34:21,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:34:24,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-10-04 17:34:25,500 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.066e+02 2.212e+02 2.512e+02 4.565e+02, threshold=4.424e+02, percent-clipped=0.0 2023-10-04 17:34:25,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-10-04 17:34:27,511 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.41 vs. limit=10.0 2023-10-04 17:34:28,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:34:29,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:34:32,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:34:32,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1736693.3333333333, ans=0.0 2023-10-04 17:34:34,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:34:34,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:34:53,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1736760.0, ans=0.125 2023-10-04 17:34:54,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:34:54,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:34:56,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:34:56,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:34:57,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 17:34:57,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:34:57,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:34:58,534 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.17 vs. limit=15.0 2023-10-04 17:34:59,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:35:00,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:35:00,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:35:01,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-10-04 17:35:03,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 17:35:03,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:35:07,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:35:12,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:35:15,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.00 vs. limit=6.0 2023-10-04 17:35:20,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:20,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:35:22,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1736893.3333333333, ans=0.0 2023-10-04 17:35:25,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1736893.3333333333, ans=0.125 2023-10-04 17:35:28,272 INFO [train.py:1046] (3/4) Epoch 50, batch 250, loss[loss=0.1557, simple_loss=0.235, pruned_loss=0.0382, over 24670.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2333, pruned_loss=0.03523, over 3383676.25 frames. ], batch size: 65, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:35:28,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:30,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-10-04 17:35:30,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:35:30,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:35:30,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:35:31,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:35:31,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-10-04 17:35:33,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:35:33,433 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-10-04 17:35:35,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:36,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:35:38,003 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:38,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:35:41,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:35:42,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:35:44,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:35:46,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:35:46,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1737026.6666666667, ans=0.125 2023-10-04 17:35:54,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1737026.6666666667, ans=0.1 2023-10-04 17:35:55,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:35:58,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:35:58,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:35:58,797 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:36:00,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1737093.3333333333, ans=0.125 2023-10-04 17:36:04,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.73 vs. limit=22.5 2023-10-04 17:36:05,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:36:05,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:36:06,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:36:07,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:36:07,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:36:07,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:36:08,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:36:11,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:36:13,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=1737160.0, ans=0.1 2023-10-04 17:36:14,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-10-04 17:36:14,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:36:16,315 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.75 vs. limit=15.0 2023-10-04 17:36:16,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:36:16,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:36:16,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:36:18,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:36:20,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:36:20,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:36:21,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:36:22,959 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:36:24,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:36:25,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:36:30,257 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:36:31,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:36:34,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1737226.6666666667, ans=0.125 2023-10-04 17:36:35,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:36:39,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:36:40,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:36:43,258 INFO [train.py:1046] (3/4) Epoch 50, batch 300, loss[loss=0.1494, simple_loss=0.2348, pruned_loss=0.03203, over 23354.00 frames. ], tot_loss[loss=0.151, simple_loss=0.2312, pruned_loss=0.03539, over 3658224.01 frames. ], batch size: 105, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:36:43,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-10-04 17:36:45,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:36:45,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 17:36:46,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-10-04 17:36:46,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:36:48,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:36:48,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-10-04 17:36:53,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:36:53,131 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:36:54,339 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.135e+02 2.414e+02 2.952e+02 4.730e+02, threshold=4.828e+02, percent-clipped=1.0 2023-10-04 17:36:57,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:36:57,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-10-04 17:36:58,607 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:36:58,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:36:58,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-10-04 17:36:58,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:37:00,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1737360.0, ans=0.125 2023-10-04 17:37:04,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:37:08,160 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:37:08,219 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-10-04 17:37:12,293 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-10-04 17:37:12,330 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:15,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:37:15,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:15,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-10-04 17:37:15,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:37:19,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:37:20,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=1737426.6666666667, ans=0.04949747468305833 2023-10-04 17:37:21,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:37:21,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=1737426.6666666667, ans=0.5 2023-10-04 17:37:21,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1737426.6666666667, ans=0.07 2023-10-04 17:37:23,024 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:37:23,902 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.93 vs. limit=10.0 2023-10-04 17:37:25,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-10-04 17:37:25,884 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-10-04 17:37:27,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:37:28,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:30,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-10-04 17:37:30,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:37:36,754 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:37:38,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:37:38,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-10-04 17:37:39,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1737493.3333333333, ans=0.125 2023-10-04 17:37:41,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:41,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:37:42,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1737560.0, ans=0.125 2023-10-04 17:37:44,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:45,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:37:45,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-10-04 17:37:45,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:37:47,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:37:49,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-10-04 17:37:50,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:37:50,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:37:51,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:37:51,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:37:53,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:37:58,826 INFO [train.py:1046] (3/4) Epoch 50, batch 350, loss[loss=0.1293, simple_loss=0.2075, pruned_loss=0.02551, over 24270.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2316, pruned_loss=0.03552, over 3905289.36 frames. ], batch size: 56, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:37:58,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:37:58,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 17:38:00,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:05,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1737626.6666666667, ans=0.2 2023-10-04 17:38:06,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:38:09,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:10,864 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:12,871 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-10-04 17:38:14,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:38:14,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-10-04 17:38:18,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:18,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-10-04 17:38:19,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:38:22,924 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-10-04 17:38:23,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:38:24,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:38:25,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:38:27,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:38:28,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:38:28,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:38:28,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:30,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:38:32,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:38:32,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:35,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1737760.0, ans=0.125 2023-10-04 17:38:39,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:38:39,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:38:41,125 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.90 vs. limit=22.5 2023-10-04 17:38:41,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:38:41,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:42,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=1737826.6666666667, ans=0.09899494936611666 2023-10-04 17:38:47,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-10-04 17:38:47,869 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:38:51,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:38:51,941 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:38:51,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:38:52,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1737826.6666666667, ans=0.125 2023-10-04 17:38:53,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-10-04 17:38:55,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:38:56,737 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-10-04 17:38:56,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1737893.3333333333, ans=0.125 2023-10-04 17:38:59,406 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-10-04 17:38:59,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:02,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:39:02,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-10-04 17:39:05,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:05,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1737893.3333333333, ans=0.125 2023-10-04 17:39:06,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:39:08,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:09,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:09,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:39:13,040 INFO [train.py:1046] (3/4) Epoch 50, batch 400, loss[loss=0.1265, simple_loss=0.2101, pruned_loss=0.02144, over 21651.00 frames. ], tot_loss[loss=0.1508, simple_loss=0.2311, pruned_loss=0.0352, over 4082634.08 frames. ], batch size: 47, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:39:13,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:39:15,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:39:17,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:39:19,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-10-04 17:39:19,189 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:19,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:39:20,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:39:20,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:23,326 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.850e+02 2.204e+02 2.503e+02 2.955e+02 6.313e+02, threshold=5.007e+02, percent-clipped=5.0 2023-10-04 17:39:23,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:24,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:25,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1737960.0, ans=0.125 2023-10-04 17:39:27,983 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-10-04 17:39:29,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-10-04 17:39:29,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:39:30,848 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-10-04 17:39:32,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:35,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:39:35,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:39:35,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-10-04 17:39:37,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:39:37,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:39:37,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:39:37,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:39:39,947 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-10-04 17:39:40,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-10-04 17:39:44,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:39:46,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:39:47,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-10-04 17:39:48,955 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-10-04 17:39:50,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:39:52,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:39:59,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-10-04 17:40:02,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:40:04,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-10-04 17:40:06,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:40:08,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:40:08,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-10-04 17:40:11,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:40:14,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:40:15,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:40:18,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:40:18,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-10-04 17:40:21,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:40:23,001 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-10-04 17:40:24,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:40:24,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:40:25,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-10-04 17:40:27,284 INFO [train.py:1046] (3/4) Epoch 50, batch 450, loss[loss=0.1457, simple_loss=0.2246, pruned_loss=0.0334, over 20972.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2322, pruned_loss=0.03519, over 4231183.55 frames. ], batch size: 46, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:40:27,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:40:27,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:40:27,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-10-04 17:40:28,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-10-04 17:40:30,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:40:30,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:40:30,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:40:32,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-10-04 17:40:32,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:40:34,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:40:36,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:40:46,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:40:46,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:40:49,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-10-04 17:40:51,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-10-04 17:40:53,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:40:54,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1738360.0, ans=0.125 2023-10-04 17:40:55,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:40:57,260 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:41:02,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:41:02,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:41:05,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-10-04 17:41:05,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-10-04 17:41:07,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-10-04 17:41:07,382 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:41:08,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:41:08,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:41:09,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1738426.6666666667, ans=0.0 2023-10-04 17:41:11,647 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-10-04 17:41:11,656 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-10-04 17:41:11,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:41:13,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:41:14,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-10-04 17:41:17,191 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.96 vs. limit=15.0 2023-10-04 17:41:17,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:41:17,775 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:41:19,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-10-04 17:41:19,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-10-04 17:41:22,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:41:23,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:41:25,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:41:26,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-10-04 17:41:30,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:41:30,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-10-04 17:41:30,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-10-04 17:41:32,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 17:41:37,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:41:38,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:41:40,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:41:40,247 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-10-04 17:41:41,554 INFO [train.py:1046] (3/4) Epoch 50, batch 500, loss[loss=0.1445, simple_loss=0.2255, pruned_loss=0.03172, over 24313.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2325, pruned_loss=0.03489, over 4346083.33 frames. ], batch size: 61, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:41:44,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:41:45,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:41:45,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:41:45,791 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-10-04 17:41:47,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-10-04 17:41:47,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:41:50,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:41:52,338 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.792e+02 2.023e+02 2.217e+02 2.721e+02 3.663e+02, threshold=4.434e+02, percent-clipped=0.0 2023-10-04 17:41:53,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.53 vs. limit=22.5 2023-10-04 17:41:56,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 17:41:57,966 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:41:59,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:41:59,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:42:00,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:02,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1738693.3333333333, ans=0.125 2023-10-04 17:42:07,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1738693.3333333333, ans=0.1 2023-10-04 17:42:11,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:42:11,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:42:13,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-10-04 17:42:13,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:42:13,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-10-04 17:42:13,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 17:42:17,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:42:17,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:42:17,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:42:18,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:42:20,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-10-04 17:42:23,453 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-10-04 17:42:26,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:42:27,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:28,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:28,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:29,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:42:31,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-10-04 17:42:35,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:42:36,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:42:39,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:42:43,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:42:48,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:42:51,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-10-04 17:42:52,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:42:52,816 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:42:54,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-10-04 17:42:56,136 INFO [train.py:1046] (3/4) Epoch 50, batch 550, loss[loss=0.148, simple_loss=0.2336, pruned_loss=0.03122, over 24459.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2333, pruned_loss=0.03537, over 4428530.18 frames. ], batch size: 63, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:42:56,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-10-04 17:42:57,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:43:02,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-10-04 17:43:04,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-10-04 17:43:04,963 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:43:04,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-10-04 17:43:05,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:43:05,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:43:06,353 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:07,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:07,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:43:07,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:43:10,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:43:12,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-10-04 17:43:12,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:43:16,948 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:16,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:17,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1739026.6666666667, ans=0.125 2023-10-04 17:43:19,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:43:19,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:23,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-10-04 17:43:24,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-10-04 17:43:25,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1739093.3333333333, ans=0.1 2023-10-04 17:43:27,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:43:30,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:43:30,386 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:43:31,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:43:34,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:34,576 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-10-04 17:43:35,928 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:43:37,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 17:43:41,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:43:42,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 17:43:42,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:43:43,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:44,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-10-04 17:43:46,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-10-04 17:43:46,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:43:46,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:43:48,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:43:48,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:43:49,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:43:50,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:43:53,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:43:53,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:43:53,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 17:43:56,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:43:56,450 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:43:57,781 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:43:57,839 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:44:00,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-10-04 17:44:00,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-10-04 17:44:04,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-10-04 17:44:09,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-10-04 17:44:10,499 INFO [train.py:1046] (3/4) Epoch 50, batch 600, loss[loss=0.1471, simple_loss=0.2265, pruned_loss=0.03386, over 24464.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2339, pruned_loss=0.03603, over 4481017.29 frames. ], batch size: 63, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:44:10,601 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:44:10,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:44:11,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:44:17,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:44:19,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 17:44:19,397 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-10-04 17:44:22,522 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.822e+02 2.089e+02 2.281e+02 2.529e+02 5.225e+02, threshold=4.562e+02, percent-clipped=2.0 2023-10-04 17:44:22,652 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:44:24,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:44:27,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:44:28,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-10-04 17:44:28,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:44:35,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-10-04 17:44:38,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.50 vs. limit=10.0 2023-10-04 17:44:39,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:44:39,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:44:39,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:44:39,509 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.22 vs. limit=22.5 2023-10-04 17:44:43,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=1739426.6666666667, ans=0.2 2023-10-04 17:44:44,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:44:44,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:44:46,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:44:53,866 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:44:58,514 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:44:59,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:44:59,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:45:06,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-10-04 17:45:11,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:45:11,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:45:15,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-10-04 17:45:15,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:45:17,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-10-04 17:45:17,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1739560.0, ans=0.1 2023-10-04 17:45:19,034 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:45:19,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:45:25,063 INFO [train.py:1046] (3/4) Epoch 50, batch 650, loss[loss=0.1391, simple_loss=0.1974, pruned_loss=0.04046, over 19426.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2331, pruned_loss=0.03579, over 4524371.05 frames. ], batch size: 389, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:45:25,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 17:45:26,563 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-10-04 17:45:28,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:45:29,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:45:30,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:45:34,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-10-04 17:45:35,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:45:39,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:45:39,147 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:45:42,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:45:42,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1739693.3333333333, ans=0.2 2023-10-04 17:45:47,314 WARNING [train.py:1204] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-10-04 17:45:48,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:45:50,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:45:53,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:45:53,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 17:45:55,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:45:56,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:45:57,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:45:59,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:00,393 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:46:01,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1739760.0, ans=0.0 2023-10-04 17:46:03,021 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:46:03,036 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-10-04 17:46:03,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:46:03,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:46:04,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:05,916 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:46:07,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:46:08,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:46:09,799 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-10-04 17:46:09,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:46:09,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:46:11,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:46:11,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:46:12,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 17:46:14,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-10-04 17:46:15,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-10-04 17:46:15,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:15,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:46:17,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:46:17,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:46:19,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:46:22,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:24,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:46:24,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:46:27,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:46:27,520 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:46:28,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 17:46:28,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:46:35,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:46:35,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:46:35,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:46:35,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:46:38,250 INFO [train.py:1046] (3/4) Epoch 50, batch 700, loss[loss=0.1534, simple_loss=0.2339, pruned_loss=0.03647, over 24481.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.231, pruned_loss=0.0357, over 4548301.06 frames. ], batch size: 63, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:46:39,723 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-10-04 17:46:41,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-10-04 17:46:44,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-10-04 17:46:44,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:45,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:46:48,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-10-04 17:46:50,807 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.795e+02 2.097e+02 2.406e+02 2.781e+02 3.851e+02, threshold=4.811e+02, percent-clipped=0.0 2023-10-04 17:46:54,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:46:55,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:46:57,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:46:59,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:46:59,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:47:02,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:47:04,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1740026.6666666667, ans=0.0 2023-10-04 17:47:05,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 17:47:05,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:47:06,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-10-04 17:47:09,303 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-10-04 17:47:12,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:47:13,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:47:15,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:47:18,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:47:20,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-10-04 17:47:24,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:47:25,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:47:26,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-10-04 17:47:27,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:47:30,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:47:31,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:47:35,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:47:37,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-10-04 17:47:40,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-10-04 17:47:40,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-10-04 17:47:41,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:47:43,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:47:44,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:47:46,213 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:47:46,227 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-10-04 17:47:46,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.94 vs. limit=15.0 2023-10-04 17:47:50,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-10-04 17:47:51,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-10-04 17:47:51,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-10-04 17:47:51,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1740293.3333333333, ans=0.0 2023-10-04 17:47:52,328 INFO [train.py:1046] (3/4) Epoch 50, batch 750, loss[loss=0.1425, simple_loss=0.231, pruned_loss=0.02703, over 24457.00 frames. ], tot_loss[loss=0.151, simple_loss=0.231, pruned_loss=0.03551, over 4596935.47 frames. ], batch size: 66, lr: 2.04e-03, grad_scale: 8.0 2023-10-04 17:47:52,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-10-04 17:47:53,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-10-04 17:47:53,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:47:55,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-10-04 17:47:55,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:47:56,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:47:57,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:47:59,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:47:59,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:47:59,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1740293.3333333333, ans=0.0 2023-10-04 17:48:00,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:48:03,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:48:04,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:48:06,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:48:08,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:48:09,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1740360.0, ans=0.0 2023-10-04 17:48:10,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:48:11,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-10-04 17:48:11,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:48:14,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:48:16,312 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:48:17,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-10-04 17:48:17,803 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-10-04 17:48:17,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:48:20,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1740426.6666666667, ans=0.0 2023-10-04 17:48:20,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1740426.6666666667, ans=0.125 2023-10-04 17:48:22,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-10-04 17:48:22,534 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-10-04 17:48:23,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-10-04 17:48:23,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:48:23,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 17:48:26,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:48:26,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1740426.6666666667, ans=0.125 2023-10-04 17:48:30,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:48:30,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:48:30,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:48:33,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:48:34,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:48:34,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-10-04 17:48:34,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:48:35,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1740493.3333333333, ans=0.125 2023-10-04 17:48:36,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-10-04 17:48:37,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:48:40,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:48:41,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-10-04 17:48:41,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:48:46,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:48:47,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:48:47,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:48:51,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:48:54,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-10-04 17:48:55,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:48:55,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:48:56,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:48:58,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:49:00,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:49:01,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-10-04 17:49:05,283 INFO [train.py:1046] (3/4) Epoch 50, batch 800, loss[loss=0.1331, simple_loss=0.219, pruned_loss=0.02355, over 24317.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2324, pruned_loss=0.03563, over 4632601.73 frames. ], batch size: 61, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:49:07,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=1740626.6666666667, ans=0.125 2023-10-04 17:49:09,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:49:09,501 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:10,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:49:10,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:49:12,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:12,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:14,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:16,755 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.769e+02 2.060e+02 2.355e+02 2.799e+02 5.307e+02, threshold=4.709e+02, percent-clipped=2.0 2023-10-04 17:49:20,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:49:20,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:49:23,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-10-04 17:49:24,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:26,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:49:26,249 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:49:26,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:49:27,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-10-04 17:49:27,680 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:49:28,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-10-04 17:49:30,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:30,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1740693.3333333333, ans=0.1 2023-10-04 17:49:33,260 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:49:35,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:49:35,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:49:38,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:38,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:49:43,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:49:43,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.66 vs. limit=12.0 2023-10-04 17:49:44,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:49:44,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-10-04 17:49:46,006 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-10-04 17:49:46,040 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-10-04 17:49:46,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=1740760.0, ans=15.0 2023-10-04 17:49:47,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 17:49:47,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:49:48,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:49:48,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:49:51,778 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-10-04 17:49:53,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-10-04 17:49:55,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:49:57,116 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:49:58,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 17:50:01,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:50:05,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:50:07,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-10-04 17:50:07,149 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:50:10,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-10-04 17:50:15,037 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:50:17,757 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:50:17,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-10-04 17:50:17,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:50:19,244 INFO [train.py:1046] (3/4) Epoch 50, batch 850, loss[loss=0.1526, simple_loss=0.2418, pruned_loss=0.03171, over 24315.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2333, pruned_loss=0.03584, over 4647166.72 frames. ], batch size: 74, lr: 2.04e-03, grad_scale: 16.0 2023-10-04 17:50:19,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:50:20,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-10-04 17:50:20,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:50:22,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:50:22,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:24,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:50:25,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:50:28,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-10-04 17:50:28,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-10-04 17:50:28,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-10-04 17:50:30,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1740960.0, ans=0.125 2023-10-04 17:50:31,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:50:31,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:50:33,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:33,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:50:34,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:50:35,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1741026.6666666667, ans=0.125 2023-10-04 17:50:39,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:50:39,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=1741026.6666666667, ans=0.0 2023-10-04 17:50:40,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:50:40,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-10-04 17:50:43,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-10-04 17:50:45,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1741026.6666666667, ans=0.125 2023-10-04 17:50:46,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:50:47,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-10-04 17:50:50,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-10-04 17:50:51,297 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.51 vs. limit=6.0 2023-10-04 17:50:51,950 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-10-04 17:50:54,679 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-10-04 17:50:55,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:50:55,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:50:55,878 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 17:50:57,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:57,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:50:58,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-10-04 17:51:01,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 17:51:01,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:51:02,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:51:02,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-10-04 17:51:04,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 17:51:05,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-10-04 17:51:05,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-10-04 17:51:05,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1741160.0, ans=0.0 2023-10-04 17:51:10,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:51:10,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:51:10,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:51:10,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:51:12,084 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:51:15,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:51:16,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-10-04 17:51:19,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:51:20,032 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=13.60 vs. limit=15.0 2023-10-04 17:51:20,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:51:20,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-10-04 17:51:28,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-10-04 17:51:28,855 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.54 vs. limit=22.5 2023-10-04 17:51:30,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:51:30,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-10-04 17:51:30,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:51:30,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:51:32,663 INFO [train.py:1046] (3/4) Epoch 50, batch 900, loss[loss=0.1587, simple_loss=0.2332, pruned_loss=0.04208, over 23775.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2338, pruned_loss=0.03597, over 4670025.32 frames. ], batch size: 179, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:51:32,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-10-04 17:51:38,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:51:41,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:51:41,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-10-04 17:51:42,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:51:42,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-10-04 17:51:44,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-10-04 17:51:44,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:51:44,544 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:51:45,743 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.794e+02 2.220e+02 2.526e+02 3.144e+02 5.127e+02, threshold=5.052e+02, percent-clipped=1.0 2023-10-04 17:51:45,824 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 17:51:45,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:51:51,609 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=4.59 vs. limit=15.0 2023-10-04 17:51:54,355 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:51:56,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:51:56,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:51:56,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:52:01,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:52:04,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.52 vs. limit=15.0 2023-10-04 17:52:04,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1741426.6666666667, ans=0.125 2023-10-04 17:52:07,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-10-04 17:52:07,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:52:12,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:52:14,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-10-04 17:52:16,094 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-10-04 17:52:16,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-10-04 17:52:20,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-10-04 17:52:20,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:52:21,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 17:52:28,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:52:28,087 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:52:29,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-10-04 17:52:30,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:52:33,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-10-04 17:52:35,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:52:35,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:52:37,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:52:38,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:52:40,858 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.51 vs. limit=6.0 2023-10-04 17:52:41,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=1741560.0, ans=0.5 2023-10-04 17:52:42,952 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-10-04 17:52:42,988 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-10-04 17:52:43,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1741560.0, ans=0.1 2023-10-04 17:52:44,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-10-04 17:52:44,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-10-04 17:52:45,557 INFO [train.py:1046] (3/4) Epoch 50, batch 950, loss[loss=0.1377, simple_loss=0.2095, pruned_loss=0.03297, over 22712.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2345, pruned_loss=0.03613, over 4683864.55 frames. ], batch size: 322, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:52:47,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:52:49,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.72 vs. limit=15.0 2023-10-04 17:52:50,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-10-04 17:52:54,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:52:56,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:52:57,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:52:57,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 17:52:59,362 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-10-04 17:53:03,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:53:05,425 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:53:06,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:53:06,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:53:06,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-10-04 17:53:08,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-10-04 17:53:09,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:12,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-10-04 17:53:12,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:53:12,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1741693.3333333333, ans=0.1 2023-10-04 17:53:15,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:15,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:53:15,274 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:53:18,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-10-04 17:53:19,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 17:53:22,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:53:22,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:53:27,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:53:27,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:53:30,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-10-04 17:53:31,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 17:53:31,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 17:53:32,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:53:34,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:34,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 17:53:35,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1741826.6666666667, ans=0.1 2023-10-04 17:53:37,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-10-04 17:53:37,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:53:40,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:53:40,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:53:40,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-10-04 17:53:41,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:53:41,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:53:41,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-10-04 17:53:46,718 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:53:49,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:53:53,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:53:56,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-10-04 17:53:56,342 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-10-04 17:53:58,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=1741960.0, ans=0.025 2023-10-04 17:53:59,222 INFO [train.py:1046] (3/4) Epoch 50, batch 1000, loss[loss=0.1521, simple_loss=0.2438, pruned_loss=0.03021, over 24623.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2336, pruned_loss=0.03593, over 4695545.77 frames. ], batch size: 68, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:53:59,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:54:03,346 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-10-04 17:54:03,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:09,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:54:10,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-10-04 17:54:10,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-10-04 17:54:13,322 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.769e+02 2.086e+02 2.281e+02 2.987e+02 4.437e+02, threshold=4.562e+02, percent-clipped=0.0 2023-10-04 17:54:16,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:54:16,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:54:18,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:54:18,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1742026.6666666667, ans=0.0 2023-10-04 17:54:20,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-10-04 17:54:23,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-10-04 17:54:26,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-10-04 17:54:26,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:54:27,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-10-04 17:54:29,331 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-10-04 17:54:29,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-10-04 17:54:31,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:54:32,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:40,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:54:41,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:54:41,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:42,820 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:54:42,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-10-04 17:54:42,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:54:44,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 17:54:44,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:54:44,288 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-10-04 17:54:47,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-10-04 17:54:47,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-10-04 17:54:48,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-10-04 17:54:51,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:54:57,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:57,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-10-04 17:54:57,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:54:58,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:55:01,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-10-04 17:55:03,731 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:55:05,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-10-04 17:55:05,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-10-04 17:55:05,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:55:05,259 WARNING [train.py:1204] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:55:09,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:55:11,291 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:55:12,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:55:13,947 INFO [train.py:1046] (3/4) Epoch 50, batch 1050, loss[loss=0.159, simple_loss=0.2395, pruned_loss=0.03928, over 23494.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2324, pruned_loss=0.03548, over 4710959.89 frames. ], batch size: 134, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:55:15,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:55:17,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:55:18,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 17:55:18,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:55:21,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:55:23,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 17:55:23,148 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-10-04 17:55:27,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:55:28,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:55:28,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-10-04 17:55:30,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-10-04 17:55:30,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-10-04 17:55:32,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:55:33,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-10-04 17:55:36,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:55:36,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-10-04 17:55:36,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-10-04 17:55:42,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:55:43,722 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:55:44,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:55:46,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-10-04 17:55:46,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-10-04 17:55:46,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 17:55:51,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-10-04 17:55:53,499 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.86 vs. limit=6.0 2023-10-04 17:55:54,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1742426.6666666667, ans=0.125 2023-10-04 17:55:55,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-10-04 17:55:55,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:55:58,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 17:55:59,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-10-04 17:55:59,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:56:00,808 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:56:04,420 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-10-04 17:56:07,036 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-10-04 17:56:07,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-10-04 17:56:07,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=15.60 vs. limit=15.0 2023-10-04 17:56:08,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-10-04 17:56:08,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:56:08,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 17:56:10,491 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-10-04 17:56:13,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:56:14,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-10-04 17:56:14,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:56:14,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1742560.0, ans=0.125 2023-10-04 17:56:16,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:56:16,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:56:20,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:56:20,692 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-10-04 17:56:20,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-10-04 17:56:20,826 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-10-04 17:56:22,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-10-04 17:56:22,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1742560.0, ans=0.125 2023-10-04 17:56:23,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:56:26,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:56:27,631 INFO [train.py:1046] (3/4) Epoch 50, batch 1100, loss[loss=0.1557, simple_loss=0.2491, pruned_loss=0.03121, over 24312.00 frames. ], tot_loss[loss=0.1504, simple_loss=0.2308, pruned_loss=0.03506, over 4698084.48 frames. ], batch size: 74, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:56:30,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:56:36,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 17:56:38,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 17:56:39,567 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:56:39,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-10-04 17:56:39,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:56:41,301 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.164e+02 2.522e+02 3.230e+02 4.448e+02, threshold=5.045e+02, percent-clipped=0.0 2023-10-04 17:56:43,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-10-04 17:56:45,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:56:48,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 17:56:48,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-10-04 17:56:49,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 17:56:51,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:56:51,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 17:56:54,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:56:55,917 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-10-04 17:57:00,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:57:03,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-10-04 17:57:05,251 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-10-04 17:57:05,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:08,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:09,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:57:09,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=1742760.0, ans=0.0 2023-10-04 17:57:10,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:57:10,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-10-04 17:57:10,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:57:12,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-10-04 17:57:12,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:57:12,805 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.43 vs. limit=15.0 2023-10-04 17:57:13,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:13,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-10-04 17:57:18,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-10-04 17:57:18,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-10-04 17:57:19,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 17:57:24,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 17:57:26,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-10-04 17:57:26,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-10-04 17:57:27,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:57:29,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:57:29,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:57:32,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-10-04 17:57:32,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 17:57:33,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:57:35,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-10-04 17:57:35,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-10-04 17:57:35,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-10-04 17:57:36,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:57:36,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:57:39,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:57:41,899 INFO [train.py:1046] (3/4) Epoch 50, batch 1150, loss[loss=0.1346, simple_loss=0.214, pruned_loss=0.02759, over 24443.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2315, pruned_loss=0.03541, over 4685145.04 frames. ], batch size: 58, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 17:57:44,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:57:48,048 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-10-04 17:57:48,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1742960.0, ans=0.125 2023-10-04 17:57:49,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:57:49,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:57:51,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-10-04 17:57:51,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:57:53,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1742960.0, ans=0.0 2023-10-04 17:57:54,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-10-04 17:57:55,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:57:56,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:58:01,631 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-10-04 17:58:03,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1743026.6666666667, ans=0.125 2023-10-04 17:58:04,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:58:07,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:58:08,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1743026.6666666667, ans=0.0 2023-10-04 17:58:09,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:58:10,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-10-04 17:58:10,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-10-04 17:58:10,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:58:14,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-10-04 17:58:16,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:58:17,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:58:18,409 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.48 vs. limit=15.0 2023-10-04 17:58:25,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:58:31,319 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 17:58:31,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-10-04 17:58:32,669 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:58:32,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:58:38,718 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-10-04 17:58:41,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:58:41,688 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 17:58:43,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=1743226.6666666667, ans=0.0 2023-10-04 17:58:46,997 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-10-04 17:58:49,859 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:58:51,222 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-10-04 17:58:51,255 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-10-04 17:58:53,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 17:58:56,185 INFO [train.py:1046] (3/4) Epoch 50, batch 1200, loss[loss=0.1497, simple_loss=0.2326, pruned_loss=0.03342, over 24492.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2319, pruned_loss=0.03558, over 4700016.54 frames. ], batch size: 66, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 17:58:57,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:59:02,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-10-04 17:59:02,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-10-04 17:59:05,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:59:05,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:59:06,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-10-04 17:59:07,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 17:59:09,092 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.691e+02 2.143e+02 2.355e+02 2.830e+02 4.818e+02, threshold=4.710e+02, percent-clipped=0.0 2023-10-04 17:59:09,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 17:59:11,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:59:11,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:59:12,779 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-10-04 17:59:15,557 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-10-04 17:59:19,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 17:59:22,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 17:59:23,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:59:25,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 17:59:25,195 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-10-04 17:59:27,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:59:28,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1743426.6666666667, ans=0.0 2023-10-04 17:59:29,339 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.74 vs. limit=15.0 2023-10-04 17:59:36,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-10-04 17:59:36,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-10-04 17:59:36,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-10-04 17:59:36,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 17:59:36,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=1743426.6666666667, ans=0.125 2023-10-04 17:59:41,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-10-04 17:59:41,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1743493.3333333333, ans=0.2 2023-10-04 17:59:45,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-10-04 17:59:45,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 17:59:46,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 17:59:48,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=1743493.3333333333, ans=0.0 2023-10-04 17:59:49,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:59:49,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-10-04 17:59:50,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-10-04 17:59:50,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-10-04 17:59:52,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-10-04 17:59:52,217 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-10-04 17:59:53,548 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 17:59:53,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-10-04 17:59:53,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 17:59:56,461 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 17:59:56,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-10-04 17:59:59,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:00:01,111 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:00:04,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-10-04 18:00:04,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=1743560.0, ans=0.0 2023-10-04 18:00:04,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=1743560.0, ans=0.2 2023-10-04 18:00:07,779 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-10-04 18:00:09,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:00:10,953 INFO [train.py:1046] (3/4) Epoch 50, batch 1250, loss[loss=0.1716, simple_loss=0.2515, pruned_loss=0.04581, over 23675.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2331, pruned_loss=0.03579, over 4706524.35 frames. ], batch size: 232, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:00:12,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:00:13,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:00:15,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:00:16,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-10-04 18:00:18,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1743626.6666666667, ans=0.2 2023-10-04 18:00:20,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:00:20,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:00:22,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-10-04 18:00:24,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:00:26,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:00:29,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1743693.3333333333, ans=0.0 2023-10-04 18:00:30,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:00:31,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:00:32,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:00:32,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:00:33,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1743693.3333333333, ans=0.1 2023-10-04 18:00:35,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:00:35,763 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:00:38,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:00:38,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:00:38,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:00:38,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1743760.0, ans=0.125 2023-10-04 18:00:40,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:00:41,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:00:43,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=1743760.0, ans=0.125 2023-10-04 18:00:44,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:00:44,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1743760.0, ans=0.2 2023-10-04 18:00:46,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:00:51,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-10-04 18:00:51,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:00:54,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:00:54,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-10-04 18:00:54,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:00:54,144 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-10-04 18:00:54,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:00:54,323 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:00:55,580 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:00:58,452 WARNING [train.py:1204] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:01:00,096 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:01:01,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:01:01,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:01:03,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-10-04 18:01:03,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-10-04 18:01:03,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-10-04 18:01:05,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:01:09,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-10-04 18:01:09,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:01:12,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-10-04 18:01:12,528 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:01:13,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-10-04 18:01:13,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:01:14,001 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:01:14,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:01:14,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:01:16,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-10-04 18:01:19,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:01:21,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:01:22,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:01:22,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1743960.0, ans=0.125 2023-10-04 18:01:23,777 INFO [train.py:1046] (3/4) Epoch 50, batch 1300, loss[loss=0.147, simple_loss=0.237, pruned_loss=0.0285, over 24668.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2335, pruned_loss=0.03577, over 4709278.54 frames. ], batch size: 65, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:01:25,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:01:27,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:01:27,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-10-04 18:01:32,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:01:33,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.05 vs. limit=15.0 2023-10-04 18:01:34,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:01:34,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:01:35,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:01:37,116 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.751e+02 2.152e+02 2.609e+02 3.024e+02 4.878e+02, threshold=5.218e+02, percent-clipped=1.0 2023-10-04 18:01:37,221 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:01:38,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-10-04 18:01:39,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1744026.6666666667, ans=0.95 2023-10-04 18:01:41,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.39 vs. limit=12.0 2023-10-04 18:01:43,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:01:46,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:01:47,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-10-04 18:01:50,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:01:53,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:01:54,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:01:56,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:01:56,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:01:57,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:01:57,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:01:57,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-10-04 18:02:01,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1744093.3333333333, ans=0.125 2023-10-04 18:02:04,304 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:02:04,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:02:05,746 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-10-04 18:02:07,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:02:08,434 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:02:10,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:02:11,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-10-04 18:02:13,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:02:13,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-10-04 18:02:15,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:02:19,177 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:02:19,186 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:02:23,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-10-04 18:02:24,522 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-10-04 18:02:24,610 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-10-04 18:02:29,292 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:02:29,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=1744226.6666666667, ans=0.05 2023-10-04 18:02:32,019 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-10-04 18:02:32,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:02:34,718 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.50 vs. limit=22.5 2023-10-04 18:02:35,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1744226.6666666667, ans=0.05 2023-10-04 18:02:38,143 INFO [train.py:1046] (3/4) Epoch 50, batch 1350, loss[loss=0.1344, simple_loss=0.2094, pruned_loss=0.02972, over 20198.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2336, pruned_loss=0.03532, over 4717792.42 frames. ], batch size: 44, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:02:39,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-10-04 18:02:42,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:02:44,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:02:47,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:02:47,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:02:47,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:02:48,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:02:50,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=1744293.3333333333, ans=0.2 2023-10-04 18:02:51,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1744360.0, ans=0.0 2023-10-04 18:02:52,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.97 vs. limit=12.0 2023-10-04 18:02:52,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:02:53,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-10-04 18:02:55,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:02:55,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:02:59,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-10-04 18:03:01,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:03:01,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:03:03,113 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-10-04 18:03:04,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-10-04 18:03:05,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-10-04 18:03:07,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:03:07,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-10-04 18:03:12,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1744426.6666666667, ans=0.125 2023-10-04 18:03:15,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1744426.6666666667, ans=0.125 2023-10-04 18:03:17,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:03:26,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:03:28,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:03:28,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-10-04 18:03:29,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:03:31,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-10-04 18:03:31,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:03:33,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:03:35,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:03:37,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-10-04 18:03:38,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:03:44,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-10-04 18:03:47,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-10-04 18:03:49,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1744560.0, ans=0.0 2023-10-04 18:03:51,535 INFO [train.py:1046] (3/4) Epoch 50, batch 1400, loss[loss=0.1607, simple_loss=0.2401, pruned_loss=0.0407, over 24046.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2333, pruned_loss=0.03529, over 4721527.64 frames. ], batch size: 80, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:03:52,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-10-04 18:03:53,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:03:54,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:03:54,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1744626.6666666667, ans=0.125 2023-10-04 18:03:55,137 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.33 vs. limit=15.0 2023-10-04 18:03:56,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:03:59,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-10-04 18:04:01,230 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-10-04 18:04:05,072 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.179e+02 2.424e+02 2.840e+02 4.477e+02, threshold=4.849e+02, percent-clipped=0.0 2023-10-04 18:04:09,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:04:11,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:04:14,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:04:14,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:04:16,855 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:04:17,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1744693.3333333333, ans=0.125 2023-10-04 18:04:18,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-10-04 18:04:29,710 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:04:29,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:04:34,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-10-04 18:04:35,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:04:35,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:04:37,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:04:37,333 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:04:38,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:04:38,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:04:40,549 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:04:42,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1744826.6666666667, ans=0.125 2023-10-04 18:04:43,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-10-04 18:04:43,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:04:46,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:04:50,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:04:56,499 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-10-04 18:04:57,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 18:05:00,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:05:02,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 18:05:02,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:05:02,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=1744893.3333333333, ans=0.0 2023-10-04 18:05:05,577 INFO [train.py:1046] (3/4) Epoch 50, batch 1450, loss[loss=0.1437, simple_loss=0.2249, pruned_loss=0.03122, over 24498.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2331, pruned_loss=0.03538, over 4722999.93 frames. ], batch size: 63, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:05:05,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:05:07,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:05:08,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:05:08,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:08,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-10-04 18:05:13,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:05:15,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:05:15,985 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.17 vs. limit=15.0 2023-10-04 18:05:16,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:05:16,713 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-10-04 18:05:18,199 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:05:18,292 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-10-04 18:05:18,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:19,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:19,679 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-10-04 18:05:19,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1745026.6666666667, ans=0.125 2023-10-04 18:05:21,052 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:05:22,354 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:05:23,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 18:05:23,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:25,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:05:25,138 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:27,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:30,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:05:30,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:05:33,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:05:33,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:35,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:05:35,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:05:35,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:05:36,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:05:42,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-10-04 18:05:45,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:05:47,988 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-10-04 18:05:48,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1745093.3333333333, ans=0.2 2023-10-04 18:05:50,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:05:52,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:05:53,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:05:55,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-10-04 18:05:59,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:00,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-10-04 18:06:02,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-10-04 18:06:03,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:06:06,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:06:06,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:06:09,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-10-04 18:06:11,870 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.45 vs. limit=22.5 2023-10-04 18:06:12,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-10-04 18:06:13,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-10-04 18:06:15,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:15,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:06:20,488 INFO [train.py:1046] (3/4) Epoch 50, batch 1500, loss[loss=0.166, simple_loss=0.2388, pruned_loss=0.04661, over 23698.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2336, pruned_loss=0.03576, over 4722329.81 frames. ], batch size: 179, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:06:20,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1745293.3333333333, ans=0.125 2023-10-04 18:06:23,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-10-04 18:06:23,617 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:06:23,620 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:06:24,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:06:24,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:06:26,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:06:26,421 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-10-04 18:06:27,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:06:29,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:06:29,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:06:30,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:06:33,288 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.778e+02 2.060e+02 2.266e+02 2.721e+02 4.158e+02, threshold=4.532e+02, percent-clipped=0.0 2023-10-04 18:06:33,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:06:34,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:06:40,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:06:40,224 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-10-04 18:06:41,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:06:41,602 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:06:42,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:46,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-10-04 18:06:49,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-10-04 18:06:50,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:06:52,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-10-04 18:06:53,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1745426.6666666667, ans=0.2 2023-10-04 18:06:54,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:06:56,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:06:57,668 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:06:57,684 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:06:59,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-10-04 18:07:00,408 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:07:00,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:07:00,485 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-10-04 18:07:01,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:07:04,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:07:04,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-10-04 18:07:06,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1745493.3333333333, ans=0.0 2023-10-04 18:07:09,313 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:07:12,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:07:15,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1745493.3333333333, ans=0.0 2023-10-04 18:07:17,504 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-10-04 18:07:17,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:17,561 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-10-04 18:07:17,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1745493.3333333333, ans=0.0 2023-10-04 18:07:18,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:07:20,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:07:20,415 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-10-04 18:07:21,727 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:07:24,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-10-04 18:07:25,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:27,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:07:27,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:28,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:07:28,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:07:28,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:07:30,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-10-04 18:07:31,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-10-04 18:07:31,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:07:32,920 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-10-04 18:07:32,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-10-04 18:07:34,227 INFO [train.py:1046] (3/4) Epoch 50, batch 1550, loss[loss=0.1387, simple_loss=0.2256, pruned_loss=0.02595, over 24486.00 frames. ], tot_loss[loss=0.1528, simple_loss=0.234, pruned_loss=0.03586, over 4723803.95 frames. ], batch size: 66, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:07:35,565 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:07:37,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:07:37,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:07:38,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:07:38,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:07:40,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:07:45,777 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-10-04 18:07:45,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:07:47,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:07:47,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:07:48,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:07:48,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-10-04 18:07:51,462 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:07:51,493 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-10-04 18:07:52,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-10-04 18:07:52,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-10-04 18:07:52,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:07:54,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:07:55,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1745693.3333333333, ans=0.125 2023-10-04 18:07:57,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:07:59,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-10-04 18:07:59,758 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-10-04 18:08:02,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1745760.0, ans=0.0 2023-10-04 18:08:08,224 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:08:12,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:08:12,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:08:13,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:08:13,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-10-04 18:08:18,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=1745826.6666666667, ans=0.125 2023-10-04 18:08:19,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:08:21,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:08:23,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:08:25,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:08:26,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:08:26,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-10-04 18:08:26,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:08:26,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:08:28,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:08:28,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-10-04 18:08:29,311 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-10-04 18:08:32,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:08:33,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1745893.3333333333, ans=0.1 2023-10-04 18:08:36,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-10-04 18:08:37,114 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.43 vs. limit=6.0 2023-10-04 18:08:38,317 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.18 vs. limit=12.0 2023-10-04 18:08:42,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:08:45,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:08:45,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-10-04 18:08:45,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1745893.3333333333, ans=0.1 2023-10-04 18:08:49,082 INFO [train.py:1046] (3/4) Epoch 50, batch 1600, loss[loss=0.1391, simple_loss=0.224, pruned_loss=0.02711, over 24631.00 frames. ], tot_loss[loss=0.1531, simple_loss=0.234, pruned_loss=0.03605, over 4732973.19 frames. ], batch size: 60, lr: 2.03e-03, grad_scale: 32.0 2023-10-04 18:08:49,128 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:08:49,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:08:49,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:08:51,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:08:52,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:08:54,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:08:55,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-10-04 18:08:56,772 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-10-04 18:08:59,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-10-04 18:09:00,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.85 vs. limit=15.0 2023-10-04 18:09:00,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:09:02,118 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.717e+02 2.102e+02 2.370e+02 2.617e+02 3.812e+02, threshold=4.739e+02, percent-clipped=0.0 2023-10-04 18:09:02,271 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-10-04 18:09:02,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1746026.6666666667, ans=0.0 2023-10-04 18:09:03,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:09:04,313 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.83 vs. limit=15.0 2023-10-04 18:09:06,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:09:10,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1746026.6666666667, ans=0.125 2023-10-04 18:09:11,268 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.85 vs. limit=12.0 2023-10-04 18:09:11,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:09:13,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1746026.6666666667, ans=0.0 2023-10-04 18:09:14,869 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-10-04 18:09:15,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1746026.6666666667, ans=0.125 2023-10-04 18:09:16,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:09:18,197 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-10-04 18:09:19,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:20,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-10-04 18:09:26,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-10-04 18:09:32,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:09:32,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-10-04 18:09:33,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:09:33,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:09:33,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:09:34,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-10-04 18:09:38,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:09:40,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:09:40,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:41,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:43,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:09:44,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:09:46,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:09:46,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:09:49,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1746226.6666666667, ans=0.125 2023-10-04 18:09:49,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1746226.6666666667, ans=0.0 2023-10-04 18:09:52,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:09:52,728 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:09:54,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-10-04 18:09:54,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:09:56,087 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.87 vs. limit=10.0 2023-10-04 18:09:56,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-10-04 18:10:02,473 INFO [train.py:1046] (3/4) Epoch 50, batch 1650, loss[loss=0.1553, simple_loss=0.248, pruned_loss=0.03134, over 24449.00 frames. ], tot_loss[loss=0.1537, simple_loss=0.2349, pruned_loss=0.03629, over 4732853.50 frames. ], batch size: 69, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:10:02,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:10:03,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:10:05,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:10:05,365 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-10-04 18:10:05,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-10-04 18:10:05,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-10-04 18:10:06,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-10-04 18:10:09,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:10:09,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:10:09,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:10:10,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:10:12,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:10:13,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-10-04 18:10:15,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:10:15,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:10:15,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:10:15,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:10:17,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-10-04 18:10:17,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1746360.0, ans=0.1 2023-10-04 18:10:18,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-10-04 18:10:21,456 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.65 vs. limit=6.0 2023-10-04 18:10:23,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:10:24,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:10:26,503 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.88 vs. limit=15.0 2023-10-04 18:10:28,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.48 vs. limit=10.0 2023-10-04 18:10:32,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-10-04 18:10:32,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:10:33,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1746426.6666666667, ans=0.125 2023-10-04 18:10:35,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-10-04 18:10:37,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:10:39,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:10:39,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:10:40,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=1746426.6666666667, ans=0.0 2023-10-04 18:10:41,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:10:41,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:10:42,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:10:44,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:10:44,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1746493.3333333333, ans=0.125 2023-10-04 18:10:45,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:10:46,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:10:47,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:10:49,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:10:51,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:10:51,563 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.14 vs. limit=15.0 2023-10-04 18:10:54,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:10:55,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-10-04 18:10:58,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:10:58,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-10-04 18:11:00,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-10-04 18:11:00,176 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-10-04 18:11:00,361 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:11:01,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:11:01,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:11:03,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:11:03,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:11:03,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-10-04 18:11:06,004 WARNING [train.py:1204] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:11:07,363 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:11:08,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:11:10,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-10-04 18:11:15,456 INFO [train.py:1046] (3/4) Epoch 50, batch 1700, loss[loss=0.1399, simple_loss=0.2254, pruned_loss=0.02722, over 24585.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2345, pruned_loss=0.03619, over 4724003.08 frames. ], batch size: 60, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:11:15,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:11:15,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:11:15,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-10-04 18:11:16,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:11:16,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:11:16,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:11:18,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:11:18,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:11:18,928 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-10-04 18:11:22,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:11:31,431 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.062e+02 2.401e+02 2.672e+02 3.684e+02, threshold=4.801e+02, percent-clipped=0.0 2023-10-04 18:11:32,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:11:34,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:11:38,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:11:40,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:11:40,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:11:40,103 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:11:42,735 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-10-04 18:11:44,172 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:11:44,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:11:45,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:11:47,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:11:47,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1746760.0, ans=0.1 2023-10-04 18:11:50,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-10-04 18:11:51,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1746760.0, ans=0.0 2023-10-04 18:11:52,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-10-04 18:11:52,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=1746760.0, ans=0.0 2023-10-04 18:11:53,879 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:11:55,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-10-04 18:11:56,064 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:11:57,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:11:58,931 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:12:00,913 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.32 vs. limit=15.0 2023-10-04 18:12:04,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:12:05,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:05,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:12:07,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:12:08,269 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-10-04 18:12:08,294 WARNING [train.py:1204] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:12:11,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:12:11,098 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-10-04 18:12:11,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:12:11,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:12:12,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:12:12,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:12:13,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.91 vs. limit=22.5 2023-10-04 18:12:13,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:12:13,932 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:12:15,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:15,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:12:16,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:12:19,980 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:12:21,332 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-10-04 18:12:24,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:12:24,744 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:12:29,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-10-04 18:12:30,760 INFO [train.py:1046] (3/4) Epoch 50, batch 1750, loss[loss=0.1455, simple_loss=0.2202, pruned_loss=0.03543, over 23690.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2328, pruned_loss=0.03563, over 4724468.96 frames. ], batch size: 232, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:12:33,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:35,138 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:12:35,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:12:36,489 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-10-04 18:12:36,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:12:39,228 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:12:39,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:12:40,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1746960.0, ans=0.125 2023-10-04 18:12:43,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-10-04 18:12:44,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:12:47,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-10-04 18:12:47,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:12:48,142 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.41 vs. limit=15.0 2023-10-04 18:12:49,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:12:50,685 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.14 vs. limit=15.0 2023-10-04 18:12:51,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 18:12:52,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-10-04 18:12:56,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:12:56,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1747026.6666666667, ans=0.2 2023-10-04 18:12:57,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-10-04 18:12:59,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=1747093.3333333333, ans=0.07 2023-10-04 18:13:02,239 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:13:05,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:13:07,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:13:07,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:13:10,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:10,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:13:12,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:13:14,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:15,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:13:17,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:13:19,135 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-10-04 18:13:19,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:13:21,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-10-04 18:13:23,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:13:25,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:13:26,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:13:30,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:13:31,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-10-04 18:13:31,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:33,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:13:37,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:13:39,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1747226.6666666667, ans=0.09899494936611666 2023-10-04 18:13:40,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:13:41,600 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:13:41,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=1747226.6666666667, ans=0.0 2023-10-04 18:13:42,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-10-04 18:13:42,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:13:44,239 INFO [train.py:1046] (3/4) Epoch 50, batch 1800, loss[loss=0.1616, simple_loss=0.2453, pruned_loss=0.039, over 23375.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2332, pruned_loss=0.03549, over 4728429.98 frames. ], batch size: 93, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:13:44,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:13:44,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:13:44,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:13:44,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:13:44,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=1747293.3333333333, ans=0.125 2023-10-04 18:13:45,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:13:48,924 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:13:49,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:13:50,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:13:54,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:13:56,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:13:57,881 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:13:59,697 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.181e+02 2.519e+02 2.906e+02 5.254e+02, threshold=5.039e+02, percent-clipped=1.0 2023-10-04 18:14:01,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:04,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:14:04,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:14:05,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:14:07,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:14:07,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1747360.0, ans=0.1 2023-10-04 18:14:08,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-10-04 18:14:08,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:12,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:17,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-10-04 18:14:18,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-10-04 18:14:18,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-10-04 18:14:18,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:19,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:14:19,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:14:21,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:14:24,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.10 vs. limit=22.5 2023-10-04 18:14:24,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=1747426.6666666667, ans=0.95 2023-10-04 18:14:29,808 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-10-04 18:14:31,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:14:32,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:33,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-10-04 18:14:33,272 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-10-04 18:14:34,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:14:35,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:14:35,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:14:40,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-10-04 18:14:44,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:14:45,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-10-04 18:14:45,614 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:14:45,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:46,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:14:47,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-10-04 18:14:48,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:14:48,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:14:51,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1747560.0, ans=0.0 2023-10-04 18:14:53,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-10-04 18:14:53,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:14:54,951 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:14:54,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:14:54,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:55,660 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.37 vs. limit=15.0 2023-10-04 18:14:58,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:14:58,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:14:59,466 INFO [train.py:1046] (3/4) Epoch 50, batch 1850, loss[loss=0.144, simple_loss=0.2302, pruned_loss=0.02894, over 24484.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2336, pruned_loss=0.03517, over 4736623.59 frames. ], batch size: 66, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:14:59,575 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:14:59,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:15:02,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:15:02,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:15:09,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:15:09,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-10-04 18:15:12,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-10-04 18:15:15,505 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-10-04 18:15:18,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:15:18,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1747693.3333333333, ans=0.2 2023-10-04 18:15:19,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-10-04 18:15:19,972 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 18:15:29,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:15:29,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-10-04 18:15:32,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:15:32,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:15:35,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-10-04 18:15:35,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:15:35,706 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:15:38,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:15:38,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1747760.0, ans=0.0 2023-10-04 18:15:39,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:15:42,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:15:46,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:15:46,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:15:47,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 18:15:47,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:15:49,764 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:15:50,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.27 vs. limit=10.0 2023-10-04 18:15:51,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:15:54,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-10-04 18:15:54,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:15:59,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:15:59,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:15:59,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-10-04 18:15:59,901 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-10-04 18:16:01,765 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-10-04 18:16:03,165 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-10-04 18:16:05,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:16:05,274 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:16:05,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:16:06,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:06,442 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-10-04 18:16:07,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:16:07,801 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:09,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:16:09,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:16:09,401 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:16:09,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-10-04 18:16:12,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:12,162 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-10-04 18:16:13,483 INFO [train.py:1046] (3/4) Epoch 50, batch 1900, loss[loss=0.1381, simple_loss=0.2263, pruned_loss=0.02493, over 24325.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2339, pruned_loss=0.03545, over 4725289.48 frames. ], batch size: 61, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:16:13,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:16:13,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:16:15,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1747960.0, ans=0.1 2023-10-04 18:16:17,646 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:16:20,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:16:21,840 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-10-04 18:16:21,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-10-04 18:16:22,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1747960.0, ans=0.1 2023-10-04 18:16:23,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:16:23,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:16:23,883 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-10-04 18:16:25,179 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-10-04 18:16:27,857 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.104e+02 2.393e+02 2.783e+02 5.984e+02, threshold=4.786e+02, percent-clipped=4.0 2023-10-04 18:16:28,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1748026.6666666667, ans=0.125 2023-10-04 18:16:29,429 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-10-04 18:16:31,289 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.09 vs. limit=15.0 2023-10-04 18:16:32,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:16:35,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-10-04 18:16:37,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-10-04 18:16:47,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-10-04 18:16:49,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-10-04 18:16:49,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:16:50,016 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-10-04 18:16:50,020 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-10-04 18:16:50,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-10-04 18:16:51,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-10-04 18:16:51,358 WARNING [train.py:1204] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:16:56,100 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-10-04 18:16:59,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:17:00,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:17:00,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-10-04 18:17:02,935 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.19 vs. limit=12.0 2023-10-04 18:17:03,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:17:08,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-10-04 18:17:10,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:17:14,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:17:14,651 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:17:14,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1748226.6666666667, ans=0.125 2023-10-04 18:17:16,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:17:16,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:17:18,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:17:18,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:17:20,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:17:23,135 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:17:23,137 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:17:24,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1748226.6666666667, ans=0.125 2023-10-04 18:17:26,376 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:17:26,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:17:27,667 INFO [train.py:1046] (3/4) Epoch 50, batch 1950, loss[loss=0.1571, simple_loss=0.251, pruned_loss=0.0316, over 24619.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2341, pruned_loss=0.03556, over 4724279.12 frames. ], batch size: 68, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:17:27,734 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:17:29,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:17:32,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:17:35,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:17:35,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:35,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:17:38,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-10-04 18:17:39,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 18:17:39,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:40,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:17:43,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:17:43,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:17:43,151 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:17:44,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:17:47,185 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:17:47,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:17:47,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:17:48,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:17:51,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:17:51,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1748360.0, ans=0.0 2023-10-04 18:17:53,982 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:17:53,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:17:53,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:17:53,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-10-04 18:17:55,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:17:55,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:17:55,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:18:00,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:18:01,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:18:05,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1748426.6666666667, ans=0.0 2023-10-04 18:18:07,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:18:09,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:18:11,036 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:18:11,077 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-10-04 18:18:11,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1748493.3333333333, ans=0.125 2023-10-04 18:18:12,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:18:15,906 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:18:17,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:18:17,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:18:17,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=1748493.3333333333, ans=0.125 2023-10-04 18:18:24,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:26,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:27,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:30,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:18:33,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:18:33,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:18:34,338 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.39 vs. limit=15.0 2023-10-04 18:18:35,066 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-10-04 18:18:35,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:18:35,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:18:36,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-10-04 18:18:37,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:18:41,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:18:43,193 INFO [train.py:1046] (3/4) Epoch 50, batch 2000, loss[loss=0.1649, simple_loss=0.2534, pruned_loss=0.03815, over 24578.00 frames. ], tot_loss[loss=0.1533, simple_loss=0.2353, pruned_loss=0.03567, over 4730939.45 frames. ], batch size: 71, lr: 2.03e-03, grad_scale: 32.0 2023-10-04 18:18:43,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:18:44,494 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:18:45,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:18:47,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:18:47,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1748626.6666666667, ans=0.125 2023-10-04 18:18:50,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-10-04 18:18:51,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:18:54,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:18:55,665 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-10-04 18:18:57,500 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.194e+02 2.589e+02 3.041e+02 4.664e+02, threshold=5.178e+02, percent-clipped=0.0 2023-10-04 18:18:58,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:18:58,912 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:19:01,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:19:03,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-10-04 18:19:06,238 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:06,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:07,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:07,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-10-04 18:19:07,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:19:10,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-10-04 18:19:10,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:19:13,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:19:13,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:19:13,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:13,757 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.29 vs. limit=15.0 2023-10-04 18:19:14,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:19:15,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:19:16,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-10-04 18:19:19,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-10-04 18:19:19,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:19:20,943 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:26,567 WARNING [train.py:1204] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:19:26,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:19:26,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:19:28,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:19:31,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:19:31,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:19:31,474 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:19:31,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:19:34,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:36,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:19:36,963 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-10-04 18:19:41,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:19:43,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:47,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:47,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:19:50,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:52,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:19:52,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:52,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:19:52,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:19:55,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:19:57,075 INFO [train.py:1046] (3/4) Epoch 50, batch 2050, loss[loss=0.1554, simple_loss=0.2306, pruned_loss=0.04007, over 23764.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2341, pruned_loss=0.03514, over 4746419.75 frames. ], batch size: 179, lr: 2.03e-03, grad_scale: 32.0 2023-10-04 18:19:57,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:19:58,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1748960.0, ans=0.125 2023-10-04 18:20:01,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:20:01,263 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:20:05,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:20:06,972 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:20:08,834 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:20:08,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:20:09,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1748960.0, ans=0.0 2023-10-04 18:20:11,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-10-04 18:20:11,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:20:11,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:20:11,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:20:12,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.74 vs. limit=15.0 2023-10-04 18:20:22,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:20:22,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:20:24,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-10-04 18:20:25,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:20:26,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-10-04 18:20:27,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1749093.3333333333, ans=0.0 2023-10-04 18:20:28,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:20:31,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:20:32,592 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:20:32,660 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:20:34,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:20:34,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:20:37,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:20:37,464 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:20:39,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:20:41,700 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:20:44,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:20:44,447 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:20:45,350 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.92 vs. limit=15.0 2023-10-04 18:20:48,836 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:20:52,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1749160.0, ans=0.025 2023-10-04 18:20:52,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1749160.0, ans=0.125 2023-10-04 18:20:53,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:20:55,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-10-04 18:21:00,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:21:02,347 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:21:03,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:21:03,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1749226.6666666667, ans=0.0 2023-10-04 18:21:07,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-10-04 18:21:09,997 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-10-04 18:21:09,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:10,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:21:11,269 INFO [train.py:1046] (3/4) Epoch 50, batch 2100, loss[loss=0.1542, simple_loss=0.2281, pruned_loss=0.04012, over 23808.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2327, pruned_loss=0.03498, over 4739101.89 frames. ], batch size: 195, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:21:11,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:21:11,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:21:12,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-10-04 18:21:12,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-10-04 18:21:14,174 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:21:16,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:21:18,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:21:22,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:22,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:21:23,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-10-04 18:21:25,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:21:26,317 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-10-04 18:21:26,321 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-10-04 18:21:27,633 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.746e+02 2.172e+02 2.598e+02 3.189e+02 5.506e+02, threshold=5.196e+02, percent-clipped=2.0 2023-10-04 18:21:27,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:21:27,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:21:27,818 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-10-04 18:21:27,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 18:21:35,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-10-04 18:21:35,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:21:35,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1749360.0, ans=0.125 2023-10-04 18:21:36,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:21:36,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:21:41,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:21:41,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-10-04 18:21:42,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:21:42,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-10-04 18:21:44,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-10-04 18:21:44,097 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:44,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-10-04 18:21:45,400 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-10-04 18:21:45,445 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-10-04 18:21:48,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:21:49,489 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:21:53,532 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:21:55,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:21:56,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:21:59,500 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:21:59,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-10-04 18:21:59,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:21:59,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:21:59,604 WARNING [train.py:1204] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:00,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-10-04 18:22:01,041 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-10-04 18:22:02,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-10-04 18:22:06,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:22:06,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1749493.3333333333, ans=0.125 2023-10-04 18:22:07,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1749493.3333333333, ans=0.035 2023-10-04 18:22:09,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:22:10,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-10-04 18:22:13,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:22:16,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:22:16,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:22:16,707 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:22:16,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-10-04 18:22:17,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:22:19,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.whiten.whitening_limit, batch_count=1749560.0, ans=12.0 2023-10-04 18:22:20,702 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:22:20,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:22:22,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:22:22,059 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:24,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-10-04 18:22:25,412 INFO [train.py:1046] (3/4) Epoch 50, batch 2150, loss[loss=0.1353, simple_loss=0.2146, pruned_loss=0.02799, over 24565.00 frames. ], tot_loss[loss=0.1507, simple_loss=0.2318, pruned_loss=0.03477, over 4741721.07 frames. ], batch size: 60, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:22:25,520 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-10-04 18:22:25,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:22:28,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:22:28,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:22:28,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:22:29,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:22:33,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-10-04 18:22:35,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:22:35,974 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:22:37,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:38,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:22:38,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:39,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:22:41,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:22:43,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:22:43,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:22:47,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:48,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-10-04 18:22:51,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:22:51,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:22:53,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:53,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:22:55,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:22:55,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:22:55,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:22:55,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:22:56,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1749760.0, ans=0.2 2023-10-04 18:22:57,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:22:58,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-10-04 18:23:01,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:23:01,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:23:01,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:23:03,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:23:04,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:23:07,187 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:23:07,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:23:08,627 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:23:08,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-10-04 18:23:09,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:23:11,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:23:13,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:14,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:23:16,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:23:17,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:18,629 WARNING [train.py:1204] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:18,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-10-04 18:23:20,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-10-04 18:23:20,862 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.53 vs. limit=12.0 2023-10-04 18:23:21,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:23:21,410 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-10-04 18:23:21,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:21,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:23:22,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-10-04 18:23:22,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:23:22,861 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-10-04 18:23:22,881 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-10-04 18:23:22,881 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-10-04 18:23:24,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-10-04 18:23:26,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:26,215 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:23:26,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:23:27,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:27,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:23:29,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:29,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:38,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:23:39,926 INFO [train.py:1046] (3/4) Epoch 50, batch 2200, loss[loss=0.155, simple_loss=0.2471, pruned_loss=0.03149, over 24433.00 frames. ], tot_loss[loss=0.1506, simple_loss=0.232, pruned_loss=0.03464, over 4744093.87 frames. ], batch size: 69, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:23:40,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-10-04 18:23:41,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1749960.0, ans=0.0 2023-10-04 18:23:43,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:23:47,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:23:47,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:23:48,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:23:50,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:23:52,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:23:54,245 WARNING [train.py:1204] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:23:54,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-10-04 18:23:55,572 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.726e+02 2.058e+02 2.327e+02 2.714e+02 4.351e+02, threshold=4.654e+02, percent-clipped=0.0 2023-10-04 18:23:58,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-10-04 18:24:01,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:24:06,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-10-04 18:24:09,651 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:24:09,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:24:11,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:24:14,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:24:14,470 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-10-04 18:24:17,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:24:18,561 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:24:18,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-10-04 18:24:22,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:24:23,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:24:25,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:24:26,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:24:29,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-10-04 18:24:31,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:24:32,645 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-10-04 18:24:34,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:24:34,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:24:36,319 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:24:39,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:24:39,169 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:24:39,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:24:39,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:24:40,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:24:40,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:24:43,289 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:24:46,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 18:24:47,900 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:24:49,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:24:50,791 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-10-04 18:24:53,515 INFO [train.py:1046] (3/4) Epoch 50, batch 2250, loss[loss=0.155, simple_loss=0.2379, pruned_loss=0.03606, over 23222.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2322, pruned_loss=0.03497, over 4744432.74 frames. ], batch size: 119, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:24:53,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:24:53,646 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-10-04 18:24:55,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:24:56,281 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-10-04 18:24:56,400 WARNING [train.py:1204] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:24:57,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:25:00,404 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:25:00,524 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-10-04 18:25:03,153 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:25:07,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:25:11,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:25:11,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:25:12,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=1750360.0, ans=0.2 2023-10-04 18:25:14,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:25:14,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:25:14,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:25:17,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-10-04 18:25:17,944 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:25:17,983 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:25:20,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-10-04 18:25:21,782 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:25:21,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:25:23,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:25:28,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:25:30,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:25:30,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:25:31,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-10-04 18:25:34,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:25:34,929 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:25:38,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:25:40,136 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.23 vs. limit=22.5 2023-10-04 18:25:40,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:25:41,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:25:42,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:25:45,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:25:45,196 WARNING [train.py:1204] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:25:49,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:25:52,817 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:25:58,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:25:58,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:25:59,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:26:05,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:26:06,943 INFO [train.py:1046] (3/4) Epoch 50, batch 2300, loss[loss=0.1573, simple_loss=0.2371, pruned_loss=0.03872, over 23371.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2328, pruned_loss=0.03531, over 4743738.92 frames. ], batch size: 119, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:26:07,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:26:07,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-10-04 18:26:07,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:07,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:26:09,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-10-04 18:26:11,889 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:26:13,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:17,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:19,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:26:22,142 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-10-04 18:26:23,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:26:24,815 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.106e+02 2.353e+02 2.933e+02 4.045e+02, threshold=4.706e+02, percent-clipped=0.0 2023-10-04 18:26:29,168 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:26:30,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:26:30,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:26:30,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:26:30,544 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-10-04 18:26:30,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:26:33,374 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:26:33,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:26:39,074 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:26:41,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:26:41,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=1750760.0, ans=0.0 2023-10-04 18:26:43,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1750760.0, ans=0.0 2023-10-04 18:26:44,671 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:26:47,600 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:26:49,340 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:26:52,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:26:53,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=1750826.6666666667, ans=0.0 2023-10-04 18:26:54,189 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.14 vs. limit=15.0 2023-10-04 18:26:54,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:26:57,791 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:26:57,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:26:59,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:26:59,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-10-04 18:27:00,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1750826.6666666667, ans=0.0 2023-10-04 18:27:02,050 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:27:02,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:27:03,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:27:03,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:27:03,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:27:05,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 18:27:05,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:27:05,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-10-04 18:27:05,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:27:05,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:27:06,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-10-04 18:27:11,558 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:27:14,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:27:17,671 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:27:17,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:27:17,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:27:20,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:27:20,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:27:21,684 INFO [train.py:1046] (3/4) Epoch 50, batch 2350, loss[loss=0.1594, simple_loss=0.2447, pruned_loss=0.03699, over 23208.00 frames. ], tot_loss[loss=0.1534, simple_loss=0.2348, pruned_loss=0.03606, over 4730354.86 frames. ], batch size: 93, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:27:21,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:27:21,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-10-04 18:27:28,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:27:28,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-10-04 18:27:32,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-10-04 18:27:36,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:27:36,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1751026.6666666667, ans=0.125 2023-10-04 18:27:37,730 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:27:37,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:27:37,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:27:39,635 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:27:41,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-10-04 18:27:41,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1751026.6666666667, ans=0.2 2023-10-04 18:27:43,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:27:49,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-10-04 18:27:50,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=1751093.3333333333, ans=0.0 2023-10-04 18:27:52,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:27:54,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1751093.3333333333, ans=0.125 2023-10-04 18:27:55,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:27:55,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:27:56,934 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:27:58,459 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-10-04 18:27:59,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:28:00,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:28:00,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:28:00,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.56 vs. limit=15.0 2023-10-04 18:28:01,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:28:02,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=12.53 vs. limit=15.0 2023-10-04 18:28:04,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:28:06,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-10-04 18:28:06,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:28:10,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:28:10,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:28:12,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-10-04 18:28:13,625 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:28:15,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-10-04 18:28:16,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:28:18,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-10-04 18:28:22,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-10-04 18:28:24,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:28:24,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-10-04 18:28:24,120 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-10-04 18:28:24,144 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-10-04 18:28:26,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-10-04 18:28:31,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:28:31,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1751226.6666666667, ans=0.0 2023-10-04 18:28:35,687 INFO [train.py:1046] (3/4) Epoch 50, batch 2400, loss[loss=0.1736, simple_loss=0.2598, pruned_loss=0.04366, over 24441.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2343, pruned_loss=0.0359, over 4716946.94 frames. ], batch size: 77, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:28:35,789 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:28:41,722 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:28:43,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:28:43,524 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-10-04 18:28:44,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-10-04 18:28:50,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:28:50,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:28:54,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-10-04 18:28:54,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:28:55,233 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.709e+02 2.154e+02 2.535e+02 3.094e+02 5.336e+02, threshold=5.070e+02, percent-clipped=5.0 2023-10-04 18:28:55,310 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:28:55,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-10-04 18:28:56,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1751360.0, ans=0.125 2023-10-04 18:28:58,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1751360.0, ans=0.1 2023-10-04 18:29:01,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:02,399 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-10-04 18:29:05,312 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:29:09,128 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-10-04 18:29:09,546 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.26 vs. limit=12.0 2023-10-04 18:29:13,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:29:14,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:16,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=1751426.6666666667, ans=0.125 2023-10-04 18:29:17,786 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:29:17,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-10-04 18:29:17,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:29:22,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1751493.3333333333, ans=0.0 2023-10-04 18:29:26,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:29:29,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:29:30,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:29:32,294 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:29:32,298 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-10-04 18:29:32,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:29:32,338 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:29:33,603 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:29:33,622 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:29:35,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1751560.0, ans=0.05 2023-10-04 18:29:37,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:29:38,699 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:29:39,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-10-04 18:29:40,088 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-10-04 18:29:41,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:29:41,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:29:42,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-10-04 18:29:42,875 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-10-04 18:29:42,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-10-04 18:29:42,893 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-10-04 18:29:44,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-10-04 18:29:44,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=1751560.0, ans=0.2 2023-10-04 18:29:45,951 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:29:46,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:46,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:29:47,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=1751560.0, ans=0.2 2023-10-04 18:29:48,689 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-10-04 18:29:50,521 INFO [train.py:1046] (3/4) Epoch 50, batch 2450, loss[loss=0.1493, simple_loss=0.2238, pruned_loss=0.03742, over 23324.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2326, pruned_loss=0.03592, over 4690416.37 frames. ], batch size: 119, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:29:50,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:29:50,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:29:53,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:29:53,522 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:29:56,412 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:29:56,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:29:57,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-10-04 18:29:57,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1751626.6666666667, ans=0.1 2023-10-04 18:30:00,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1751626.6666666667, ans=0.125 2023-10-04 18:30:04,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:30:04,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:30:08,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:30:09,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:30:09,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:30:09,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-10-04 18:30:13,917 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:30:16,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:30:17,968 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:30:23,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:30:24,439 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:30:24,546 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:30:24,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:30:26,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-10-04 18:30:26,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=1751760.0, ans=0.0 2023-10-04 18:30:27,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:30:34,227 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:30:35,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:30:35,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:30:35,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:30:35,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:30:37,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:30:37,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-10-04 18:30:42,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:30:42,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:30:44,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:30:44,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:30:46,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1751826.6666666667, ans=0.0 2023-10-04 18:30:49,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:30:49,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-10-04 18:30:50,992 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:30:52,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:30:52,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-10-04 18:30:52,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:30:54,115 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:30:54,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=1751893.3333333333, ans=0.0 2023-10-04 18:30:58,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:30:59,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:31:01,134 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:31:01,715 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.87 vs. limit=15.0 2023-10-04 18:31:03,730 INFO [train.py:1046] (3/4) Epoch 50, batch 2500, loss[loss=0.148, simple_loss=0.2291, pruned_loss=0.03346, over 24464.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2325, pruned_loss=0.0358, over 4688669.37 frames. ], batch size: 63, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:31:03,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-10-04 18:31:05,145 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:31:13,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:31:20,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:31:20,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:31:21,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:31:21,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-10-04 18:31:23,686 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.188e+02 2.534e+02 3.108e+02 6.481e+02, threshold=5.068e+02, percent-clipped=2.0 2023-10-04 18:31:26,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.78 vs. limit=15.0 2023-10-04 18:31:28,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:31:29,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:31:29,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-10-04 18:31:29,561 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:31:30,898 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-10-04 18:31:32,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:31:33,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:31:33,618 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-10-04 18:31:33,642 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:31:35,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-10-04 18:31:35,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:31:41,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:31:41,096 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:31:44,339 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:31:44,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1752093.3333333333, ans=0.1 2023-10-04 18:31:45,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-10-04 18:31:45,652 WARNING [train.py:1204] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:31:47,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:31:50,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1752160.0, ans=0.2 2023-10-04 18:31:51,112 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:31:53,164 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:31:53,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=1752160.0, ans=0.0 2023-10-04 18:31:56,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=1752160.0, ans=0.125 2023-10-04 18:31:56,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1752160.0, ans=0.125 2023-10-04 18:31:57,167 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.74 vs. limit=6.0 2023-10-04 18:31:57,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:32:01,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:32:04,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-10-04 18:32:04,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:32:04,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:32:06,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:32:06,047 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:32:08,674 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-10-04 18:32:08,675 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-10-04 18:32:08,694 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-10-04 18:32:11,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:32:12,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1752226.6666666667, ans=0.125 2023-10-04 18:32:13,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-10-04 18:32:13,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-10-04 18:32:15,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:32:15,454 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-10-04 18:32:17,975 INFO [train.py:1046] (3/4) Epoch 50, batch 2550, loss[loss=0.1648, simple_loss=0.2375, pruned_loss=0.04607, over 23806.00 frames. ], tot_loss[loss=0.1518, simple_loss=0.2328, pruned_loss=0.03542, over 4697654.54 frames. ], batch size: 164, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:32:19,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-10-04 18:32:19,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1752293.3333333333, ans=0.0 2023-10-04 18:32:22,634 WARNING [train.py:1204] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:32:23,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:32:25,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:32:27,251 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:32:27,325 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-10-04 18:32:27,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:32:30,236 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-10-04 18:32:32,890 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:32:34,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:32:36,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:32:36,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 18:32:37,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:32:37,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:32:37,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:32:39,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:32:39,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-10-04 18:32:41,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-10-04 18:32:41,578 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:32:41,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-10-04 18:32:41,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1752360.0, ans=0.125 2023-10-04 18:32:53,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:32:59,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:32:59,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:32:59,403 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:33:00,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:33:01,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1752493.3333333333, ans=0.125 2023-10-04 18:33:09,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:33:10,088 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.58 vs. limit=15.0 2023-10-04 18:33:12,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:33:12,311 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:33:12,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:33:13,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:33:13,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:33:16,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:33:16,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:33:21,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:33:21,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-10-04 18:33:21,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:33:21,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:33:22,423 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:33:23,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:33:23,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:33:26,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=4.94 vs. limit=15.0 2023-10-04 18:33:30,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:33:31,752 INFO [train.py:1046] (3/4) Epoch 50, batch 2600, loss[loss=0.1688, simple_loss=0.2518, pruned_loss=0.04294, over 23238.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.2332, pruned_loss=0.03551, over 4700264.63 frames. ], batch size: 105, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:33:33,156 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:33:34,577 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-10-04 18:33:37,356 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-10-04 18:33:37,377 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:33:37,417 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-10-04 18:33:38,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-10-04 18:33:38,770 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-10-04 18:33:43,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:33:43,250 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-10-04 18:33:44,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1752626.6666666667, ans=0.125 2023-10-04 18:33:45,195 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-10-04 18:33:45,288 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-10-04 18:33:46,871 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:33:48,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-10-04 18:33:51,031 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 2.019e+02 2.212e+02 2.492e+02 4.115e+02, threshold=4.424e+02, percent-clipped=0.0 2023-10-04 18:33:51,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-10-04 18:33:52,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:33:52,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-10-04 18:33:55,228 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-10-04 18:33:55,243 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-10-04 18:34:04,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:04,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:34:04,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:34:04,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-10-04 18:34:04,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:34:11,174 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-10-04 18:34:15,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:34:15,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:16,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-10-04 18:34:16,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:34:16,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:34:18,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-10-04 18:34:20,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:34:20,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:34:22,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=1752826.6666666667, ans=0.2 2023-10-04 18:34:23,595 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:34:26,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=1752826.6666666667, ans=0.125 2023-10-04 18:34:27,665 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-10-04 18:34:27,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:34:27,700 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:34:33,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:34:33,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:34:33,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-10-04 18:34:35,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:34:36,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:34:38,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:34:43,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-10-04 18:34:45,695 INFO [train.py:1046] (3/4) Epoch 50, batch 2650, loss[loss=0.1463, simple_loss=0.2267, pruned_loss=0.03298, over 24504.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2337, pruned_loss=0.03602, over 4691330.48 frames. ], batch size: 63, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:34:45,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:48,572 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:34:52,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-10-04 18:34:52,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:54,012 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:34:54,083 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-10-04 18:34:55,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:34:56,705 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:34:59,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:35:00,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:35:02,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:35:04,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-10-04 18:35:04,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:35:04,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:35:07,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-10-04 18:35:08,489 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-10-04 18:35:09,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:35:12,698 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-10-04 18:35:12,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:12,762 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-10-04 18:35:17,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:35:18,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:35:18,027 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:35:18,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:22,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-10-04 18:35:22,214 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-10-04 18:35:23,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:35:28,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=1753160.0, ans=0.5 2023-10-04 18:35:29,107 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-10-04 18:35:29,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:35:31,031 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:31,073 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:35:31,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:35:31,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:35:32,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:35:34,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:35:35,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:35:35,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:35:37,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:35:39,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:39,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:35:39,932 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:43,112 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:35:43,146 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:35:47,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:47,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:35:48,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:35:48,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-10-04 18:35:52,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:35:53,526 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:53,616 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:35:54,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:35:56,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:35:56,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:35:57,083 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.10 vs. limit=15.0 2023-10-04 18:35:58,864 INFO [train.py:1046] (3/4) Epoch 50, batch 2700, loss[loss=0.1419, simple_loss=0.23, pruned_loss=0.02688, over 24647.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2341, pruned_loss=0.03584, over 4703764.33 frames. ], batch size: 65, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:35:58,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:35:58,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-10-04 18:35:59,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=1753293.3333333333, ans=0.2 2023-10-04 18:36:02,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:36:03,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 18:36:05,768 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:36:06,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:06,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:08,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:36:08,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:36:08,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:36:09,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-10-04 18:36:09,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-10-04 18:36:11,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:36:14,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:36:14,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:36:14,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:36:18,704 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.205e+02 2.516e+02 3.164e+02 5.488e+02, threshold=5.032e+02, percent-clipped=3.0 2023-10-04 18:36:18,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:36:20,285 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-10-04 18:36:20,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:36:26,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:36:26,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:36:30,716 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:36:30,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:36:30,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:36:30,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:36:33,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:36:36,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:36:36,901 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:36:36,920 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:36:41,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:41,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:36:48,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:36:50,207 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:36:53,594 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:36:53,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:36:57,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:36:59,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:37:00,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:37:00,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:01,861 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:37:03,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:37:05,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:37:07,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:37:07,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:37:11,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-10-04 18:37:12,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:37:13,959 INFO [train.py:1046] (3/4) Epoch 50, batch 2750, loss[loss=0.1313, simple_loss=0.2132, pruned_loss=0.0247, over 24595.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.234, pruned_loss=0.03556, over 4701826.78 frames. ], batch size: 60, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:37:14,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:37:14,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-10-04 18:37:15,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-10-04 18:37:16,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:37:20,056 WARNING [train.py:1204] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:20,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:37:22,771 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:22,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:37:23,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:26,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:37:26,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:37:26,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=1753626.6666666667, ans=0.0 2023-10-04 18:37:27,411 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:37:27,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:27,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-10-04 18:37:27,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:37:27,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:37:31,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-10-04 18:37:33,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:37:34,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:34,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:37:36,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-10-04 18:37:36,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:37:38,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:37:38,342 WARNING [train.py:1204] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:38,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1753693.3333333333, ans=0.125 2023-10-04 18:37:39,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:42,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 18:37:42,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:37:44,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:37:44,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:44,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1753760.0, ans=0.1 2023-10-04 18:37:45,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:37:51,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:37:51,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1753760.0, ans=0.1 2023-10-04 18:37:52,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:37:54,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:37:54,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1753760.0, ans=0.125 2023-10-04 18:37:58,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:37:58,300 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:37:58,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:37:58,917 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.14 vs. limit=15.0 2023-10-04 18:38:02,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:38:04,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:38:04,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-10-04 18:38:08,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:38:10,209 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-10-04 18:38:16,229 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-10-04 18:38:18,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:38:18,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-10-04 18:38:18,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:38:21,785 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:38:23,145 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-10-04 18:38:23,182 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:38:26,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-10-04 18:38:26,459 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:38:26,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:38:27,754 INFO [train.py:1046] (3/4) Epoch 50, batch 2800, loss[loss=0.1531, simple_loss=0.2421, pruned_loss=0.03206, over 24684.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2332, pruned_loss=0.03543, over 4707364.92 frames. ], batch size: 73, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:38:27,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-10-04 18:38:29,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:38:29,219 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:38:30,594 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:38:30,630 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-10-04 18:38:30,631 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-10-04 18:38:33,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:38:33,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1753960.0, ans=0.2 2023-10-04 18:38:35,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:38:35,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:38:38,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1753960.0, ans=0.0 2023-10-04 18:38:39,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:38:42,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-10-04 18:38:44,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-10-04 18:38:45,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-10-04 18:38:47,177 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.767e+02 2.165e+02 2.479e+02 3.124e+02 4.666e+02, threshold=4.957e+02, percent-clipped=0.0 2023-10-04 18:38:47,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:38:47,320 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:38:47,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:38:50,364 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:38:51,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:38:51,670 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:38:52,459 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.54 vs. limit=10.0 2023-10-04 18:38:53,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:39:00,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:39:02,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:39:03,531 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:04,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:39:06,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:39:12,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:39:12,100 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-10-04 18:39:12,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:39:12,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:39:12,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:39:16,893 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:39:18,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:20,944 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:39:22,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1754160.0, ans=0.0 2023-10-04 18:39:22,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1754160.0, ans=0.125 2023-10-04 18:39:23,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:39:23,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:23,613 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:39:23,661 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 18:39:25,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:39:26,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:39:26,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-10-04 18:39:26,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:39:28,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:39:28,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:39:29,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-10-04 18:39:31,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:39:31,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:39:31,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:39:32,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-10-04 18:39:40,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:39:40,027 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:39:40,080 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:39:41,404 INFO [train.py:1046] (3/4) Epoch 50, batch 2850, loss[loss=0.1634, simple_loss=0.2487, pruned_loss=0.03902, over 24345.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2328, pruned_loss=0.03507, over 4710483.80 frames. ], batch size: 74, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:39:42,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:39:47,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:39:47,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:39:47,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:39:49,054 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:39:49,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:39:50,566 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:39:51,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-10-04 18:39:58,125 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-10-04 18:39:58,132 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:39:59,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-10-04 18:40:00,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:03,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-10-04 18:40:03,613 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-10-04 18:40:06,887 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:07,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1754360.0, ans=0.0 2023-10-04 18:40:11,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1754426.6666666667, ans=0.125 2023-10-04 18:40:12,331 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.22 vs. limit=15.0 2023-10-04 18:40:19,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:40:19,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:40:19,377 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:40:21,993 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 18:40:22,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:40:22,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:40:22,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:40:23,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-10-04 18:40:26,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:40:26,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:40:27,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:40:27,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:30,619 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:40:30,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:40:32,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:40:33,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:40:34,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1754493.3333333333, ans=0.0 2023-10-04 18:40:35,428 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:40:35,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:40:36,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:40:38,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:40:43,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:40:45,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-10-04 18:40:45,075 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-10-04 18:40:47,785 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:40:47,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:40:47,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-10-04 18:40:49,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:40:49,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:40:50,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:40:50,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:40:50,475 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-10-04 18:40:50,535 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-10-04 18:40:50,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:40:51,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:40:54,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:40:54,840 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:40:56,077 INFO [train.py:1046] (3/4) Epoch 50, batch 2900, loss[loss=0.1573, simple_loss=0.2295, pruned_loss=0.04255, over 23874.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2325, pruned_loss=0.0351, over 4725906.75 frames. ], batch size: 195, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:40:56,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:40:56,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1754626.6666666667, ans=0.125 2023-10-04 18:40:56,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=1754626.6666666667, ans=0.0 2023-10-04 18:40:57,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-10-04 18:41:00,308 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:41:00,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-10-04 18:41:02,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-10-04 18:41:03,247 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.41 vs. limit=15.0 2023-10-04 18:41:03,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:41:03,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:41:05,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:41:07,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:41:11,662 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:41:11,697 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:41:13,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:41:14,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-10-04 18:41:14,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:41:14,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1754693.3333333333, ans=0.1 2023-10-04 18:41:16,327 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.129e+02 2.392e+02 2.896e+02 5.102e+02, threshold=4.784e+02, percent-clipped=2.0 2023-10-04 18:41:16,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:41:19,195 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-10-04 18:41:20,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-10-04 18:41:23,277 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:41:23,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-10-04 18:41:23,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:41:26,069 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:41:26,073 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-10-04 18:41:28,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:41:30,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:41:30,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=1754760.0, ans=0.2 2023-10-04 18:41:33,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:41:34,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:41:36,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-10-04 18:41:36,168 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-10-04 18:41:36,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:41:36,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=1754760.0, ans=0.0 2023-10-04 18:41:41,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:41:42,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1754826.6666666667, ans=0.125 2023-10-04 18:41:44,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-10-04 18:41:45,560 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:41:51,473 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:41:59,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:41:59,774 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:42:01,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-10-04 18:42:01,742 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.58 vs. limit=10.0 2023-10-04 18:42:03,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:03,969 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-10-04 18:42:04,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.48 vs. limit=15.0 2023-10-04 18:42:05,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:42:05,363 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:42:10,543 INFO [train.py:1046] (3/4) Epoch 50, batch 2950, loss[loss=0.1521, simple_loss=0.2408, pruned_loss=0.03165, over 24477.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.2336, pruned_loss=0.0356, over 4732887.34 frames. ], batch size: 69, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:42:11,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:42:13,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-10-04 18:42:15,211 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:42:15,216 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:16,655 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:42:18,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:42:19,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-10-04 18:42:19,880 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-10-04 18:42:21,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:42:21,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:42:25,616 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:42:25,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1755026.6666666667, ans=0.2 2023-10-04 18:42:26,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:42:28,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:42:29,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:42:33,806 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:42:33,824 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:42:35,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:37,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:42:37,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:42:39,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-10-04 18:42:44,439 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-10-04 18:42:44,464 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-10-04 18:42:45,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:42:47,412 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-10-04 18:42:49,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-10-04 18:42:50,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:42:50,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:42:50,596 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-10-04 18:42:50,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-10-04 18:42:53,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-10-04 18:42:55,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:42:55,302 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:42:56,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:42:58,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:42:58,162 WARNING [train.py:1204] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:42:58,190 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-10-04 18:42:58,225 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:42:59,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-10-04 18:43:03,840 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:43:05,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:43:07,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-10-04 18:43:07,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:43:09,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-10-04 18:43:12,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:43:14,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:43:14,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:43:16,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:43:16,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:43:17,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:43:18,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:43:18,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:43:19,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:43:19,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:43:21,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:43:24,330 INFO [train.py:1046] (3/4) Epoch 50, batch 3000, loss[loss=0.1618, simple_loss=0.2463, pruned_loss=0.03864, over 23957.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.234, pruned_loss=0.03568, over 4723458.43 frames. ], batch size: 86, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:43:24,331 INFO [train.py:1069] (3/4) Computing validation loss 2023-10-04 18:43:36,829 INFO [train.py:1078] (3/4) Epoch 50, validation: loss=0.3701, simple_loss=0.2758, pruned_loss=0.2322, over 1125622.00 frames. 2023-10-04 18:43:36,830 INFO [train.py:1079] (3/4) Maximum memory allocated so far is 21221MB 2023-10-04 18:43:36,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:43:36,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-10-04 18:43:37,017 WARNING [train.py:1204] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:43:39,889 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:43:41,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:43:45,978 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-10-04 18:43:46,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-10-04 18:43:48,149 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.83 vs. limit=15.0 2023-10-04 18:43:48,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:43:48,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:43:50,149 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-10-04 18:43:50,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:43:51,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=1755360.0, ans=0.125 2023-10-04 18:43:57,839 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.709e+02 2.159e+02 2.457e+02 2.749e+02 3.497e+02, threshold=4.915e+02, percent-clipped=0.0 2023-10-04 18:43:57,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:44:06,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:44:10,786 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.40 vs. limit=15.0 2023-10-04 18:44:12,610 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-10-04 18:44:12,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:44:15,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:44:16,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:44:16,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:44:18,827 WARNING [train.py:1204] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:44:18,830 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-10-04 18:44:20,233 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-10-04 18:44:21,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:44:22,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:44:24,949 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:44:24,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:44:25,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:25,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:44:29,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:44:29,664 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:44:29,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:44:31,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:44:33,799 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-10-04 18:44:35,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:44:35,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:44:35,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:44:39,843 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:39,877 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:41,231 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-10-04 18:44:41,257 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-10-04 18:44:42,513 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:44:42,559 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-10-04 18:44:42,595 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:44:44,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1755560.0, ans=0.2 2023-10-04 18:44:45,414 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-10-04 18:44:45,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1755560.0, ans=0.07 2023-10-04 18:44:47,465 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:44:47,596 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:44:48,813 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 18:44:50,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-10-04 18:44:51,418 INFO [train.py:1046] (3/4) Epoch 50, batch 3050, loss[loss=0.144, simple_loss=0.2308, pruned_loss=0.02857, over 24671.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.234, pruned_loss=0.03559, over 4706998.05 frames. ], batch size: 65, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:44:51,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-10-04 18:44:51,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:44:51,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:44:51,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1755626.6666666667, ans=0.125 2023-10-04 18:44:52,821 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:44:52,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:44:54,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:44:54,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:44:59,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-10-04 18:45:00,653 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:45:03,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:45:04,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:45:06,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:09,049 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-10-04 18:45:12,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=1755693.3333333333, ans=0.0 2023-10-04 18:45:14,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-10-04 18:45:14,915 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-10-04 18:45:14,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:18,282 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:45:21,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:21,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:45:22,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:45:23,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:45:25,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:45:26,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:45:26,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:45:26,464 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:45:28,298 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:28,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1755760.0, ans=0.1 2023-10-04 18:45:31,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:34,289 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:45:34,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-10-04 18:45:34,370 WARNING [train.py:1204] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:45:34,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:45:37,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:45:38,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 18:45:38,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:45:39,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:45:43,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=1755826.6666666667, ans=0.125 2023-10-04 18:45:44,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:45:44,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:45:51,854 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:51,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:45:51,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:45:54,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:45:56,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:45:56,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:45:56,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-10-04 18:45:57,899 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:45:57,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:45:59,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-10-04 18:46:01,191 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:46:04,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1755960.0, ans=0.1 2023-10-04 18:46:05,209 INFO [train.py:1046] (3/4) Epoch 50, batch 3100, loss[loss=0.1294, simple_loss=0.1929, pruned_loss=0.03295, over 22627.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.234, pruned_loss=0.0355, over 4718293.82 frames. ], batch size: 322, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:46:05,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:46:06,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:46:09,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 18:46:12,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-10-04 18:46:13,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-10-04 18:46:15,325 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-10-04 18:46:15,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1755960.0, ans=0.0 2023-10-04 18:46:16,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:46:17,458 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.32 vs. limit=15.0 2023-10-04 18:46:20,016 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:46:20,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:21,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=1756026.6666666667, ans=0.125 2023-10-04 18:46:22,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:46:22,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1756026.6666666667, ans=0.125 2023-10-04 18:46:25,458 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.135e+02 2.495e+02 2.950e+02 6.010e+02, threshold=4.989e+02, percent-clipped=3.0 2023-10-04 18:46:25,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:32,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-10-04 18:46:35,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 18:46:35,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:36,460 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:46:36,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:46:37,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.14 vs. limit=15.0 2023-10-04 18:46:37,797 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-10-04 18:46:39,179 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:46:39,203 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-10-04 18:46:39,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:46:41,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:42,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-10-04 18:46:43,909 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:46:45,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1756093.3333333333, ans=0.125 2023-10-04 18:46:46,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:46:48,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-10-04 18:46:48,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-10-04 18:46:49,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:49,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:46:54,105 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:46:54,115 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:54,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:46:54,975 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.56 vs. limit=15.0 2023-10-04 18:46:55,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:46:55,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:46:56,309 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.62 vs. limit=6.0 2023-10-04 18:46:57,348 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:46:57,385 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:46:57,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:46:57,398 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 18:47:02,150 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:47:03,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-10-04 18:47:04,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:47:06,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-10-04 18:47:07,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:07,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:47:07,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-10-04 18:47:17,927 WARNING [train.py:1204] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-10-04 18:47:18,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1756293.3333333333, ans=0.125 2023-10-04 18:47:19,636 INFO [train.py:1046] (3/4) Epoch 50, batch 3150, loss[loss=0.151, simple_loss=0.242, pruned_loss=0.03002, over 24375.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2322, pruned_loss=0.03525, over 4695672.77 frames. ], batch size: 77, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:47:21,108 WARNING [train.py:1204] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:21,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:47:22,562 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:47:22,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:47:23,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-10-04 18:47:25,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:25,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-10-04 18:47:28,445 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-10-04 18:47:29,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:33,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1756360.0, ans=0.125 2023-10-04 18:47:34,473 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-10-04 18:47:34,657 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-10-04 18:47:36,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:47:36,186 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-10-04 18:47:37,512 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-10-04 18:47:40,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-10-04 18:47:40,248 WARNING [train.py:1204] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-10-04 18:47:40,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-10-04 18:47:40,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:40,278 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:47:41,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:47:43,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-10-04 18:47:45,045 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:46,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:47:46,190 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:47:47,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-10-04 18:47:47,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1756426.6666666667, ans=0.0 2023-10-04 18:47:51,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=1756426.6666666667, ans=0.05 2023-10-04 18:47:52,268 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-10-04 18:47:52,336 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:47:53,814 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-10-04 18:47:53,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:47:55,213 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-10-04 18:47:56,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-10-04 18:47:58,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:47:58,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 18:47:58,757 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 18:48:01,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:48:01,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:48:01,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-10-04 18:48:01,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-10-04 18:48:03,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-10-04 18:48:03,473 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 18:48:03,482 WARNING [train.py:1204] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:04,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:48:04,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:48:06,266 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-10-04 18:48:06,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:48:06,887 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.89 vs. limit=12.0 2023-10-04 18:48:07,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-10-04 18:48:07,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:09,218 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-10-04 18:48:10,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-10-04 18:48:11,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:48:11,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:48:13,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-10-04 18:48:13,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 18:48:15,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:48:19,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:48:21,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:21,148 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:48:25,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:48:25,673 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:48:26,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:28,568 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-10-04 18:48:32,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1756560.0, ans=0.1 2023-10-04 18:48:32,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=1756560.0, ans=0.05 2023-10-04 18:48:33,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:48:33,320 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-10-04 18:48:34,625 INFO [train.py:1046] (3/4) Epoch 50, batch 3200, loss[loss=0.1483, simple_loss=0.243, pruned_loss=0.02674, over 24475.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2323, pruned_loss=0.035, over 4703328.25 frames. ], batch size: 69, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:48:36,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:38,933 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:48:38,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-10-04 18:48:40,381 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:48:43,962 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:48:44,218 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:48:44,388 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.94 vs. limit=10.0 2023-10-04 18:48:49,369 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:48:54,867 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.049e+02 2.254e+02 2.662e+02 4.145e+02, threshold=4.508e+02, percent-clipped=0.0 2023-10-04 18:48:56,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:49:01,427 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:49:06,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-10-04 18:49:06,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:49:07,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=1756760.0, ans=0.125 2023-10-04 18:49:09,873 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-10-04 18:49:11,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:49:11,826 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.55 vs. limit=15.0 2023-10-04 18:49:16,423 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:49:16,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:49:16,503 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:49:16,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=1756760.0, ans=0.0 2023-10-04 18:49:20,629 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-10-04 18:49:21,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-10-04 18:49:24,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-10-04 18:49:27,648 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-10-04 18:49:29,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:49:35,708 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:49:35,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 18:49:36,961 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:49:37,024 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-10-04 18:49:37,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 18:49:39,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:49:41,209 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-10-04 18:49:41,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-10-04 18:49:42,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-10-04 18:49:45,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-10-04 18:49:47,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:49:48,592 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.84 vs. limit=22.5 2023-10-04 18:49:48,987 INFO [train.py:1046] (3/4) Epoch 50, batch 3250, loss[loss=0.1476, simple_loss=0.2336, pruned_loss=0.03077, over 23267.00 frames. ], tot_loss[loss=0.1517, simple_loss=0.2329, pruned_loss=0.03524, over 4715561.38 frames. ], batch size: 105, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:49:50,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:49:50,484 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-10-04 18:49:50,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:49:50,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:49:52,481 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.84 vs. limit=22.5 2023-10-04 18:49:53,234 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-10-04 18:49:57,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:50:00,711 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:50:01,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=1756960.0, ans=10.0 2023-10-04 18:50:07,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:50:07,876 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-10-04 18:50:07,960 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:50:09,733 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:50:09,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:50:09,874 WARNING [train.py:1204] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:50:11,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 18:50:13,287 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.61 vs. limit=22.5 2023-10-04 18:50:13,882 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:13,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-10-04 18:50:13,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:50:15,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:15,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:15,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:50:19,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:50:21,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 18:50:23,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:50:23,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:50:24,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:50:24,646 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:50:24,655 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:50:28,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-10-04 18:50:28,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:50:28,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:50:30,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:50:30,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-10-04 18:50:37,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:50:38,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=1757160.0, ans=0.05 2023-10-04 18:50:40,356 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:50:44,925 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:50:44,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:50:44,958 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-10-04 18:50:44,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:50:44,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:50:46,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:50:49,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-10-04 18:50:50,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-10-04 18:50:51,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:50:52,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:50:52,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:50:52,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-10-04 18:50:54,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:50:57,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:50:57,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:50:58,422 WARNING [train.py:1204] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-10-04 18:50:58,433 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:50:58,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=1757226.6666666667, ans=0.2 2023-10-04 18:50:59,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-10-04 18:50:59,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-10-04 18:51:02,934 INFO [train.py:1046] (3/4) Epoch 50, batch 3300, loss[loss=0.1442, simple_loss=0.2303, pruned_loss=0.02902, over 24329.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.2334, pruned_loss=0.03559, over 4713163.56 frames. ], batch size: 61, lr: 2.03e-03, grad_scale: 16.0 2023-10-04 18:51:03,079 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:51:03,088 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-10-04 18:51:05,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-10-04 18:51:05,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-10-04 18:51:05,886 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:51:07,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1757293.3333333333, ans=0.125 2023-10-04 18:51:08,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=1757293.3333333333, ans=0.125 2023-10-04 18:51:10,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:51:12,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:51:12,153 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:12,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1757293.3333333333, ans=0.125 2023-10-04 18:51:14,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 18:51:14,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 18:51:16,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:51:18,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:51:23,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=1757360.0, ans=0.2 2023-10-04 18:51:24,178 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 2.069e+02 2.302e+02 2.665e+02 3.368e+02, threshold=4.603e+02, percent-clipped=0.0 2023-10-04 18:51:24,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-10-04 18:51:25,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:51:25,577 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:51:27,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:28,434 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-10-04 18:51:29,807 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:51:29,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:51:31,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 18:51:31,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:51:31,264 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-10-04 18:51:32,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1757426.6666666667, ans=0.0 2023-10-04 18:51:35,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:51:35,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-10-04 18:51:36,829 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:36,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-10-04 18:51:40,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-10-04 18:51:40,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:40,233 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:51:41,772 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-10-04 18:51:44,919 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-10-04 18:51:44,949 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:51:47,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-10-04 18:51:51,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:51:52,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-10-04 18:51:52,689 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:51:52,844 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-10-04 18:51:55,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:51:55,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:51:55,999 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:51:57,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:51:58,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:51:58,556 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:51:59,883 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:52:00,004 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-10-04 18:52:01,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-10-04 18:52:04,039 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-10-04 18:52:04,092 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:52:04,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:52:06,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:52:06,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:52:08,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1757560.0, ans=0.125 2023-10-04 18:52:10,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:52:11,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:11,505 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-10-04 18:52:11,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:52:12,994 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 18:52:13,820 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.21 vs. limit=15.0 2023-10-04 18:52:14,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-10-04 18:52:15,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:15,880 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:17,881 INFO [train.py:1046] (3/4) Epoch 50, batch 3350, loss[loss=0.1703, simple_loss=0.2411, pruned_loss=0.04978, over 22759.00 frames. ], tot_loss[loss=0.1524, simple_loss=0.234, pruned_loss=0.03537, over 4728374.17 frames. ], batch size: 322, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:52:17,937 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 18:52:17,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:52:19,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:52:20,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:52:20,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:24,107 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:52:27,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:28,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:52:30,194 WARNING [train.py:1204] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:31,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:52:32,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:52:34,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:52:35,648 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-10-04 18:52:37,020 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-10-04 18:52:37,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:52:41,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-10-04 18:52:41,682 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-10-04 18:52:43,055 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 18:52:43,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:52:43,173 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:52:44,526 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-10-04 18:52:44,552 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:44,576 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:52:47,287 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:49,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:50,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:52:50,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:52:52,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=1757760.0, ans=0.125 2023-10-04 18:52:53,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:52:55,530 WARNING [train.py:1204] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:52:56,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:53:01,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:53:01,486 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:53:04,208 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:53:04,216 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:04,835 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.86 vs. limit=10.0 2023-10-04 18:53:05,698 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:07,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=1757826.6666666667, ans=0.125 2023-10-04 18:53:08,385 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-10-04 18:53:08,392 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 18:53:08,426 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-10-04 18:53:08,455 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:53:09,857 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-10-04 18:53:10,505 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.68 vs. limit=12.0 2023-10-04 18:53:11,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:53:12,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:53:20,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:20,742 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-10-04 18:53:20,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:53:22,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:53:23,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:53:27,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:53:30,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1757893.3333333333, ans=0.2 2023-10-04 18:53:31,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-10-04 18:53:31,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=1757960.0, ans=0.125 2023-10-04 18:53:32,605 INFO [train.py:1046] (3/4) Epoch 50, batch 3400, loss[loss=0.132, simple_loss=0.2154, pruned_loss=0.02427, over 22454.00 frames. ], tot_loss[loss=0.1527, simple_loss=0.2346, pruned_loss=0.03537, over 4737216.06 frames. ], batch size: 49, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:53:32,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 18:53:32,680 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:53:32,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:53:34,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-10-04 18:53:34,252 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:53:34,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-10-04 18:53:35,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:53:37,147 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:53:37,183 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-10-04 18:53:38,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:53:38,574 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-10-04 18:53:41,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1757960.0, ans=0.125 2023-10-04 18:53:43,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-10-04 18:53:44,500 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-10-04 18:53:44,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:53:48,838 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:53:48,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 18:53:50,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:53:50,737 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-10-04 18:53:54,666 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.748e+02 2.083e+02 2.309e+02 2.717e+02 5.994e+02, threshold=4.619e+02, percent-clipped=1.0 2023-10-04 18:53:54,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:53:56,707 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-10-04 18:54:01,387 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:54:02,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=1758093.3333333333, ans=10.0 2023-10-04 18:54:04,104 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:54:05,438 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:54:06,739 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-10-04 18:54:08,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1758093.3333333333, ans=0.125 2023-10-04 18:54:10,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:54:15,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-10-04 18:54:19,852 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:54:19,907 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:54:21,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-10-04 18:54:21,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:54:23,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:54:23,044 WARNING [train.py:1204] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:54:24,383 WARNING [train.py:1204] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:54:26,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:54:30,523 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 18:54:30,528 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:54:31,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.10 vs. limit=15.0 2023-10-04 18:54:35,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:54:36,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-10-04 18:54:37,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.58 vs. limit=5.0 2023-10-04 18:54:40,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:54:45,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-10-04 18:54:46,889 INFO [train.py:1046] (3/4) Epoch 50, batch 3450, loss[loss=0.1395, simple_loss=0.2018, pruned_loss=0.03856, over 19773.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2343, pruned_loss=0.03532, over 4728824.59 frames. ], batch size: 388, lr: 2.03e-03, grad_scale: 8.0 2023-10-04 18:54:48,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-10-04 18:54:50,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:54:51,440 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:54:51,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-10-04 18:54:52,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:54:55,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-10-04 18:55:02,215 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:55:02,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:03,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-10-04 18:55:03,630 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:55:03,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1758360.0, ans=0.1 2023-10-04 18:55:06,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:55:12,069 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-10-04 18:55:16,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.63 vs. limit=15.0 2023-10-04 18:55:16,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-10-04 18:55:16,681 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 18:55:16,728 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-10-04 18:55:19,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:24,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-10-04 18:55:24,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:55:28,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:55:28,751 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:55:30,137 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-10-04 18:55:31,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:55:33,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-10-04 18:55:34,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:55:35,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:55:37,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:55:39,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-10-04 18:55:39,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1758493.3333333333, ans=0.0 2023-10-04 18:55:43,591 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 18:55:47,839 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 18:55:49,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:50,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:55:54,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:55:54,633 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 18:55:55,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:55:55,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:55:59,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:56:01,205 INFO [train.py:1046] (3/4) Epoch 50, batch 3500, loss[loss=0.1605, simple_loss=0.2458, pruned_loss=0.03756, over 24374.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2333, pruned_loss=0.03531, over 4725573.70 frames. ], batch size: 77, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 18:56:04,512 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:56:04,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-10-04 18:56:07,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 18:56:10,106 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-10-04 18:56:11,614 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:56:11,628 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-10-04 18:56:16,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-10-04 18:56:17,907 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:56:20,507 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:56:20,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:56:20,542 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-10-04 18:56:20,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:20,620 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:56:20,672 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-10-04 18:56:23,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten.whitening_limit, batch_count=1758693.3333333333, ans=22.5 2023-10-04 18:56:24,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:24,370 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-10-04 18:56:24,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1758693.3333333333, ans=0.07 2023-10-04 18:56:25,599 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.089e+02 2.430e+02 2.862e+02 4.477e+02, threshold=4.860e+02, percent-clipped=0.0 2023-10-04 18:56:26,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.69 vs. limit=15.0 2023-10-04 18:56:27,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:56:30,329 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:31,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-10-04 18:56:31,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:56:34,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:56:35,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-10-04 18:56:37,108 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:38,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 18:56:38,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:56:41,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-10-04 18:56:42,581 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-10-04 18:56:42,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-10-04 18:56:42,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:56:45,915 WARNING [train.py:1204] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:56:47,310 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:56:47,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 18:56:50,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1758826.6666666667, ans=0.1 2023-10-04 18:56:51,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 18:56:51,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 18:56:55,525 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:56:58,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-10-04 18:56:58,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-10-04 18:56:58,721 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:57:00,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=1758893.3333333333, ans=0.2 2023-10-04 18:57:02,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:57:03,318 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:57:04,672 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:57:06,142 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-10-04 18:57:07,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:57:07,654 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:57:08,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-10-04 18:57:10,427 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-10-04 18:57:13,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:57:13,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:57:13,663 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:14,913 INFO [train.py:1046] (3/4) Epoch 50, batch 3550, loss[loss=0.1474, simple_loss=0.2382, pruned_loss=0.02834, over 24663.00 frames. ], tot_loss[loss=0.1508, simple_loss=0.2318, pruned_loss=0.03491, over 4727286.92 frames. ], batch size: 73, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 18:57:14,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:17,750 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-10-04 18:57:26,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:28,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-10-04 18:57:29,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1759026.6666666667, ans=0.125 2023-10-04 18:57:30,867 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:57:33,985 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-10-04 18:57:35,362 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:35,444 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:57:35,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 18:57:39,687 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:57:39,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:57:39,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:39,777 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-10-04 18:57:41,014 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 18:57:48,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-10-04 18:57:48,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-10-04 18:57:49,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:57:49,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:57:49,742 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-10-04 18:57:49,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-10-04 18:57:49,784 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:51,281 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:57:51,367 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-10-04 18:57:52,173 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.41 vs. limit=22.5 2023-10-04 18:57:57,208 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:57:57,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 18:57:58,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:00,559 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-10-04 18:58:01,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-10-04 18:58:03,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-10-04 18:58:04,050 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-10-04 18:58:06,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-10-04 18:58:06,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 18:58:09,868 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-10-04 18:58:09,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:58:16,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:58:17,441 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-10-04 18:58:18,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:58:20,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1759226.6666666667, ans=0.2 2023-10-04 18:58:21,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 18:58:21,815 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-10-04 18:58:23,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=1759226.6666666667, ans=0.125 2023-10-04 18:58:29,018 INFO [train.py:1046] (3/4) Epoch 50, batch 3600, loss[loss=0.1599, simple_loss=0.2446, pruned_loss=0.03762, over 23336.00 frames. ], tot_loss[loss=0.1506, simple_loss=0.2316, pruned_loss=0.0348, over 4728944.59 frames. ], batch size: 93, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 18:58:29,085 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-10-04 18:58:29,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:58:30,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 18:58:31,849 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:58:33,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:58:34,458 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 18:58:38,667 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:58:39,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:41,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-10-04 18:58:42,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-10-04 18:58:43,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:43,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-10-04 18:58:47,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 18:58:48,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:58:49,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:58:52,620 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.822e+02 2.097e+02 2.364e+02 2.970e+02 4.964e+02, threshold=4.728e+02, percent-clipped=1.0 2023-10-04 18:58:54,071 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:58:55,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 18:58:55,504 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 18:58:55,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-10-04 18:58:56,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 18:58:58,493 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-10-04 18:59:00,204 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-10-04 18:59:01,583 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:04,270 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-10-04 18:59:06,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:59:06,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-10-04 18:59:12,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:59:15,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 18:59:15,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-10-04 18:59:18,612 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 18:59:21,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1759493.3333333333, ans=0.125 2023-10-04 18:59:24,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:25,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1759493.3333333333, ans=0.1 2023-10-04 18:59:28,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:32,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-10-04 18:59:32,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 18:59:32,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-10-04 18:59:35,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-10-04 18:59:37,012 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-10-04 18:59:40,787 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-10-04 18:59:42,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 18:59:42,220 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-10-04 18:59:43,484 INFO [train.py:1046] (3/4) Epoch 50, batch 3650, loss[loss=0.1432, simple_loss=0.2189, pruned_loss=0.03376, over 23554.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2324, pruned_loss=0.03493, over 4729525.45 frames. ], batch size: 120, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 18:59:43,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 18:59:43,578 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 18:59:43,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 18:59:43,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-10-04 18:59:46,277 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-10-04 18:59:47,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-10-04 18:59:47,778 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-10-04 18:59:53,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-10-04 18:59:53,987 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-10-04 18:59:58,124 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-10-04 18:59:58,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1759693.3333333333, ans=0.125 2023-10-04 18:59:59,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-10-04 19:00:03,683 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:00:03,684 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:00:03,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:00:08,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-10-04 19:00:08,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:00:09,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-10-04 19:00:10,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:00:10,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:00:10,267 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-10-04 19:00:10,349 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 19:00:11,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:00:11,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:00:11,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:00:13,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-10-04 19:00:14,617 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-10-04 19:00:14,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:00:17,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-10-04 19:00:20,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:00:20,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:00:24,443 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:00:27,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:00:27,164 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:00:27,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:00:28,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:00:30,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:00:31,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=1759826.6666666667, ans=0.125 2023-10-04 19:00:34,712 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:00:36,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:00:36,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:00:36,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1759826.6666666667, ans=0.125 2023-10-04 19:00:38,040 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 19:00:38,117 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:00:39,430 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:00:44,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=1759893.3333333333, ans=15.0 2023-10-04 19:00:46,419 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-10-04 19:00:48,375 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.38 vs. limit=15.0 2023-10-04 19:00:50,894 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:00:50,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:00:50,995 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-10-04 19:00:51,043 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:00:52,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-10-04 19:00:52,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:00:53,881 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-10-04 19:00:53,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:00:56,535 INFO [train.py:1046] (3/4) Epoch 50, batch 3700, loss[loss=0.1938, simple_loss=0.258, pruned_loss=0.06475, over 19400.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.233, pruned_loss=0.03537, over 4724283.12 frames. ], batch size: 388, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:00:57,842 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:00:59,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:00:59,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:01:02,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:01:02,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-10-04 19:01:02,054 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:01:03,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:01:03,456 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:01:11,280 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:01:14,119 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:01:15,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:01:15,545 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:01:16,965 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:01:17,029 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 19:01:18,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:01:18,638 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-10-04 19:01:22,816 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.080e+02 2.404e+02 2.897e+02 4.526e+02, threshold=4.809e+02, percent-clipped=0.0 2023-10-04 19:01:27,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:01:27,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 19:01:27,881 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.47 vs. limit=10.0 2023-10-04 19:01:28,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:01:28,432 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-10-04 19:01:28,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:01:32,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:01:32,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-10-04 19:01:34,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:01:37,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:01:40,637 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:01:40,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:01:42,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:01:42,821 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.90 vs. limit=15.0 2023-10-04 19:01:48,082 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:01:48,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-10-04 19:01:48,142 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:01:48,165 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-10-04 19:01:55,541 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:01:55,570 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:01:55,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1760160.0, ans=0.1 2023-10-04 19:01:57,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:01:57,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-10-04 19:01:59,930 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:01:59,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-10-04 19:01:59,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:01:59,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:02:01,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=1760226.6666666667, ans=0.1 2023-10-04 19:02:05,389 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:02:05,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-10-04 19:02:08,127 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-10-04 19:02:08,176 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:02:08,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:02:09,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:02:09,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:02:12,771 INFO [train.py:1046] (3/4) Epoch 50, batch 3750, loss[loss=0.1673, simple_loss=0.2434, pruned_loss=0.04559, over 23247.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2348, pruned_loss=0.03612, over 4725779.32 frames. ], batch size: 93, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:02:12,933 WARNING [train.py:1204] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:02:16,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:02:16,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:02:18,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-10-04 19:02:19,463 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 19:02:22,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-10-04 19:02:22,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-10-04 19:02:24,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=1760293.3333333333, ans=0.125 2023-10-04 19:02:25,347 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:02:26,744 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:02:26,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:02:28,163 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:02:28,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=1760360.0, ans=0.125 2023-10-04 19:02:29,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:02:32,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:02:32,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1760360.0, ans=0.0 2023-10-04 19:02:33,974 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:02:36,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:02:38,335 WARNING [train.py:1204] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:02:39,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-10-04 19:02:40,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:02:41,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1760426.6666666667, ans=0.0 2023-10-04 19:02:42,946 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:02:44,778 WARNING [train.py:1204] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:02:49,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-10-04 19:02:53,451 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-10-04 19:02:53,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:02:55,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:02:55,682 WARNING [train.py:1204] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:02:59,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1760493.3333333333, ans=0.125 2023-10-04 19:03:01,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:01,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-10-04 19:03:01,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=1760493.3333333333, ans=0.95 2023-10-04 19:03:04,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-10-04 19:03:05,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1760493.3333333333, ans=0.125 2023-10-04 19:03:06,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:07,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.05 vs. limit=15.0 2023-10-04 19:03:10,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:03:10,897 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:03:11,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1760560.0, ans=0.125 2023-10-04 19:03:14,188 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:03:18,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-10-04 19:03:20,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:03:21,638 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:03:22,964 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:03:26,244 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-10-04 19:03:27,428 INFO [train.py:1046] (3/4) Epoch 50, batch 3800, loss[loss=0.19, simple_loss=0.2575, pruned_loss=0.06122, over 19837.00 frames. ], tot_loss[loss=0.1535, simple_loss=0.2348, pruned_loss=0.03608, over 4712995.77 frames. ], batch size: 388, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:03:33,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:03:36,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:03:37,584 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-10-04 19:03:37,649 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-10-04 19:03:39,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:40,466 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:03:40,545 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-10-04 19:03:41,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 19:03:41,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:03:43,798 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:03:45,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=1760693.3333333333, ans=0.09899494936611666 2023-10-04 19:03:46,471 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:03:46,509 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:03:47,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:03:47,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-10-04 19:03:50,493 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.844e+02 2.154e+02 2.528e+02 3.068e+02 4.826e+02, threshold=5.056e+02, percent-clipped=1.0 2023-10-04 19:03:51,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-10-04 19:03:53,822 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:03:53,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:03:54,807 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=6.14 vs. limit=12.0 2023-10-04 19:03:55,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:03:56,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:03:58,207 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-10-04 19:03:58,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:04:02,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:02,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:04:06,371 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 19:04:06,373 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-10-04 19:04:07,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:04:16,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:04:20,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:04:22,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-10-04 19:04:25,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-10-04 19:04:25,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:04:27,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:04:29,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:30,727 WARNING [train.py:1204] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-10-04 19:04:34,808 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-10-04 19:04:34,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-10-04 19:04:34,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:34,943 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:04:39,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:04:40,394 INFO [train.py:1046] (3/4) Epoch 50, batch 3850, loss[loss=0.1408, simple_loss=0.2251, pruned_loss=0.02825, over 24600.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2335, pruned_loss=0.03571, over 4704771.64 frames. ], batch size: 60, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:04:41,820 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:04:45,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:04:47,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-10-04 19:04:48,658 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:04:50,084 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:04:53,354 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:04:56,095 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:04:57,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-10-04 19:04:58,851 WARNING [train.py:1204] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-10-04 19:04:59,682 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.33 vs. limit=6.0 2023-10-04 19:05:03,129 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:03,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.34 vs. limit=15.0 2023-10-04 19:05:04,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:05:07,123 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:05:07,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:05:10,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:11,539 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:05:11,599 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:11,611 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:05:12,980 WARNING [train.py:1204] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:14,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:16,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:16,299 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:05:16,361 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-10-04 19:05:17,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-10-04 19:05:18,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:05:19,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:22,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:22,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:22,205 WARNING [train.py:1204] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-10-04 19:05:25,175 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-10-04 19:05:26,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:27,982 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-10-04 19:05:29,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-10-04 19:05:33,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:33,844 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=1761160.0, ans=0.2 2023-10-04 19:05:35,028 WARNING [train.py:1204] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:05:35,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1761160.0, ans=0.125 2023-10-04 19:05:37,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=1761226.6666666667, ans=0.125 2023-10-04 19:05:39,016 WARNING [train.py:1204] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:40,314 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-10-04 19:05:41,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-10-04 19:05:42,334 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.75 vs. limit=22.5 2023-10-04 19:05:45,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:45,122 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:48,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:05:48,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:05:49,685 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:51,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:51,068 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:05:51,074 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-10-04 19:05:51,186 WARNING [train.py:1204] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:05:52,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-10-04 19:05:52,891 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:54,189 INFO [train.py:1046] (3/4) Epoch 50, batch 3900, loss[loss=0.1719, simple_loss=0.2595, pruned_loss=0.04211, over 23976.00 frames. ], tot_loss[loss=0.1521, simple_loss=0.233, pruned_loss=0.03563, over 4712695.36 frames. ], batch size: 80, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:05:54,240 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:55,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:05:55,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:05:57,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:05:57,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:05:57,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:05:58,419 WARNING [train.py:1204] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:05:58,426 WARNING [train.py:1204] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-10-04 19:05:59,756 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:06:02,608 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:06:02,695 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:06:02,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=1761293.3333333333, ans=0.0 2023-10-04 19:06:04,064 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:06:05,580 WARNING [train.py:1204] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:06:08,340 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:06:08,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:06:09,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:06:12,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-10-04 19:06:12,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:06:13,970 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-10-04 19:06:14,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:06:15,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-10-04 19:06:17,797 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.032e+02 2.234e+02 2.595e+02 4.358e+02, threshold=4.468e+02, percent-clipped=0.0 2023-10-04 19:06:17,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-10-04 19:06:22,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:06:23,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:06:23,954 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:06:24,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:06:26,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:06:29,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:06:30,976 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:06:30,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:06:32,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:06:34,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1761426.6666666667, ans=0.1 2023-10-04 19:06:35,995 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.89 vs. limit=10.0 2023-10-04 19:06:39,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:06:39,350 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:06:46,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:06:46,831 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:06:48,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1761493.3333333333, ans=0.125 2023-10-04 19:06:57,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:06:58,992 WARNING [train.py:1204] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:07:00,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-10-04 19:07:00,311 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-10-04 19:07:00,323 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:07:01,712 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-10-04 19:07:02,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:07:03,826 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.52 vs. limit=15.0 2023-10-04 19:07:04,288 WARNING [train.py:1204] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-10-04 19:07:06,888 INFO [train.py:1046] (3/4) Epoch 50, batch 3950, loss[loss=0.1464, simple_loss=0.2228, pruned_loss=0.03505, over 23385.00 frames. ], tot_loss[loss=0.1515, simple_loss=0.2323, pruned_loss=0.03536, over 4706663.03 frames. ], batch size: 285, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:07:09,945 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:07:11,295 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-10-04 19:07:12,663 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:07:14,196 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 19:07:15,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:07:16,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1761626.6666666667, ans=0.125 2023-10-04 19:07:17,775 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:07:21,076 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-10-04 19:07:22,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:07:22,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-10-04 19:07:22,485 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-10-04 19:07:22,519 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:07:22,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1761693.3333333333, ans=0.125 2023-10-04 19:07:25,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:07:25,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:07:25,719 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:07:28,389 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-10-04 19:07:31,085 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:07:31,158 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:07:32,438 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:07:33,866 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:07:35,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:07:43,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:07:45,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:07:50,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-10-04 19:07:52,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=1761826.6666666667, ans=22.5 2023-10-04 19:07:54,432 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-10-04 19:07:54,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-10-04 19:07:54,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:07:56,415 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:08:02,140 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:08:02,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:08:03,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:08:03,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:08:04,773 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-10-04 19:08:07,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=1761893.3333333333, ans=0.2 2023-10-04 19:08:08,934 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:08:10,284 WARNING [train.py:1204] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:08:15,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-10-04 19:08:21,566 INFO [train.py:1046] (3/4) Epoch 50, batch 4000, loss[loss=0.1443, simple_loss=0.2283, pruned_loss=0.0302, over 24328.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2323, pruned_loss=0.03529, over 4707370.32 frames. ], batch size: 61, lr: 2.02e-03, grad_scale: 32.0 2023-10-04 19:08:23,545 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.76 vs. limit=15.0 2023-10-04 19:08:25,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:08:27,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=1761960.0, ans=0.125 2023-10-04 19:08:31,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1761960.0, ans=0.1 2023-10-04 19:08:32,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:08:38,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:08:38,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:08:38,635 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:08:38,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-10-04 19:08:40,018 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-10-04 19:08:40,082 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-10-04 19:08:40,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:08:40,093 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-10-04 19:08:42,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:08:45,964 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.094e+02 2.391e+02 2.911e+02 5.164e+02, threshold=4.782e+02, percent-clipped=3.0 2023-10-04 19:08:46,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:08:46,083 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:08:46,086 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:08:46,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:08:46,121 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-10-04 19:08:48,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:08:49,550 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-10-04 19:08:50,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:08:50,978 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:08:55,588 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-10-04 19:08:55,665 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 19:08:55,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:09:03,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-10-04 19:09:03,152 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:09:05,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:09:07,124 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-10-04 19:09:08,554 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:09:08,623 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-10-04 19:09:09,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:09:11,170 WARNING [train.py:1204] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:09:12,498 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:09:13,850 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:09:13,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:09:15,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:09:16,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-10-04 19:09:16,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:09:18,077 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-10-04 19:09:18,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=1762226.6666666667, ans=0.125 2023-10-04 19:09:22,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:09:25,688 WARNING [train.py:1204] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-10-04 19:09:27,607 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:09:27,656 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:09:28,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.28 vs. limit=15.0 2023-10-04 19:09:28,989 WARNING [train.py:1204] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:09:30,831 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:09:31,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1762226.6666666667, ans=0.0 2023-10-04 19:09:34,806 INFO [train.py:1046] (3/4) Epoch 50, batch 4050, loss[loss=0.1558, simple_loss=0.2424, pruned_loss=0.03459, over 24657.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2331, pruned_loss=0.03551, over 4719254.95 frames. ], batch size: 73, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:09:34,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:09:37,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1762293.3333333333, ans=0.1 2023-10-04 19:09:39,136 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-10-04 19:09:39,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-10-04 19:09:40,510 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:09:42,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:09:42,090 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:09:43,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:09:44,858 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:09:47,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:09:51,431 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:09:51,475 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-10-04 19:09:52,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:09:52,921 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:09:56,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:09:58,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:10:01,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-10-04 19:10:03,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-10-04 19:10:03,423 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-10-04 19:10:06,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:10:07,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=1762426.6666666667, ans=0.2 2023-10-04 19:10:10,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-10-04 19:10:10,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1762426.6666666667, ans=0.0 2023-10-04 19:10:11,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:10:15,790 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:10:18,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=1762493.3333333333, ans=0.125 2023-10-04 19:10:19,375 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:10:20,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:10:20,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:10:23,387 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:10:26,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-10-04 19:10:26,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 19:10:27,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:10:29,047 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-10-04 19:10:34,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:10:40,152 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-10-04 19:10:40,226 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:10:40,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:10:41,681 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-10-04 19:10:41,689 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-10-04 19:10:41,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:10:44,999 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:10:46,345 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:10:46,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:10:47,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1762626.6666666667, ans=0.1 2023-10-04 19:10:48,837 INFO [train.py:1046] (3/4) Epoch 50, batch 4100, loss[loss=0.1349, simple_loss=0.2072, pruned_loss=0.03133, over 18584.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.2333, pruned_loss=0.03554, over 4711536.14 frames. ], batch size: 40, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:10:55,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-10-04 19:10:56,402 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-10-04 19:10:58,091 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-10-04 19:10:59,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-10-04 19:10:59,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:11:00,771 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:00,803 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:00,815 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:11:02,120 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-10-04 19:11:04,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1762693.3333333333, ans=0.1 2023-10-04 19:11:05,804 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:11:07,217 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:11:07,242 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:11:07,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:11:12,691 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:11:13,954 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.059e+02 2.320e+02 2.888e+02 5.611e+02, threshold=4.640e+02, percent-clipped=1.0 2023-10-04 19:11:14,046 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:11:14,097 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:11:14,118 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-10-04 19:11:16,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:16,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:11:16,106 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:11:16,133 WARNING [train.py:1204] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:11:16,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-10-04 19:11:20,390 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:11:21,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-10-04 19:11:21,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=1762760.0, ans=0.0 2023-10-04 19:11:21,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1762760.0, ans=0.125 2023-10-04 19:11:23,064 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:11:23,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1762760.0, ans=0.0 2023-10-04 19:11:25,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:11:25,817 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-10-04 19:11:27,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:11:27,805 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:11:27,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:11:30,621 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-10-04 19:11:32,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:11:32,101 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:11:35,367 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-10-04 19:11:35,416 WARNING [train.py:1204] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:11:35,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:11:38,588 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:11:42,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:11:45,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:11:47,359 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:11:54,246 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:11:54,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:11:57,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=1762893.3333333333, ans=15.0 2023-10-04 19:11:57,593 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:11:58,941 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:12:03,063 INFO [train.py:1046] (3/4) Epoch 50, batch 4150, loss[loss=0.1603, simple_loss=0.2468, pruned_loss=0.0369, over 24111.00 frames. ], tot_loss[loss=0.1525, simple_loss=0.2342, pruned_loss=0.03541, over 4729324.92 frames. ], batch size: 80, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:12:03,228 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:12:06,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:12:06,504 WARNING [train.py:1204] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:12:06,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:12:09,440 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-10-04 19:12:09,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:12:10,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-10-04 19:12:10,872 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-10-04 19:12:10,885 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-10-04 19:12:12,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:12:15,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:12:15,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:12:20,095 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:12:21,337 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:12:21,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-10-04 19:12:22,823 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:12:23,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1763026.6666666667, ans=0.0 2023-10-04 19:12:24,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:12:25,481 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-10-04 19:12:28,926 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:12:31,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:12:31,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1763093.3333333333, ans=0.125 2023-10-04 19:12:32,692 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.23 vs. limit=22.5 2023-10-04 19:12:33,608 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-10-04 19:12:33,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1763093.3333333333, ans=0.125 2023-10-04 19:12:36,764 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-10-04 19:12:36,769 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:12:36,852 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-10-04 19:12:38,139 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:12:38,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:12:40,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:12:40,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:12:46,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-10-04 19:12:49,538 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-10-04 19:12:50,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:12:51,022 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-10-04 19:12:52,372 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:12:53,714 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-10-04 19:12:55,183 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:12:56,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:12:58,465 WARNING [train.py:1204] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:12:59,811 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-10-04 19:12:59,811 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:12:59,813 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-10-04 19:12:59,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 19:13:01,822 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-10-04 19:13:01,841 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:13:01,845 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:13:01,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:13:03,288 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-10-04 19:13:03,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:13:04,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-10-04 19:13:04,606 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:13:06,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=1763226.6666666667, ans=0.0 2023-10-04 19:13:07,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:13:07,640 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-10-04 19:13:07,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-10-04 19:13:13,184 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:13:16,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-10-04 19:13:17,310 INFO [train.py:1046] (3/4) Epoch 50, batch 4200, loss[loss=0.1373, simple_loss=0.2189, pruned_loss=0.02779, over 24609.00 frames. ], tot_loss[loss=0.1519, simple_loss=0.2331, pruned_loss=0.03538, over 4721596.36 frames. ], batch size: 60, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:13:17,410 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:13:19,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:13:20,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:13:20,897 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:13:20,899 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:13:22,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=1763293.3333333333, ans=0.0 2023-10-04 19:13:23,606 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-10-04 19:13:24,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.79 vs. limit=22.5 2023-10-04 19:13:26,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-10-04 19:13:28,071 WARNING [train.py:1204] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:30,834 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:13:32,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=1763360.0, ans=0.07 2023-10-04 19:13:33,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:13:35,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-10-04 19:13:36,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:13:38,534 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:38,587 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-10-04 19:13:38,597 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:13:40,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1763360.0, ans=0.125 2023-10-04 19:13:41,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:42,639 WARNING [train.py:1204] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:13:42,668 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:13:43,863 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.744e+02 2.065e+02 2.338e+02 2.693e+02 5.755e+02, threshold=4.677e+02, percent-clipped=2.0 2023-10-04 19:13:44,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:13:45,470 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-10-04 19:13:45,488 WARNING [train.py:1204] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:13:50,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-10-04 19:13:51,649 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:13:52,575 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.97 vs. limit=15.0 2023-10-04 19:13:53,090 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:13:54,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:13:56,753 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.74 vs. limit=6.0 2023-10-04 19:13:58,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:13:58,527 WARNING [train.py:1204] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-10-04 19:13:58,554 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:14:00,586 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:14:02,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1763493.3333333333, ans=0.125 2023-10-04 19:14:03,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=1763493.3333333333, ans=0.05 2023-10-04 19:14:05,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:14:06,675 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:14:10,856 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:14:14,131 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-10-04 19:14:16,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:14:21,605 WARNING [train.py:1204] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 19:14:21,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:14:23,051 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-10-04 19:14:27,326 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-10-04 19:14:29,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1763560.0, ans=0.1 2023-10-04 19:14:31,959 INFO [train.py:1046] (3/4) Epoch 50, batch 4250, loss[loss=0.1621, simple_loss=0.2539, pruned_loss=0.03515, over 24364.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2323, pruned_loss=0.03509, over 4720494.97 frames. ], batch size: 77, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:14:32,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:14:32,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-10-04 19:14:35,257 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:14:36,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=1763626.6666666667, ans=0.025 2023-10-04 19:14:38,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=1763626.6666666667, ans=0.125 2023-10-04 19:14:40,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:14:40,768 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-10-04 19:14:40,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:14:43,986 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:14:48,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:14:52,573 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:14:52,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:14:54,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:14:54,033 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:14:55,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=1763693.3333333333, ans=0.0 2023-10-04 19:14:56,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:14:56,780 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:14:58,180 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:15:00,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:15:01,474 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:15:02,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-10-04 19:15:07,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-10-04 19:15:07,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:15:08,626 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:15:08,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:15:10,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:15:10,011 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:15:10,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:15:13,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-10-04 19:15:14,726 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:15:18,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=1763826.6666666667, ans=0.125 2023-10-04 19:15:19,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:15:20,810 WARNING [train.py:1204] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:15:22,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-10-04 19:15:22,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:15:23,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-10-04 19:15:24,904 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:15:27,562 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:15:28,981 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:15:29,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:15:30,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-10-04 19:15:31,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:15:31,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:15:35,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:15:38,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:15:38,518 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:15:41,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:15:42,790 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:15:44,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:15:44,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1763960.0, ans=0.125 2023-10-04 19:15:44,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=1763960.0, ans=0.025 2023-10-04 19:15:45,512 INFO [train.py:1046] (3/4) Epoch 50, batch 4300, loss[loss=0.1477, simple_loss=0.2365, pruned_loss=0.0294, over 24450.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2321, pruned_loss=0.035, over 4734939.78 frames. ], batch size: 69, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:15:45,563 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:15:45,569 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-10-04 19:15:46,301 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.07 vs. limit=10.0 2023-10-04 19:15:47,601 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:15:47,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=1763960.0, ans=0.125 2023-10-04 19:15:51,825 WARNING [train.py:1204] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:15:53,167 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:15:55,280 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.38 vs. limit=15.0 2023-10-04 19:15:56,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=1763960.0, ans=0.125 2023-10-04 19:15:58,574 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:16:01,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1764026.6666666667, ans=0.0 2023-10-04 19:16:02,936 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:16:02,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-10-04 19:16:05,640 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:16:07,511 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:16:07,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:16:07,545 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-10-04 19:16:12,134 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.156e+02 2.411e+02 2.835e+02 5.289e+02, threshold=4.821e+02, percent-clipped=1.0 2023-10-04 19:16:12,254 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 19:16:12,930 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.50 vs. limit=15.0 2023-10-04 19:16:13,647 WARNING [train.py:1204] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:16:15,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-10-04 19:16:15,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:16:17,081 WARNING [train.py:1204] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-10-04 19:16:18,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 19:16:20,379 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:16:23,174 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:16:23,175 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:16:24,487 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:16:24,590 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:16:25,940 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:16:26,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=1764093.3333333333, ans=0.05 2023-10-04 19:16:27,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-10-04 19:16:27,280 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-10-04 19:16:30,105 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:16:32,870 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:32,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 19:16:32,892 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:32,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:16:32,955 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-10-04 19:16:32,956 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-10-04 19:16:34,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-10-04 19:16:36,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:16:36,406 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-10-04 19:16:36,446 WARNING [train.py:1204] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-10-04 19:16:39,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:16:42,456 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-10-04 19:16:42,523 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:16:43,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:16:43,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:16:47,200 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-10-04 19:16:48,560 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:16:48,571 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:48,618 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:16:48,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:16:49,964 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:16:51,392 WARNING [train.py:1204] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:16:53,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=1764226.6666666667, ans=0.0 2023-10-04 19:16:54,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:16:55,847 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:16:55,896 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:17:00,006 INFO [train.py:1046] (3/4) Epoch 50, batch 4350, loss[loss=0.1553, simple_loss=0.2484, pruned_loss=0.03111, over 24327.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2333, pruned_loss=0.03539, over 4725981.58 frames. ], batch size: 74, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:17:01,511 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-10-04 19:17:01,553 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-10-04 19:17:05,885 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:17:07,732 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:17:10,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:17:10,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:17:18,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:17:20,961 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:17:23,011 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:17:24,255 WARNING [train.py:1204] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:17:24,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1764360.0, ans=0.125 2023-10-04 19:17:25,733 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:17:26,317 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.82 vs. limit=12.0 2023-10-04 19:17:27,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:17:28,553 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:17:31,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1764426.6666666667, ans=0.125 2023-10-04 19:17:35,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-10-04 19:17:35,360 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:17:35,436 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:17:41,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:17:44,659 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-10-04 19:17:47,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:17:47,543 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:17:52,272 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-10-04 19:17:52,379 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:17:54,338 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:17:55,683 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-10-04 19:17:55,743 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-10-04 19:17:55,755 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:17:55,780 WARNING [train.py:1204] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:17:56,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=1764493.3333333333, ans=15.0 2023-10-04 19:17:58,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:17:58,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:17:59,795 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:17:59,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:18:01,282 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-10-04 19:18:01,295 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:01,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:18:01,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:02,628 WARNING [train.py:1204] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-10-04 19:18:03,996 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-10-04 19:18:04,001 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-10-04 19:18:05,232 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-10-04 19:18:08,344 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:18:08,369 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:18:09,240 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.87 vs. limit=12.0 2023-10-04 19:18:09,666 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:10,991 WARNING [train.py:1204] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:18:12,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-10-04 19:18:14,254 INFO [train.py:1046] (3/4) Epoch 50, batch 4400, loss[loss=0.1622, simple_loss=0.2389, pruned_loss=0.04279, over 23757.00 frames. ], tot_loss[loss=0.1529, simple_loss=0.2343, pruned_loss=0.03573, over 4726896.15 frames. ], batch size: 232, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:18:14,359 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-10-04 19:18:14,366 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:16,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=1764626.6666666667, ans=0.0 2023-10-04 19:18:17,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:18:17,429 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:18,836 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:18:20,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-10-04 19:18:22,037 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-10-04 19:18:22,068 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-10-04 19:18:22,087 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-10-04 19:18:23,500 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:18:23,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:18:26,588 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-10-04 19:18:29,237 WARNING [train.py:1204] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:30,662 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:30,673 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-10-04 19:18:33,452 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:33,453 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-10-04 19:18:33,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1764693.3333333333, ans=0.125 2023-10-04 19:18:34,936 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-10-04 19:18:36,427 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-10-04 19:18:37,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-10-04 19:18:37,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-10-04 19:18:37,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:39,169 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:18:40,837 WARNING [train.py:1204] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:18:40,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:18:42,814 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.185e+02 2.399e+02 2.720e+02 3.791e+02, threshold=4.798e+02, percent-clipped=0.0 2023-10-04 19:18:42,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-10-04 19:18:42,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-10-04 19:18:44,336 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:44,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1764760.0, ans=0.1 2023-10-04 19:18:45,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:18:45,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:18:47,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:47,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:18:47,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-10-04 19:18:48,704 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-10-04 19:18:51,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:18:59,080 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:19:00,593 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-10-04 19:19:00,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=1764826.6666666667, ans=0.2 2023-10-04 19:19:06,060 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:19:08,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:19:12,182 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:19:12,235 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-10-04 19:19:12,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:19:12,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:19:12,264 WARNING [train.py:1204] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:19:13,589 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:19:15,637 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.36 vs. limit=12.0 2023-10-04 19:19:16,956 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-10-04 19:19:20,979 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-10-04 19:19:21,070 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-10-04 19:19:22,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:19:22,341 WARNING [train.py:1204] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-10-04 19:19:22,434 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:19:25,850 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:19:28,738 INFO [train.py:1046] (3/4) Epoch 50, batch 4450, loss[loss=0.121, simple_loss=0.1968, pruned_loss=0.0226, over 18349.00 frames. ], tot_loss[loss=0.1532, simple_loss=0.2344, pruned_loss=0.03596, over 4730022.19 frames. ], batch size: 39, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:19:28,816 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-10-04 19:19:32,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:19:34,743 WARNING [train.py:1204] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:19:34,800 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:19:41,686 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:19:41,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:19:45,030 WARNING [train.py:1204] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:19:48,262 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:19:49,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:19:51,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:19:51,102 WARNING [train.py:1204] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-10-04 19:19:51,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:19:52,480 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:19:52,520 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:19:52,521 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:19:52,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=1765026.6666666667, ans=0.2 2023-10-04 19:19:54,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=1765026.6666666667, ans=0.125 2023-10-04 19:19:55,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 19:20:00,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:01,678 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:03,023 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:20:03,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:20:04,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:20:07,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-10-04 19:20:08,537 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-10-04 19:20:09,942 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-10-04 19:20:09,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:20:12,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:20:12,833 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-10-04 19:20:16,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:20:19,104 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:20,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-10-04 19:20:20,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:20:20,485 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:20:20,507 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:20:21,844 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:20:21,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:20:24,720 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-10-04 19:20:24,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-10-04 19:20:28,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:20:30,004 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:20:31,388 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:20:32,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:20:32,789 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 19:20:35,540 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:20:37,094 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-10-04 19:20:38,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:20:42,367 INFO [train.py:1046] (3/4) Epoch 50, batch 4500, loss[loss=0.1468, simple_loss=0.2254, pruned_loss=0.0341, over 23698.00 frames. ], tot_loss[loss=0.153, simple_loss=0.2343, pruned_loss=0.03588, over 4726132.95 frames. ], batch size: 149, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:20:43,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:20:43,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=1765293.3333333333, ans=0.125 2023-10-04 19:20:45,922 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-10-04 19:20:45,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-10-04 19:20:46,676 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=4.96 vs. limit=15.0 2023-10-04 19:20:47,353 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:20:54,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:20:54,343 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:20:54,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:20:54,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=1765293.3333333333, ans=0.2 2023-10-04 19:20:55,788 WARNING [train.py:1204] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:20:55,819 WARNING [train.py:1204] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:20:57,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:20:57,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1765360.0, ans=0.1 2023-10-04 19:21:08,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:21:09,975 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.200e+02 2.403e+02 2.900e+02 5.127e+02, threshold=4.806e+02, percent-clipped=1.0 2023-10-04 19:21:10,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:21:11,421 WARNING [train.py:1204] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:21:12,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:21:12,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:21:18,967 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:21:21,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:21:26,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:21:29,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:21:29,251 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-10-04 19:21:30,968 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:21:31,013 WARNING [train.py:1204] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:21:32,431 WARNING [train.py:1204] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:21:33,691 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:21:35,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=1765493.3333333333, ans=0.125 2023-10-04 19:21:36,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:21:36,411 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-10-04 19:21:36,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:21:36,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:21:40,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:21:40,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:21:42,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=1765560.0, ans=0.2 2023-10-04 19:21:45,483 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:21:47,003 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:21:47,023 WARNING [train.py:1204] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:21:47,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=1765560.0, ans=0.2 2023-10-04 19:21:50,323 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-10-04 19:21:51,670 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-10-04 19:21:51,675 WARNING [train.py:1204] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-10-04 19:21:54,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-10-04 19:21:55,726 INFO [train.py:1046] (3/4) Epoch 50, batch 4550, loss[loss=0.1455, simple_loss=0.2124, pruned_loss=0.03927, over 23745.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.233, pruned_loss=0.03569, over 4702996.45 frames. ], batch size: 232, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:21:57,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-10-04 19:21:57,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=1765626.6666666667, ans=0.125 2023-10-04 19:21:58,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:22:01,975 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:22:02,024 WARNING [train.py:1204] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:22:02,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1765626.6666666667, ans=0.125 2023-10-04 19:22:04,038 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:22:09,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:22:10,868 WARNING [train.py:1204] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:22:12,206 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.12 vs. limit=15.0 2023-10-04 19:22:12,763 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:22:12,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:22:12,766 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:15,502 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:22:15,548 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:22:20,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:22:21,468 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-10-04 19:22:21,521 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-10-04 19:22:22,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:22:23,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1765693.3333333333, ans=0.125 2023-10-04 19:22:24,221 WARNING [train.py:1204] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-10-04 19:22:28,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-10-04 19:22:29,506 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:22:34,794 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-10-04 19:22:34,925 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:22:35,374 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.38 vs. limit=15.0 2023-10-04 19:22:38,948 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:38,985 WARNING [train.py:1204] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:39,002 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:22:39,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1765826.6666666667, ans=0.0 2023-10-04 19:22:40,435 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-10-04 19:22:42,449 WARNING [train.py:1204] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:22:43,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:45,212 WARNING [train.py:1204] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:22:46,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:22:47,977 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-10-04 19:22:48,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-10-04 19:22:49,656 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:22:49,754 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-10-04 19:22:51,256 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-10-04 19:22:51,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:22:52,667 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:22:52,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:22:54,129 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:22:54,141 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:22:55,633 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 19:22:56,966 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-10-04 19:22:58,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:22:58,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 19:22:58,307 WARNING [train.py:1204] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-10-04 19:22:58,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:22:58,330 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-10-04 19:23:02,918 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:23:02,940 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:23:05,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1765893.3333333333, ans=0.0 2023-10-04 19:23:06,326 WARNING [train.py:1204] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:23:06,384 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:23:06,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-10-04 19:23:07,853 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:23:09,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:23:10,994 INFO [train.py:1046] (3/4) Epoch 50, batch 4600, loss[loss=0.159, simple_loss=0.2529, pruned_loss=0.03259, over 24596.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2311, pruned_loss=0.03566, over 4680290.24 frames. ], batch size: 73, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:23:12,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:13,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:23:16,552 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:23:16,572 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:23:18,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:23:18,585 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-10-04 19:23:20,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:23:23,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:23:23,997 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:23:26,746 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:32,748 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-10-04 19:23:32,842 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:36,192 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:38,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=1766026.6666666667, ans=0.0 2023-10-04 19:23:38,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=1766026.6666666667, ans=0.125 2023-10-04 19:23:39,343 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.716e+02 2.190e+02 2.507e+02 2.915e+02 5.152e+02, threshold=5.014e+02, percent-clipped=2.0 2023-10-04 19:23:39,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:23:39,492 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:23:44,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-10-04 19:23:44,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:23:45,679 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.49 vs. limit=22.5 2023-10-04 19:23:46,396 WARNING [train.py:1204] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:23:52,368 WARNING [train.py:1204] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:23:52,409 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:23:53,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:23:57,905 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-10-04 19:23:57,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-10-04 19:24:02,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:04,062 WARNING [train.py:1204] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:24:05,478 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:05,479 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-10-04 19:24:07,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:24:07,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-10-04 19:24:08,759 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:08,814 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:24:10,838 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:11,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1766226.6666666667, ans=0.2 2023-10-04 19:24:12,198 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:24:12,273 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:24:13,660 WARNING [train.py:1204] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-10-04 19:24:13,704 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-10-04 19:24:13,734 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-10-04 19:24:13,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:24:15,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:24:16,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:24:18,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:24:25,040 INFO [train.py:1046] (3/4) Epoch 50, batch 4650, loss[loss=0.1551, simple_loss=0.2491, pruned_loss=0.03057, over 24548.00 frames. ], tot_loss[loss=0.1507, simple_loss=0.2305, pruned_loss=0.03548, over 4681159.79 frames. ], batch size: 71, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:24:26,537 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:24:29,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:24:29,317 WARNING [train.py:1204] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:24:29,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:24:29,402 WARNING [train.py:1204] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:24:30,685 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:24:30,779 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:24:32,325 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-10-04 19:24:35,352 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-10-04 19:24:40,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:24:41,997 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-10-04 19:24:42,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=1766360.0, ans=0.04949747468305833 2023-10-04 19:24:43,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:24:44,641 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-10-04 19:24:44,669 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:24:46,045 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-10-04 19:24:46,063 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-10-04 19:24:46,072 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:47,407 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:24:48,977 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:24:50,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:24:50,773 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-10-04 19:24:54,841 WARNING [train.py:1204] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:24:56,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-10-04 19:24:57,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1766426.6666666667, ans=0.0 2023-10-04 19:24:59,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:24:59,058 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:25:00,364 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-10-04 19:25:01,855 WARNING [train.py:1204] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:25:04,721 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:25:07,973 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:25:12,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:25:14,331 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:25:14,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=1766493.3333333333, ans=0.0 2023-10-04 19:25:15,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:25:16,952 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:25:19,694 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-10-04 19:25:19,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-10-04 19:25:21,055 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-10-04 19:25:21,056 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-10-04 19:25:23,010 WARNING [train.py:1204] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:25:28,598 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:25:28,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:25:29,937 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-10-04 19:25:29,950 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:25:31,413 WARNING [train.py:1204] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:25:31,418 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:25:32,900 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:25:35,643 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:25:35,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:25:35,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:25:35,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=1766560.0, ans=0.0 2023-10-04 19:25:38,786 INFO [train.py:1046] (3/4) Epoch 50, batch 4700, loss[loss=0.1505, simple_loss=0.2278, pruned_loss=0.03658, over 23763.00 frames. ], tot_loss[loss=0.1512, simple_loss=0.2315, pruned_loss=0.03546, over 4702115.75 frames. ], batch size: 179, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:25:38,957 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:25:38,988 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:25:38,996 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:25:40,931 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-10-04 19:25:42,343 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-10-04 19:25:43,725 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-10-04 19:25:50,114 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:25:51,418 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:25:51,467 WARNING [train.py:1204] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:25:53,394 WARNING [train.py:1204] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:25:56,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-10-04 19:26:01,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-10-04 19:26:01,535 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-10-04 19:26:01,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1766693.3333333333, ans=0.2 2023-10-04 19:26:04,322 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:26:05,637 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:26:07,001 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.151e+02 2.397e+02 2.969e+02 5.110e+02, threshold=4.793e+02, percent-clipped=1.0 2023-10-04 19:26:07,072 WARNING [train.py:1204] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:26:10,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:26:15,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:26:15,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-10-04 19:26:18,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:26:18,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1766760.0, ans=0.0 2023-10-04 19:26:22,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-10-04 19:26:24,602 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:26:26,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:31,529 WARNING [train.py:1204] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-10-04 19:26:32,883 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:26:35,041 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.36 vs. limit=10.0 2023-10-04 19:26:35,695 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:26:37,496 WARNING [train.py:1204] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-10-04 19:26:38,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:38,894 WARNING [train.py:1204] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:26:40,483 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:26:41,863 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:26:41,879 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-10-04 19:26:41,953 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-10-04 19:26:43,316 WARNING [train.py:1204] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:26:45,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:45,334 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:45,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-10-04 19:26:46,678 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:26:48,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1766893.3333333333, ans=0.125 2023-10-04 19:26:50,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-10-04 19:26:54,060 INFO [train.py:1046] (3/4) Epoch 50, batch 4750, loss[loss=0.1547, simple_loss=0.2318, pruned_loss=0.03876, over 18169.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.2319, pruned_loss=0.03543, over 4712307.82 frames. ], batch size: 39, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:26:54,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:26:56,009 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:00,172 WARNING [train.py:1204] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:01,437 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:27:04,109 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-10-04 19:27:04,159 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:27:04,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=1766960.0, ans=0.0 2023-10-04 19:27:07,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-10-04 19:27:09,117 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:27:09,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:27:10,472 WARNING [train.py:1204] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:27:14,687 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-10-04 19:27:20,761 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:27:22,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=1767093.3333333333, ans=0.125 2023-10-04 19:27:23,346 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-10-04 19:27:23,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:27:26,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:27:26,715 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:27:26,741 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:28,106 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-10-04 19:27:28,110 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-10-04 19:27:34,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-10-04 19:27:36,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:27:39,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:27:41,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:27:41,156 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-10-04 19:27:41,160 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:27:44,365 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:27:45,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:27:48,575 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-10-04 19:27:48,609 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-10-04 19:27:49,971 WARNING [train.py:1204] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:27:49,993 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:27:50,031 WARNING [train.py:1204] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:27:51,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:27:51,408 WARNING [train.py:1204] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-10-04 19:27:53,456 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-10-04 19:27:56,693 WARNING [train.py:1204] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:27:58,355 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:27:58,357 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-10-04 19:27:59,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:28:02,796 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:04,151 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-10-04 19:28:04,198 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:05,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-10-04 19:28:05,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=1767226.6666666667, ans=0.125 2023-10-04 19:28:08,316 INFO [train.py:1046] (3/4) Epoch 50, batch 4800, loss[loss=0.1355, simple_loss=0.2151, pruned_loss=0.02793, over 21973.00 frames. ], tot_loss[loss=0.1516, simple_loss=0.2323, pruned_loss=0.03542, over 4719710.02 frames. ], batch size: 48, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:28:08,360 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:28:08,395 WARNING [train.py:1204] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-10-04 19:28:09,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-10-04 19:28:10,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=1767293.3333333333, ans=0.125 2023-10-04 19:28:11,140 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-10-04 19:28:14,397 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:28:14,420 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:28:14,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-10-04 19:28:18,843 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:20,197 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:24,416 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-10-04 19:28:26,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:28:26,315 WARNING [train.py:1204] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:28,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-10-04 19:28:28,256 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:28:28,305 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:28:29,736 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:28:34,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:28:35,731 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:35,763 WARNING [train.py:1204] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:28:37,018 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.867e+02 2.213e+02 2.544e+02 3.489e+02 5.983e+02, threshold=5.088e+02, percent-clipped=6.0 2023-10-04 19:28:37,154 WARNING [train.py:1204] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:37,163 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-10-04 19:28:37,178 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:38,510 WARNING [train.py:1204] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:28:39,990 WARNING [train.py:1204] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:28:43,382 WARNING [train.py:1204] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:44,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=1767426.6666666667, ans=0.09899494936611666 2023-10-04 19:28:46,077 WARNING [train.py:1204] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:28:46,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:28:47,455 WARNING [train.py:1204] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-10-04 19:28:48,798 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:50,187 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-10-04 19:28:50,205 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-10-04 19:28:51,664 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:28:51,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:28:51,724 WARNING [train.py:1204] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:28:51,730 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:28:51,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:28:53,185 WARNING [train.py:1204] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:28:53,225 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:28:53,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=1767493.3333333333, ans=0.0 2023-10-04 19:28:53,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=1767493.3333333333, ans=0.0 2023-10-04 19:28:56,974 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:28:58,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:00,923 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:03,292 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.67 vs. limit=6.0 2023-10-04 19:29:05,067 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-10-04 19:29:05,096 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:29:05,134 WARNING [train.py:1204] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:05,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:29:06,497 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:29:10,622 WARNING [train.py:1204] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:29:10,696 WARNING [train.py:1204] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:29:10,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:10,760 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:29:12,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:29:14,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:29:16,939 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:16,947 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:18,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:29:18,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=1767560.0, ans=0.125 2023-10-04 19:29:19,596 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-10-04 19:29:19,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1767560.0, ans=0.125 2023-10-04 19:29:21,052 WARNING [train.py:1204] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-10-04 19:29:21,057 WARNING [train.py:1204] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:29:21,061 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:29:22,385 INFO [train.py:1046] (3/4) Epoch 50, batch 4850, loss[loss=0.1559, simple_loss=0.2399, pruned_loss=0.03597, over 24433.00 frames. ], tot_loss[loss=0.152, simple_loss=0.2327, pruned_loss=0.03566, over 4716822.66 frames. ], batch size: 63, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:29:22,497 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:29:22,498 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:27,696 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:29:33,863 WARNING [train.py:1204] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-10-04 19:29:33,970 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:38,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=1767693.3333333333, ans=0.0 2023-10-04 19:29:39,626 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:29:41,019 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-10-04 19:29:41,059 WARNING [train.py:1204] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:29:44,231 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:29:45,547 WARNING [train.py:1204] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:29:48,177 WARNING [train.py:1204] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:29:48,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-10-04 19:29:51,035 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:29:53,718 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:29:53,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-10-04 19:29:55,167 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-10-04 19:29:55,171 WARNING [train.py:1204] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-10-04 19:29:57,002 WARNING [train.py:1204] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:29:58,752 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:02,793 WARNING [train.py:1204] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:02,809 WARNING [train.py:1204] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-10-04 19:30:02,859 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-10-04 19:30:04,266 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:30:08,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=1767826.6666666667, ans=0.2 2023-10-04 19:30:11,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:30:12,447 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-10-04 19:30:13,749 WARNING [train.py:1204] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:30:13,761 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:30:15,239 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:30:17,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-10-04 19:30:17,337 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:18,776 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-10-04 19:30:18,801 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:30:20,220 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:30:20,286 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-10-04 19:30:29,555 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:30:29,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1767893.3333333333, ans=0.0 2023-10-04 19:30:34,357 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:30:35,636 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:30:36,960 INFO [train.py:1046] (3/4) Epoch 50, batch 4900, loss[loss=0.1604, simple_loss=0.245, pruned_loss=0.03787, over 24666.00 frames. ], tot_loss[loss=0.1514, simple_loss=0.232, pruned_loss=0.03537, over 4723215.10 frames. ], batch size: 68, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:30:39,828 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-10-04 19:30:39,830 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:30:44,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:30:45,740 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:30:45,770 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:30:49,029 WARNING [train.py:1204] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-10-04 19:30:53,412 WARNING [train.py:1204] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-10-04 19:30:57,020 WARNING [train.py:1204] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-10-04 19:30:57,568 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.03 vs. limit=15.0 2023-10-04 19:30:58,386 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-10-04 19:31:00,170 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:31:00,206 WARNING [train.py:1204] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:31:02,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:31:02,196 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:31:02,203 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-10-04 19:31:02,261 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-10-04 19:31:03,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1768026.6666666667, ans=0.1 2023-10-04 19:31:05,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-10-04 19:31:05,081 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:31:06,262 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.714e+02 2.161e+02 2.469e+02 3.032e+02 6.207e+02, threshold=4.938e+02, percent-clipped=2.0 2023-10-04 19:31:07,745 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:31:09,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-10-04 19:31:10,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:31:10,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1768093.3333333333, ans=0.1 2023-10-04 19:31:11,865 WARNING [train.py:1204] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:31:13,291 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:31:13,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-10-04 19:31:16,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:31:17,378 WARNING [train.py:1204] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:31:17,393 WARNING [train.py:1204] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-10-04 19:31:17,398 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-10-04 19:31:20,536 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-10-04 19:31:20,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=1768160.0, ans=0.2 2023-10-04 19:31:22,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:31:22,116 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:31:22,156 WARNING [train.py:1204] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:31:23,476 WARNING [train.py:1204] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:31:23,519 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-10-04 19:31:23,533 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:31:23,571 WARNING [train.py:1204] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-10-04 19:31:23,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1768160.0, ans=0.1 2023-10-04 19:31:25,994 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.74 vs. limit=22.5 2023-10-04 19:31:27,910 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:31:29,322 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-10-04 19:31:31,265 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:31:33,332 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-10-04 19:31:34,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:31:34,738 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-10-04 19:31:34,783 WARNING [train.py:1204] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-10-04 19:31:39,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:31:40,359 WARNING [train.py:1204] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:31:41,703 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-10-04 19:31:41,710 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 19:31:41,716 WARNING [train.py:1204] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:31:43,130 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:31:47,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:31:47,857 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:31:47,887 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:31:47,913 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-10-04 19:31:49,318 WARNING [train.py:1204] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 19:31:51,960 INFO [train.py:1046] (3/4) Epoch 50, batch 4950, loss[loss=0.155, simple_loss=0.2467, pruned_loss=0.03167, over 24343.00 frames. ], tot_loss[loss=0.1506, simple_loss=0.2309, pruned_loss=0.03516, over 4712651.43 frames. ], batch size: 74, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:31:52,044 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:31:52,066 WARNING [train.py:1204] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-10-04 19:31:57,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-10-04 19:31:57,155 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-10-04 19:31:57,181 WARNING [train.py:1204] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:31:58,524 WARNING [train.py:1204] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-10-04 19:31:58,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:31:58,557 WARNING [train.py:1204] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:32:00,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-10-04 19:32:00,516 WARNING [train.py:1204] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:03,391 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:32:03,441 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:32:06,094 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:32:07,481 WARNING [train.py:1204] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:32:08,898 WARNING [train.py:1204] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:08,911 WARNING [train.py:1204] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:32:13,032 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-10-04 19:32:15,935 WARNING [train.py:1204] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:18,008 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:32:19,395 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:19,457 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:20,755 WARNING [train.py:1204] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:32:23,471 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-10-04 19:32:23,539 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-10-04 19:32:26,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:28,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:32:28,674 WARNING [train.py:1204] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:32:30,005 WARNING [train.py:1204] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:32:30,015 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:32:31,350 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-10-04 19:32:32,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:32:34,650 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:32:36,007 WARNING [train.py:1204] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:32:37,390 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:32:37,425 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:38,765 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-10-04 19:32:38,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:32:38,910 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-10-04 19:32:41,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1768493.3333333333, ans=0.0 2023-10-04 19:32:43,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:32:44,475 WARNING [train.py:1204] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:32:46,237 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:32:46,283 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:32:47,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:32:47,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:32:50,301 WARNING [train.py:1204] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:32:50,355 WARNING [train.py:1204] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-10-04 19:32:50,391 WARNING [train.py:1204] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:32:51,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-10-04 19:32:56,450 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:33:03,065 WARNING [train.py:1204] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-10-04 19:33:03,091 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-10-04 19:33:03,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=1768560.0, ans=0.125 2023-10-04 19:33:07,077 INFO [train.py:1046] (3/4) Epoch 50, batch 5000, loss[loss=0.1526, simple_loss=0.244, pruned_loss=0.03059, over 24337.00 frames. ], tot_loss[loss=0.1505, simple_loss=0.2308, pruned_loss=0.03512, over 4714023.62 frames. ], batch size: 74, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:33:08,750 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:33:08,758 WARNING [train.py:1204] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:33:10,103 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-10-04 19:33:11,469 WARNING [train.py:1204] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-10-04 19:33:12,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:33:14,258 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-10-04 19:33:14,287 WARNING [train.py:1204] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-10-04 19:33:14,297 WARNING [train.py:1204] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-10-04 19:33:16,270 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-10-04 19:33:16,306 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:33:17,627 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:33:18,995 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-10-04 19:33:18,998 WARNING [train.py:1204] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:33:19,042 WARNING [train.py:1204] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:33:19,161 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-10-04 19:33:19,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=1768626.6666666667, ans=0.125 2023-10-04 19:33:20,380 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-10-04 19:33:21,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:33:21,826 WARNING [train.py:1204] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-10-04 19:33:21,835 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:33:23,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:24,516 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:33:24,517 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-10-04 19:33:24,525 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-10-04 19:33:25,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-10-04 19:33:25,967 WARNING [train.py:1204] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:33:27,903 WARNING [train.py:1204] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:29,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-10-04 19:33:29,329 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:33:31,304 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:32,598 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:33:34,514 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-10-04 19:33:35,939 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-10-04 19:33:37,193 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.194e+02 2.487e+02 3.099e+02 5.334e+02, threshold=4.974e+02, percent-clipped=1.0 2023-10-04 19:33:37,293 WARNING [train.py:1204] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:33:38,717 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:33:42,773 WARNING [train.py:1204] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-10-04 19:33:44,309 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-10-04 19:33:46,339 WARNING [train.py:1204] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:33:46,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:33:49,189 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-10-04 19:33:50,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:33:50,460 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:33:51,723 WARNING [train.py:1204] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:33:53,201 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-10-04 19:33:53,247 WARNING [train.py:1204] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:33:57,430 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-10-04 19:33:57,515 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:34:00,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=1768826.6666666667, ans=0.125 2023-10-04 19:34:03,654 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-10-04 19:34:07,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:08,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=1768893.3333333333, ans=0.0 2023-10-04 19:34:17,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:34:18,650 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:18,657 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:34:18,686 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:34:19,958 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:34:19,987 WARNING [train.py:1204] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-10-04 19:34:20,028 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:21,301 INFO [train.py:1046] (3/4) Epoch 50, batch 5050, loss[loss=0.1455, simple_loss=0.238, pruned_loss=0.02648, over 24644.00 frames. ], tot_loss[loss=0.1506, simple_loss=0.2312, pruned_loss=0.03503, over 4714079.40 frames. ], batch size: 68, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:34:25,480 WARNING [train.py:1204] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:34:25,508 WARNING [train.py:1204] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-10-04 19:34:26,905 WARNING [train.py:1204] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:34:28,366 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:34:28,454 WARNING [train.py:1204] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:34:29,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-10-04 19:34:31,747 WARNING [train.py:1204] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:34:33,078 WARNING [train.py:1204] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:34:35,701 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-10-04 19:34:37,632 WARNING [train.py:1204] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-10-04 19:34:37,677 WARNING [train.py:1204] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:34:46,424 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-10-04 19:34:47,703 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-10-04 19:34:47,781 WARNING [train.py:1204] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:34:49,234 WARNING [train.py:1204] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-10-04 19:34:49,259 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:34:50,615 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:34:50,644 WARNING [train.py:1204] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:34:50,690 WARNING [train.py:1204] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:34:50,693 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-10-04 19:34:54,009 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-10-04 19:34:54,101 WARNING [train.py:1204] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:34:56,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:34:59,735 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:34:59,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-10-04 19:35:01,166 WARNING [train.py:1204] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:35:04,534 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-10-04 19:35:05,800 WARNING [train.py:1204] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:35:05,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-10-04 19:35:07,230 WARNING [train.py:1204] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:35:07,306 WARNING [train.py:1204] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:35:09,279 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:35:12,376 WARNING [train.py:1204] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:35:12,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=1769160.0, ans=0.125 2023-10-04 19:35:13,692 WARNING [train.py:1204] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:13,720 WARNING [train.py:1204] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:35:13,729 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:35:15,053 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-10-04 19:35:16,410 WARNING [train.py:1204] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-10-04 19:35:17,832 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-10-04 19:35:20,683 WARNING [train.py:1204] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:35:20,691 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-10-04 19:35:20,706 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-10-04 19:35:22,075 WARNING [train.py:1204] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:35:22,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:22,142 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-10-04 19:35:25,484 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:35:25,495 WARNING [train.py:1204] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-10-04 19:35:25,496 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:29,661 WARNING [train.py:1204] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:35:31,026 WARNING [train.py:1204] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:35:31,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-10-04 19:35:32,417 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-10-04 19:35:34,222 WARNING [train.py:1204] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:35:34,241 WARNING [train.py:1204] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:35:34,278 WARNING [train.py:1204] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-10-04 19:35:35,539 INFO [train.py:1046] (3/4) Epoch 50, batch 5100, loss[loss=0.1417, simple_loss=0.2211, pruned_loss=0.03118, over 23269.00 frames. ], tot_loss[loss=0.1511, simple_loss=0.2318, pruned_loss=0.03522, over 4724439.47 frames. ], batch size: 105, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:35:37,678 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=2.20 vs. limit=6.0 2023-10-04 19:35:38,311 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-10-04 19:35:41,551 WARNING [train.py:1204] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-10-04 19:35:44,779 WARNING [train.py:1204] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-10-04 19:35:44,835 WARNING [train.py:1204] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-10-04 19:35:44,890 WARNING [train.py:1204] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:35:47,582 WARNING [train.py:1204] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:35:50,276 WARNING [train.py:1204] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:35:50,327 WARNING [train.py:1204] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-10-04 19:35:50,349 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-10-04 19:35:54,878 WARNING [train.py:1204] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:35:54,921 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-10-04 19:35:59,202 WARNING [train.py:1204] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:36:00,585 WARNING [train.py:1204] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-10-04 19:36:01,954 WARNING [train.py:1204] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:36:04,714 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.080e+02 2.334e+02 2.937e+02 4.893e+02, threshold=4.667e+02, percent-clipped=0.0 2023-10-04 19:36:04,795 WARNING [train.py:1204] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:36:04,812 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-10-04 19:36:08,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:08,218 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:08,223 WARNING [train.py:1204] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-10-04 19:36:11,453 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-10-04 19:36:12,804 WARNING [train.py:1204] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:12,853 WARNING [train.py:1204] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-10-04 19:36:12,862 WARNING [train.py:1204] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-10-04 19:36:16,953 WARNING [train.py:1204] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:36:20,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=1769493.3333333333, ans=0.1 2023-10-04 19:36:24,609 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:36:26,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=1769493.3333333333, ans=0.0 2023-10-04 19:36:27,938 WARNING [train.py:1204] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-10-04 19:36:27,970 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-10-04 19:36:27,978 WARNING [train.py:1204] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-10-04 19:36:28,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=1769493.3333333333, ans=0.125 2023-10-04 19:36:29,405 WARNING [train.py:1204] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-10-04 19:36:29,407 WARNING [train.py:1204] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:36:33,753 WARNING [train.py:1204] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-10-04 19:36:34,622 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.45 vs. limit=15.0 2023-10-04 19:36:37,158 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-10-04 19:36:39,904 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-10-04 19:36:40,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=1769560.0, ans=0.0 2023-10-04 19:36:41,261 WARNING [train.py:1204] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-10-04 19:36:44,492 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-10-04 19:36:46,070 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-10-04 19:36:46,114 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-10-04 19:36:47,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=1769560.0, ans=0.125 2023-10-04 19:36:50,140 INFO [train.py:1046] (3/4) Epoch 50, batch 5150, loss[loss=0.1471, simple_loss=0.2294, pruned_loss=0.03242, over 24611.00 frames. ], tot_loss[loss=0.1523, simple_loss=0.233, pruned_loss=0.03574, over 4724976.56 frames. ], batch size: 60, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:36:52,025 WARNING [train.py:1204] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:36:52,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:36:52,043 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-10-04 19:36:53,313 WARNING [train.py:1204] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:36:53,333 WARNING [train.py:1204] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-10-04 19:36:53,399 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:36:54,844 WARNING [train.py:1204] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-10-04 19:36:54,846 WARNING [train.py:1204] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-10-04 19:36:54,888 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-10-04 19:36:54,913 WARNING [train.py:1204] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-10-04 19:36:54,923 WARNING [train.py:1204] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-10-04 19:36:55,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=1769626.6666666667, ans=0.2 2023-10-04 19:36:56,341 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:36:57,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-10-04 19:37:00,810 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:37:00,912 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:37:05,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-10-04 19:37:05,143 WARNING [train.py:1204] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-10-04 19:37:07,155 WARNING [train.py:1204] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:37:07,204 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-10-04 19:37:09,708 WARNING [train.py:1204] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-10-04 19:37:09,709 WARNING [train.py:1204] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:37:09,725 WARNING [train.py:1204] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:37:09,782 WARNING [train.py:1204] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-10-04 19:37:09,786 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-10-04 19:37:11,126 WARNING [train.py:1204] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-10-04 19:37:12,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-10-04 19:37:12,538 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:37:14,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-10-04 19:37:16,067 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-10-04 19:37:17,428 WARNING [train.py:1204] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-10-04 19:37:21,973 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-10-04 19:37:24,564 WARNING [train.py:1204] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-10-04 19:37:28,717 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:37:32,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1769760.0, ans=0.125 2023-10-04 19:37:36,236 WARNING [train.py:1204] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:37:37,624 WARNING [train.py:1204] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:37:41,006 WARNING [train.py:1204] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:37:41,053 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:37:43,767 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-10-04 19:37:47,057 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:37:48,433 WARNING [train.py:1204] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-10-04 19:37:48,466 WARNING [train.py:1204] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-10-04 19:37:53,034 WARNING [train.py:1204] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:37:53,125 WARNING [train.py:1204] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:37:54,461 WARNING [train.py:1204] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-10-04 19:37:58,738 WARNING [train.py:1204] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:37:58,832 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-10-04 19:38:00,212 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:38:00,229 WARNING [train.py:1204] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-10-04 19:38:02,046 WARNING [train.py:1204] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-10-04 19:38:02,076 WARNING [train.py:1204] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-10-04 19:38:02,086 WARNING [train.py:1204] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-10-04 19:38:03,345 WARNING [train.py:1204] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:38:04,697 INFO [train.py:1046] (3/4) Epoch 50, batch 5200, loss[loss=0.1585, simple_loss=0.2436, pruned_loss=0.0367, over 23980.00 frames. ], tot_loss[loss=0.1526, simple_loss=0.2335, pruned_loss=0.03584, over 4724156.91 frames. ], batch size: 86, lr: 2.02e-03, grad_scale: 16.0 2023-10-04 19:38:06,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:38:08,784 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:38:10,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=1769960.0, ans=0.0 2023-10-04 19:38:12,022 WARNING [train.py:1204] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:13,549 WARNING [train.py:1204] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-10-04 19:38:13,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=1769960.0, ans=0.125 2023-10-04 19:38:15,491 WARNING [train.py:1204] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-10-04 19:38:15,550 WARNING [train.py:1204] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:38:16,962 WARNING [train.py:1204] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:18,309 WARNING [train.py:1204] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-10-04 19:38:18,328 WARNING [train.py:1204] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:38:20,959 WARNING [train.py:1204] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-10-04 19:38:24,089 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-10-04 19:38:24,144 WARNING [train.py:1204] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:38:28,324 WARNING [train.py:1204] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-10-04 19:38:29,762 WARNING [train.py:1204] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-10-04 19:38:31,653 WARNING [train.py:1204] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-10-04 19:38:31,704 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-10-04 19:38:33,010 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-10-04 19:38:34,487 WARNING [train.py:1204] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-10-04 19:38:35,662 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.266e+02 2.653e+02 3.411e+02 5.060e+02, threshold=5.305e+02, percent-clipped=4.0 2023-10-04 19:38:37,110 WARNING [train.py:1204] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:38:37,114 WARNING [train.py:1204] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-10-04 19:38:37,121 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-10-04 19:38:37,242 WARNING [train.py:1204] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:38:38,457 WARNING [train.py:1204] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-10-04 19:38:38,502 WARNING [train.py:1204] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-10-04 19:38:39,860 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:38:41,888 WARNING [train.py:1204] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:46,014 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-10-04 19:38:46,060 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-10-04 19:38:46,099 WARNING [train.py:1204] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-10-04 19:38:50,719 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-10-04 19:38:52,092 WARNING [train.py:1204] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-10-04 19:38:56,774 WARNING [train.py:1204] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-10-04 19:38:56,802 WARNING [train.py:1204] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:38:58,250 WARNING [train.py:1204] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-10-04 19:38:59,642 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-10-04 19:38:59,673 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-10-04 19:38:59,676 WARNING [train.py:1204] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:38:59,713 WARNING [train.py:1204] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:39:04,178 WARNING [train.py:1204] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:39:04,275 WARNING [train.py:1204] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-10-04 19:39:08,351 WARNING [train.py:1204] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-10-04 19:39:08,442 WARNING [train.py:1204] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:39:08,442 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:39:09,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=1770226.6666666667, ans=0.07 2023-10-04 19:39:13,041 WARNING [train.py:1204] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:39:14,396 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-10-04 19:39:15,759 WARNING [train.py:1204] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-10-04 19:39:15,777 WARNING [train.py:1204] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-10-04 19:39:17,188 WARNING [train.py:1204] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:39:18,830 INFO [train.py:1046] (3/4) Epoch 50, batch 5250, loss[loss=0.1477, simple_loss=0.2342, pruned_loss=0.03063, over 24650.00 frames. ], tot_loss[loss=0.1522, simple_loss=0.233, pruned_loss=0.03568, over 4724030.66 frames. ], batch size: 65, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:39:18,893 WARNING [train.py:1204] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-10-04 19:39:19,000 WARNING [train.py:1204] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-10-04 19:39:21,715 WARNING [train.py:1204] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-10-04 19:39:25,157 WARNING [train.py:1204] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:39:25,200 WARNING [train.py:1204] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-10-04 19:39:25,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=1770293.3333333333, ans=0.0 2023-10-04 19:39:27,809 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-10-04 19:39:32,120 WARNING [train.py:1204] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-10-04 19:39:34,087 WARNING [train.py:1204] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-10-04 19:39:34,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=1770360.0, ans=0.0 2023-10-04 19:39:36,792 WARNING [train.py:1204] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-10-04 19:39:36,895 WARNING [train.py:1204] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-10-04 19:39:38,253 WARNING [train.py:1204] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-10-04 19:39:38,267 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-10-04 19:39:38,888 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.86 vs. limit=15.0 2023-10-04 19:39:41,531 WARNING [train.py:1204] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-10-04 19:39:43,511 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.20 vs. limit=22.5 2023-10-04 19:39:49,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=1770426.6666666667, ans=0.0 2023-10-04 19:40:13,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1770560.0, ans=0.125 2023-10-04 19:40:18,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=1770560.0, ans=0.0 2023-10-04 19:40:25,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=1770560.0, ans=0.0 2023-10-04 19:40:27,584 INFO [train.py:1046] (3/4) Epoch 50, batch 5300, loss[loss=0.1538, simple_loss=0.2353, pruned_loss=0.03614, over 24309.00 frames. ], tot_loss[loss=0.1513, simple_loss=0.2318, pruned_loss=0.03541, over 4719721.54 frames. ], batch size: 61, lr: 2.02e-03, grad_scale: 8.0 2023-10-04 19:40:30,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=1770626.6666666667, ans=0.0 2023-10-04 19:40:34,103 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=4.80 vs. limit=15.0 2023-10-04 19:40:42,135 INFO [train.py:1310] (3/4) Done!